Genetic Data

Smarter data that enables precise medical care

Precision medicine, an approach to patient care that is tailored to the individual, is an attainable goal. A key component of precision medicine is understanding a patient’s genes and how those genes influence the risk and progression of disease and the response to treatment. NashBio’s data covers both phenotypic and genotypic patient data, enabling the interrogation of these interactions.

An understanding of genes and how they influence and interact with disease is so critical that it materially increases the success rate of clinical trials (by 2-7x)¹.

Genetic Data by the Numbers

DNA samples from the BioVU® biobank were assayed on Illumina's Expanded Multi-Ethnic Genotyping Array (MEGAEX). MEGAEX covers 2 million variants and was developed to provide extensive genotyping coverage of European, East Asian and South Asian populations. The genotype data is available in PLINK format.

*NashBio acknowledges the complexities surrounding ancestry, genetic ancestry calculations and the use of ancestry in genomic analysis. Here we classify the population into relative majority 1000 Genomes super-groups.

Imputed Data

Fewer anomalies for enhanced usability

Imputing genomic datasets can greatly increase the number of represented variants and can enhance genomic analyses, such as genome-wide association studies. NashBio has imputed the 90,000-subject MEGAEX dataset using multiple industry-standard pipelines (Michigan, TOPMed) and multiple reference datasets (1000 Genomes, HRC, TOPMed). The imputed datasets include between 30 million and 300 million variants (15x-150x MEGAEX). Imputed datasets are available in PLINK format for EUR and AFR ancestry cohorts.
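As a rough illustration of working with PLINK-format data, the sketch below parses the variant records of a PLINK `.bim` file and reproduces the 15x-150x arithmetic quoted above. It assumes the data is delivered as a PLINK binary fileset (`.bed`/`.bim`/`.fam`), which the text does not specify. The variant IDs and positions in the example are made up for illustration and are not NashBio data.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """One row of a PLINK .bim file (six whitespace-separated columns)."""
    chrom: str       # chromosome code
    variant_id: str  # variant identifier
    cm: float        # position in centimorgans (often 0 when unknown)
    pos: int         # base-pair coordinate
    allele1: str     # allele 1
    allele2: str     # allele 2


def parse_bim(lines: Iterable[str]) -> List[Variant]:
    """Parse .bim-style lines into Variant records, skipping blank lines."""
    variants = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        chrom, variant_id, cm, pos, a1, a2 = fields
        variants.append(Variant(chrom, variant_id, float(cm), int(pos), a1, a2))
    return variants


def fold_increase(imputed_count: int, array_count: int) -> float:
    """Ratio of imputed variant count to genotyped (array) variant count."""
    return imputed_count / array_count


# Hypothetical example rows; IDs and positions are illustrative only.
example_bim = [
    "1\tvar_example_1\t0\t10583\tA\tG",
    "2\tvar_example_2\t0\t20345\tT\tC",
]

variants = parse_bim(example_bim)
low = fold_increase(30_000_000, 2_000_000)    # lower bound quoted in the text
high = fold_increase(300_000_000, 2_000_000)  # upper bound quoted in the text
print(len(variants), low, high)
```

In practice, `parse_bim` could be given an open file handle, for example `parse_bim(open("dataset.bim"))`, where `dataset.bim` is a placeholder file name.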

Genomic Sequencing

NashBio has whole exome sequences (WES) and whole genome sequences (WGS) for a subset of subjects.

Some of the disease populations include:

Fatty liver disease (NAFLD/NASH)*
Type 2 diabetes
Diabetic nephropathy
Focal segmental glomerulosclerosis (FSGS)

Additional sequence data will be available in 2024.

*Sequencing performed prior to adoption of MAFLD/MASH terminology.

All NashBio data modalities are fully normalized and have been cross-referenced to provide a harmonized data experience. NashBio is committed to patient privacy. To learn more, see Unwavering Commitment to Patient Privacy.

¹References: Nelson et al., Nature Genetics, 2015; 47: 856-860. King et al., PLoS Genetics, 2019; 15(12). Estrada et al., Nature Communications, 2021; 12(2224). Wang et al., Nature, 2021; 597(7877): 527-532.