Clinical Genomics

Regulatory and Ethical Considerations of Whole Genome Sequencing

By Clinical Genomics, Ethics

As whole genome sequencing (WGS) becomes increasingly accessible, it raises several ethical and regulatory challenges that must be carefully addressed. Key issues include protecting the privacy of genomic data, ensuring proper informed consent, and determining the responsible approach for returning results to patients and research participants.

Data Privacy Concerns

An individual’s genome contains a trove of highly personal information, including predispositions to certain diseases, ancestral origins, and other traits. The potential misuse of this data by employers, insurers, or governments to discriminate against individuals based on their health risks underscores the need for robust security measures and policies to prevent unauthorized access.

There are also concerns about the potential for genetic data to be re-identified, a process that involves matching an individual’s genetic data with their personal information, such as their name or address. Even if names or other identifiers are stripped away, certain DNA sequences are so unique that they could allow for the tracing of individuals through database cross-referencing. Practices like adding noise to genomic data may be needed to further obscure identities while still allowing useful analyses.
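As a rough illustration of the noise-adding idea, the sketch below perturbs per-variant carrier counts with Laplace noise before they are released, in the spirit of differential privacy. The function name, the variant labels, and the epsilon value are all hypothetical, and the default sensitivity of 1 assumes that any one person changes a given count by at most one. It is a minimal sketch under those assumptions, not a vetted privacy mechanism.

```python
import random


def noisy_carrier_counts(counts, epsilon, sensitivity=1.0, seed=None):
    """Return carrier counts perturbed with Laplace noise (illustrative only).

    counts: dict mapping a variant label to the number of carriers.
    epsilon: privacy parameter applied to each count; smaller means more noise.
    sensitivity: assumed maximum change one person can cause in a single count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    noisy = {}
    for variant, count in counts.items():
        # The difference of two exponential draws with mean `scale`
        # follows a Laplace distribution centred on zero.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        # Clamp at zero so a released count is never negative.
        noisy[variant] = max(0.0, count + noise)
    return noisy


# Hypothetical counts for two made-up variant labels.
released = noisy_carrier_counts({"variant_A": 120, "variant_B": 3}, epsilon=0.5, seed=7)
```

A smaller epsilon hides individuals better but distorts rare-variant counts more, which is the trade-off between privacy and analytic usefulness noted above. Releasing many counts from the same cohort also weakens the overall guarantee, so a real system would have to budget for that.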

Regulations governing genomic data storage, sharing, and usage are still evolving. Efforts are underway to develop best practices and address ongoing challenges in the responsible use of genomic data. In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects health data privacy, but its rules may not squarely address many issues raised by WGS. Other initiatives, such as the Regulatory and Ethics Work Stream through the Global Alliance for Genomics and Health, are actively working towards solutions.

Informed Consent Challenges

A core principle of biomedical research and clinical testing is that individuals must give informed consent before participating. However, informed consent raises challenges in the genomics age when WGS data may be used for a wide range of future studies beyond the initial intended purpose.

Participants need to be made aware that their genomic data could live on indefinitely in databases and be shared and reused in ways that are currently unknown. There are open questions around what constitutes a truly informed consent process—how much detail is required, what reasonable guarantees exist around future uses, and whether a tiered or meta-consent model may be more appropriate. A tiered consent model could involve participants giving consent for different levels of data sharing, while a meta-consent model could involve permitting the use of data in certain circumstances.
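One way to picture a tiered consent model is as a small data structure that records the broadest level of sharing a participant has agreed to. The sketch below is only an illustration: the tier names, the `ConsentRecord` class, and the `is_use_permitted` function are hypothetical and do not come from any particular consent framework.

```python
from dataclasses import dataclass
from enum import IntEnum


class SharingTier(IntEnum):
    """Hypothetical sharing levels, ordered from most to least restrictive."""
    ORIGINAL_STUDY_ONLY = 1
    CONTROLLED_ACCESS = 2
    OPEN_ACCESS = 3


@dataclass(frozen=True)
class ConsentRecord:
    participant_id: str
    max_tier: SharingTier  # broadest level of sharing the participant agreed to


def is_use_permitted(record, requested_tier):
    """A request is allowed only if it is no broader than the consented tier."""
    return requested_tier <= record.max_tier


record = ConsentRecord("P-001", SharingTier.CONTROLLED_ACCESS)
print(is_use_permitted(record, SharingTier.ORIGINAL_STUDY_ONLY))  # True
print(is_use_permitted(record, SharingTier.OPEN_ACCESS))          # False
```

A meta-consent model could be sketched in a similar way by storing, for each category of future use, whether the participant wants to be asked again, has given blanket permission, or has refused.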

For example, the Personal Genome Project asks participants to agree to their WGS data being publicly available online. This approach maximizes the scientific utility of the data but may not align with privacy expectations for many individuals.

Debates Around Returning Results

As WGS becomes more affordable and widely used, questions emerge about which results should be disclosed to participants and patients whose genomes are sequenced for research or clinical purposes. There are reasonable arguments on both sides of this debate.

On one hand, some argue that all results should be disclosed regardless of their clinical significance to respect individual autonomy. Research has found that most people want to receive results related to serious health conditions if uncovered incidentally through sequencing, even if unrelated to the original reason for testing. They feel they have a right to this potentially life-saving information about health risks they were unaware of.

The opposing perspective argues that only results with clear clinical implications should be disclosed. In many cases, genomic results can be complex, incomplete, and of uncertain clinical relevance. Receiving inconclusive data could potentially cause anxiety and lead to unnecessary follow-up testing and screening procedures. There are also concerns about how to educate and counsel participants properly so they fully understand the nuances of the genomic data they receive.

With no universal guidelines in place, different institutions take divergent approaches. The American College of Medical Genetics and Genomics (ACMG) recommends returning results only for a specific list of highly actionable genes, where a finding could lead to a clear course of action such as a change in medication or lifestyle. Others favor a more open model that gives patients the choice of which results they receive.

Commercial WGS services are another regulatory gray area. In the United States, the FDA has authorized some direct-to-consumer genetic tests, such as carrier screening, but not blanket genomic analysis unattached to specific clinical indications.

Perspectives From the Public

Ultimately, considering public perspectives is essential in shaping policy decisions around WGS, because it is the public's genomic data and rights that are at stake. Social science research has found a range of views but an overall desire for policies that prioritize individual choice and control over personal data.

A 2021 survey found that 68% of respondents felt people should be able to access and share a copy of their genetic data, but 80% believed sharing an individual’s WGS data without their consent was inappropriate. Regarding the return of results, about 60% wanted to receive findings related to treatable diseases, but only half wanted results on untreatable diseases.

Studies have consistently found higher public trust in policies and governance developed through a transparent process with input from multiple stakeholder groups, including members of the public. Clear communication around risks and benefits is key to building that trust.

As WGS increasingly becomes a part of routine healthcare and research, striking the right balance among personal privacy, autonomy rights, and scientific progress will remain an ongoing challenge for policymakers, the biomedical community, and the public to navigate together.


Global Alliance for Genomics and Health (2015) Genomic Data Policy Scoping Document.

Kaye J, et al. (2022) Public Perspectives Towards Data Sharing and Return of Results from Whole Genome Sequencing. Journal of Medical Ethics, 48:1.

Niemiec, E. & Howard, H.C. (2016) Ethical issues in consumer genome sequencing: Use of consumers’ samples and data. Applied & Translational Genomics, 8:23-30.

Stark, Z. et al. (2019) Understandings of return of genomic results: exploratory study of stakeholders in Australia. European Journal of Human Genetics, 27:1247–1255.


Expanding Horizons in Genomic Research Through Decentralized Trials

By Clinical Genomics

Key Takeaways:

  1. Decentralized Randomized Trials: Enable genomic research and clinical studies to be conducted remotely, increasing access and diversity of participants.
  2. Digital Technologies: Leverage tools like telemedicine, mobile apps, and wearable devices to collect data from participants in their homes.
  3. Advantages: Reduced costs, increased enrollment, better representation, and real-world evidence generation.
  4. Challenges: Addressing data privacy/security concerns, protocol adherence, and regulatory acceptance.
  5. Potential Impact: To accelerate genomic medicine through more efficient evidence generation from diverse populations.


The Decentralized Trial Model

Decentralized randomized trials enable genomic research and clinical studies to be conducted remotely with participants in their homes and communities. These trials utilize a suite of tools, including telemedicine, mobile apps, wearable biosensors, and direct-to-participant drug shipments. DNA samples for sequencing are collected via at-home testing kits.


Benefits of Decentralization

This digital approach offers several potential advantages. It provides increased access for populations that may live far from academic medical centers or have mobility limitations. By broadening the geographic reach, decentralized trials help increase diversity and better represent the genetic variation across different racial, ethnic, and socioeconomic groups. Additionally, conducting research in real-world settings generates more naturalistic data on how investigational products perform in participants’ daily lives.


Operational Efficiencies

The decentralized model offers operational efficiencies that can streamline logistics and reduce infrastructure costs associated with traditional brick-and-mortar trial sites. This efficiency can accelerate study timelines and improve enrollment and retention rates by making participation more convenient for volunteers.


Challenges and Barriers

While promising, the adoption of decentralized trials faces challenges, particularly around data privacy and security. Robust protocols must protect participant confidentiality, and ensuring protocol adherence can be more difficult remotely than during in-clinic visits. Regulators will also require thorough evidence that decentralized approaches reliably achieve research objectives.


Current Applications in Genomics

Despite these hurdles, many initiatives are utilizing decentralized trials across disease areas. The NIH’s All of Us Research Program and genomic screening studies like CIRCLE employ digital technologies and at-home sample collection. Pharmaceutical companies have launched decentralized trials for genomically targeted cancer therapies.


The Future of Decentralized Genomic Research

As supportive evidence grows, the decentralized paradigm is poised to transform genomic research. By capitalizing on digital health tools, this model can accelerate precision medicine by enabling more efficient generation of real-world genomic evidence from diverse populations worldwide. Overcoming the remaining barriers is crucial to achieving this goal.





Unveiling the Secrets of the Dark Genome: A Journey into the Hidden Depths of Human DNA

By Clinical Genomics, Healthcare Data

The human genome, comprising both coding and non-coding regions, holds crucial information for understanding biological processes and disease mechanisms. While only two percent of the human genome encodes proteins, the remaining 98 percent, often referred to as the “dark genome,” has long been a mystery. Researchers once thought that the dark genome primarily consisted of “junk” DNA, a term used to describe non-coding regions of DNA that were believed to have no functional purpose. However, recent advancements in genomic research have shed light on the regulatory functions of non-coding DNA, challenging the traditional view of this genomic “junk” (1). These non-coding regions have gained attention for their regulatory roles in gene expression and disease development.

Redefining the Role of Non-Coding Regions

Once dismissed as genetic material of no importance, these non-coding regions are now gaining recognition for their pivotal role in regulating gene expression. The dark genome regulates gene expression through various mechanisms, including the modulation of protein-coding genes by non-coding RNAs. These molecules act as conductors in the cellular orchestra, coordinating responses to environmental cues, modulating disease processes, and maintaining genomic stability. Dysregulation of non-coding RNAs has been implicated in various diseases, including cancer, cardiovascular disorders, and neurological conditions (2).
The non-coding regions of DNA contain various elements and sequences that do not directly encode proteins. These include:


1. Regulatory elements: Sequences that control the activity of genes, such as promoters, enhancers, silencers, and insulators.
2. Repeat sequences: DNA portions repeated multiple times throughout the genome, including short tandem repeats (microsatellites) and longer repetitive sequences.
3. Intergenic regions: Spaces between genes that contain no coding sequences.
4. Non-coding RNAs (ncRNAs): RNA molecules that are transcribed from non-coding regions and play various regulatory roles in the cell, such as microRNAs (miRNAs), long non-coding RNAs, and transfer RNAs.
5. Pseudogenes: Non-functional copies of genes that have lost their protein-coding ability through mutations.
6. Introns: Non-coding segments within genes that are removed during RNA splicing, allowing exons to join together to form mature messenger RNA (mRNA).
7. Telomeres and centromeres: Specialized non-coding DNA sequences found at the ends and centers of chromosomes, respectively, with essential roles in chromosome stability and replication.
8. Heterochromatin: Regions of tightly packed chromatin associated with gene silencing and chromosome structure.
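To make a few of these categories concrete, the sketch below labels a genomic position as exonic, intronic, or intergenic relative to a single gene. The coordinates are invented and the function name `classify_position` is hypothetical. Real annotation relies on curated gene models and dedicated tools, and exons themselves include untranslated regions that do not encode protein, so this only illustrates the basic idea.

```python
def classify_position(pos, gene_start, gene_end, exons):
    """Label a 1-based position as 'exon', 'intron', or 'intergenic'.

    exons: list of (start, end) tuples, inclusive, lying within the gene.
    """
    if pos < gene_start or pos > gene_end:
        return "intergenic"
    for start, end in exons:
        if start <= pos <= end:
            return "exon"
    # Inside the gene boundaries but outside every exon.
    return "intron"


# Hypothetical gene spanning positions 1000-5000 with three exons.
exons = [(1000, 1200), (2500, 2700), (4800, 5000)]
labels = [classify_position(p, 1000, 5000, exons) for p in (500, 1100, 2000, 4900)]
print(labels)  # ['intergenic', 'exon', 'intron', 'exon']
```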


Genetic Diversity in Non-Coding Regions

Genetic diversity within these regions of the genome significantly influences biological function through various mechanisms. Non-coding RNAs within the dark genome play diverse roles in gene regulation and cellular function, and genetic variation that alters their sequence or expression can dysregulate target genes and perturb biological pathways (2). Genetic diversity within the dark genome also contributes to evolutionary dynamics by providing raw material for adaptation and speciation: variations in non-coding regions influence phenotypic traits, reproductive success, and population fitness, thereby shaping genetic diversity over time. Determining the functional significance of genetic diversity in these regions is crucial for understanding the complexities of genome biology and its implications for health and disease (2).


The dark genome’s regulatory functions significantly affect disease development and progression. Alterations in non-coding DNA sequences have been linked to cancer, developmental disorders, and other chronic illnesses. Understanding the role of the dark genome in disease pathogenesis provides new opportunities for targeted therapies and precision medicine approaches. Traditionally, researchers focused on targeting proteins to combat neoplastic conditions. However, growing evidence suggests that disrupting non-coding RNAs could be a game-changer in cancer treatment. Pharmaceutical companies are developing therapies that target specific non-coding RNAs associated with tumor growth and progression (3).


Leveraging Clinical Data for Genomic Insights

So, how do we unravel the mysteries of the dark genome? One promising approach is to leverage clinical data derived from electronic health records (EHRs). By combining genomic information with clinical data, researchers can efficiently uncover patterns and correlations that might otherwise go unnoticed or require extensive resources to acquire. Integrating clinical data from EHRs with non-coding genomic information obtained through whole genome sequencing enables actionable steps in clinical research and drug discovery. This integration aids researchers in comprehending the intricate relationship between genetic predisposition and environmental factors, encompassing comorbidities, medication usage, and lifestyle habits that might impact disease susceptibility and progression. This approach holds the potential to revolutionize genomics research, accelerating the pace of discovery and bringing us closer to personalized medicine (4).
As we delve deeper into the dark genome, we’re not just exploring genetic code—we’re unraveling the story of human biology and health. The journey may be challenging, but the rewards are boundless and will transform the future of research and healthcare.



1. Blaxter, M., et al. (2010). Revealing the Dark Matter of the Genome.

2. Zhang, X., et al. (2020). Illuminating the noncoding genome in cancer.

3. Villar, D., et al. (2020). The contribution of non-coding regulatory elements to cardiovascular disease.

4. Kullo, I. J., et al. (2010). Leveraging informatics for genetic studies: Use of the electronic medical record to enable a genome-wide association study of peripheral arterial disease.


Understanding the Dark Genome: Insights for Therapeutic Breakthroughs

By Clinical Genomics, Healthcare Data

Emerging evidence suggests that targeting the dark genome, particularly non-coding RNAs, holds promise for developing novel therapeutics. By modulating gene expression patterns, these approaches aim to restore cellular homeostasis and mitigate disease progression. Ongoing research efforts are focused on identifying disease-specific non-coding RNA signatures and developing targeted interventions for various disorders (1). Below, we look at specific examples from several therapeutic areas in which research on non-coding regions has had an impact.



Alzheimer’s disease, a complex neurodegenerative disorder marked by cognitive decline, memory loss, and behavioral changes, exhibits significant clinical variation despite common features. Onset can occur early (before age 65) or late (at age 65 or later) and is influenced by both genetics and environment: some individuals experience symptoms in their 40s, while others may not manifest them until their 70s. Cognitive decline further demonstrates this variability, with some individuals experiencing gradual deterioration over years while others undergo more rapid decline. Various factors, including health, genetics, and lifestyle, contribute to this heterogeneity. Additionally, comorbidities such as cardiovascular disease have been shown to expedite cognitive decline. Although Alzheimer’s is characterized by the accumulation of plaques in the brain, the severity of symptoms does not always correlate with the presence of plaques alone. Other factors, such as inflammation and neuronal loss, also play significant roles in symptom manifestation. While early-onset Alzheimer’s exhibits strong genetic links, late-onset cases are more multifaceted. Although the APOE ε4 allele is a well-known risk factor, not all carriers develop the disease, highlighting the complexity of genetic influences on disease risk and progression. These variations emphasize the need for personalized approaches to diagnosis and treatment.


Understanding the etiology of Alzheimer’s disease poses a significant challenge, particularly given its polygenic nature and the presence of non-coding genetic variants. These non-coding variants can modulate gene expression by influencing miRNA binding and altering chromatin states within enhancers, affecting brain gene expression (2). Transcriptome-wide association studies have emerged to explore the genetic links between disease risk and gene expression, revealing involvement of Alzheimer’s disease-associated loci in immune system pathways crucial for neuroinflammation and β-amyloid clearance. While Alzheimer’s disease risk variants often reside in non-coding regions, their functional impact on gene regulation remains incompletely understood. Integrating whole-genome sequencing data with phenotyping analyses will offer insights into the molecular mechanisms responsible for Alzheimer’s susceptibility and drive the efforts to understand the complex pathogenesis of Alzheimer’s disease (2).



Similarly, genetic variation shapes the clinical presentation, diagnosis, and treatment of long QT syndrome (LQTS). LQTS is a cardiac disorder characterized by prolongation of the QT interval on ECG, which can predispose individuals to life-threatening arrhythmias, particularly torsades de pointes, and sudden cardiac death. Clinical manifestations of LQTS can vary widely, ranging from asymptomatic individuals to those experiencing syncope, seizures, or sudden cardiac arrest. Some individuals may have symptoms triggered by specific factors, such as physical exertion or emotional stress, while others may experience symptoms without any identifiable triggers. LQTS can present as a congenital condition, typically caused by mutations in genes encoding cardiac ion channels or associated proteins involved in cardiac repolarization, such as KCNQ1, KCNH2, and SCN5A (3). However, acquired forms of LQTS can also occur due to medications, electrolyte imbalances, or other underlying medical conditions. The clinical presentation of LQTS can be influenced by various factors, including the specific genetic mutation involved, environmental triggers, and individual differences in cardiac physiology. Genetic testing plays a crucial role in diagnosing LQTS and identifying at-risk family members, allowing for early intervention and management strategies such as beta-blockers, implantable cardioverter-defibrillators, and lifestyle modifications to reduce the risk of life-threatening arrhythmias and sudden cardiac death.


The diverse symptoms seen in individuals with LQTS, even among those with the same genetic mutations, have prompted investigations into other genes that might influence the severity or presentation of the condition. These studies, including research on non-coding genetic variations, aim to explain why some individuals with LQTS are more prone to life-threatening arrhythmias. For example, a study in a specific South African population revealed that certain non-coding variants in the NOS1AP gene are associated with a higher risk of severe arrhythmias in LQTS patients, and these variants have also been linked to an increased risk of drug-induced LQTS (4). Conversely, another study identified a different non-coding variant in the KCNQ1 gene that seems to lower the risk of arrhythmias in individuals with LQTS (5). Additionally, research using induced pluripotent stem cells from a family with LQTS found specific genetic mutations that either protect against or exacerbate the condition. These findings highlight the existence of genes that can either predispose individuals to LQTS or offer protection against its effects. Recent studies have shown that assessing multiple genetic factors together could help predict an individual’s risk of developing LQTS and guide their clinical management more effectively.
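Assessing multiple genetic factors together is often done with a weighted sum of risk-allele counts, commonly called a polygenic score. The sketch below shows only the arithmetic. The variant labels and weights are hypothetical placeholders rather than published effect sizes for LQTS, and the function name `polygenic_score` is an assumption made for illustration.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of each scored allele).

    Variants missing from `dosages` are skipped, which treats them as 0 copies.
    """
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


# Hypothetical per-allele weights; a negative weight stands for a protective variant.
weights = {"modifier_1": 0.5, "modifier_2": 0.25, "protective_1": -0.5}
dosages = {"modifier_1": 2, "modifier_2": 1, "protective_1": 1}
print(polygenic_score(dosages, weights))  # 0.75
```

In this toy case, two copies of one risk-raising modifier and one copy of another are partly offset by a single protective allele, mirroring the mix of aggravating and protective modifiers described above.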



With the growing understanding of non-coding alterations in different types of cancers and their precise roles in disrupting gene regulation and tumor development, researchers are exploring them as potential new indicators for detecting, classifying, and tracking cancer progression. For instance, the various ways in which the MYC gene becomes activated in different tissues, such as gene duplications or changes in enhancers, are being studied as markers for diagnosing cancer, as MYC activation is a common feature in many cancer types (6). Similarly, duplications of enhancers associated with the AR gene have been linked to advanced prostate cancer, offering a new marker for tracking its progression (7). Advances in detecting mutations in the TERT promoter have improved early detection of glioblastomas (8). Compared to changes in coding sequences, alterations in non-coding regions are more widespread and more specific to certain tissues and cancer types, making them potentially more reliable markers. Methods like analyzing DNA methylation patterns and nucleosome occupancy in circulating free DNA hold promise for noninvasive cancer detection and classification. These new biomarkers could complement existing ones based on changes in the coding regions of cancer genes.



Through advanced genomic techniques, specific non-coding DNA variants associated with inflammatory bowel disease (IBD) susceptibility and disease severity have been identified. These variants disrupt gene regulatory mechanisms, leading to dysregulated expression of genes involved in immune response and inflammation (9). Genetic variations within enhancers and promoters can disrupt the binding of transcription factors or alter chromatin structure, leading to dysregulated gene expression patterns associated with IBD phenotypes. Through genome-wide association studies and functional genomic approaches, researchers have identified specific non-coding DNA variants that are significantly enriched in IBD patients compared to healthy individuals (9). Variants in enhancer regions associated with genes involved in innate and adaptive immunity, mucosal barrier function, and cytokine signaling have been implicated in IBD pathogenesis. Understanding the functional consequences of these variants provides valuable insights into the molecular mechanisms driving IBD development and progression (9).
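Enrichment of a variant in patients relative to healthy individuals is typically summarized with an odds ratio and a significance test on a 2x2 table. The sketch below computes both using only the standard library. The counts are invented and deliberately tiny so that the arithmetic can be checked by hand, and the function names are hypothetical. A real genome-wide association study would also correct for multiple testing and for confounders such as ancestry.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """a: cases with variant, b: cases without,
    c: controls with variant, d: controls without."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value: the probability of seeing at least `a`
    carriers among cases if carrier status were unrelated to disease."""
    cases = a + b
    carriers = a + c
    total = a + b + c + d
    upper = min(cases, carriers)
    # Hypergeometric tail sum with the table margins held fixed.
    tail = sum(
        comb(cases, k) * comb(total - cases, carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, carriers)


# Invented counts: 3 of 4 cases and 1 of 4 controls carry the variant.
print(odds_ratio(3, 1, 1, 3))                  # 9.0
print(round(fisher_one_sided(3, 1, 1, 3), 4))  # 0.2429
```

The example shows why sample size matters: an odds ratio of 9 looks striking, yet with only eight people the result is far from statistically significant.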


New therapeutic approaches for IBD involve targeting histone modifiers and key regulators within IBD networks. A potential challenge with this approach is that these compounds may affect tissues beyond those affected by the disease. However, because IBD-associated single nucleotide polymorphisms (SNPs) help predict which cell types are pathogenic, they could guide the development of therapeutics delivered to specific cells. Adverse effects may still occur, as with other therapies that target general processes, such as immune modulation and chemotherapy. Ongoing clinical trials are evaluating the efficacy and adverse effects of these potential new compounds, with outcomes likely relevant for IBD treatment (9).


Genome editing technologies, epigenetic modulators, and RNA-based therapies offer promising avenues for selectively modulating gene expression and alleviating inflammation in IBD patients. Understanding the functional consequences of sequence variations in DNA regulatory elements provides valuable insights into IBD pathophysiology and facilitates the development of personalized treatment strategies tailored to individual patients (9).


The exploration of the dark genome, particularly non-coding RNAs, presents a promising frontier in the search for new therapies. Manipulating gene expression patterns holds the potential to restore cellular balance and halt disease progression. As ongoing research provides more insight into disease-specific non-coding RNA signatures, the prospect of targeted interventions across multiple therapeutic areas seems more attainable than ever. As genome editing technologies and RNA-based therapies continue to evolve, precision medicine could revolutionize patient care, offering hope for improved outcomes and quality of life. Exploring the dark genome holds considerable promise for guiding research toward medical advances that will change modern medicine.



1. Zhang, X., et al. (2020). Illuminating the noncoding genome in cancer.

2. Novikova, G., et al. (2021). Beyond association: linking non-coding genetic variation to Alzheimer’s disease risk.

3. Giudicessi, J. R., & Ackerman, M. J. (2013). Genotype- and phenotype-guided management of congenital long QT syndrome.

4. Crotti, L., et al. (2009). NOS1AP is a genetic modifier of the long-QT syndrome.

5. Duchatelet, S., et al. (2013). Identification of a KCNQ1 Polymorphism Acting as a Protective Modifier Against Arrhythmic Risk in Long-QT Syndrome.

6. Kalkat, M., et al. (2017). MYC deregulation in primary human cancers.

7. Ku, S. Y., et al. (2019). Towards precision oncology in advanced prostate cancer.

8. Hasanau, T., et al. (2022). Detection of TERT promoter mutations as a prognostic biomarker in gliomas: Methodology, prospects, and advances.

9. Meddens, C. A., et al. (2019). Non-coding DNA in IBD: from sequence variation in DNA regulatory elements to novel therapeutic potential.

Decrypting the Non-Coding Genome: Unlocking Disease Insights

By Clinical Genomics

Key Takeaways:

  • Regulatory elements in non-coding DNA regions play critical roles in controlling gene expression and biological processes.
  • Misregulation of enhancers, promoters, and other non-coding elements contributes to the pathogenesis of various diseases, including rheumatoid arthritis, coronary artery disease, and melanoma.
  • Integrating whole genome sequencing with clinical phenotypic information from electronic health records can uncover new insights into disease mechanisms driven by non-coding variants.
  • Elucidating the roles of regulatory elements will facilitate development of more precise therapeutics and diagnostics tailored to individuals.


The quest to decipher the intricate underpinnings of human disease has long fixated on protein-coding genes. However, a rapidly burgeoning field is shining a spotlight on the proverbial dark matter of the genome – the vast, enigmatic expanse of non-coding DNA regions harboring regulatory elements that orchestrate gene expression. Enhancers, promoters, insulators, and other regulatory sequences exert exquisite control over when, where, and to what extent genes are expressed, shaping the symphony of cellular processes that sustain life. Yet, when these regulatory elements go awry, the consequences can be catastrophic, manifesting as devastating diseases that have long confounded researchers and clinicians alike.


Rheumatoid Arthritis and Non-Coding Dysregulation 

Rheumatoid arthritis (RA), characterized by excruciating joint inflammation and debilitating pain, exemplifies the profound impact of dysregulated non-coding elements. Enhancers and promoters governing the expression of pivotal inflammatory genes, such as those encoding the cytokines TNF-alpha and IL-6, have been implicated in the pathogenesis of RA. Mutations or epigenetic modifications in these regulatory regions can trigger aberrant cytokine production, fueling the vicious cycle of autoimmune attack and joint destruction that defines this debilitating condition.


Coronary Artery Disease and Lipid Metabolism 

Another striking example is coronary artery disease (CAD), a leading cause of mortality worldwide. The insidious buildup of atherosclerotic plaques within the coronary arteries, narrowing these vital conduits and heightening the risk of heart attacks, is intimately linked to disruptions in lipid metabolism and inflammation. Regulatory elements controlling the expression of genes such as PCSK9, a master regulator of cholesterol homeostasis, have been implicated in CAD etiology. Variants within these non-coding regions can dysregulate PCSK9, precipitating dyslipidemia and promoting the relentless progression of atherosclerosis.


Melanoma and Telomerase Regulation 

Melanoma, a deadly form of skin cancer arising from pigment-producing melanocytes, further underscores the profound influence of non-coding regulatory aberrations. Researchers have identified mutations in the TERT gene’s promoter region in a significant proportion of melanoma cases. These mutations lead to enhanced telomerase activity, conferring cellular immortality and fueling the unchecked proliferation that characterizes this aggressive malignancy.


The Tip of the Iceberg 

While these examples illuminate the pivotal roles of non-coding regulatory elements in disease pathogenesis, they represent merely the tip of the iceberg. The majority of disease-associated variants identified through genome-wide association studies reside within non-coding regions, yet their functional implications remain largely unexplored, in part because the diverse roles and interactions of these regions make them difficult to interpret. This underscores the pressing need to delve deeper into the intricate regulatory landscapes governing gene expression and to elucidate how non-coding variants contribute to disease susceptibility, progression, and severity.


Harnessing Genomics and Clinical Data 

Fortunately, the advent of transformative technologies is ushering in a new era of scientific discovery. For example, integrating whole genome sequencing data with clinical phenotypic information from electronic health records (EHRs) provides the means to uncover insights into disease mechanisms driven by non-coding variants.


Envisioning a Future of Precision Medicine 

Imagine a future where an individual’s genomic information, annotated with its regulatory elements, is seamlessly integrated with their EHR to capture a rich tapestry of clinical manifestations, treatment responses, and diagnostic imaging data. This convergence of multimodal data streams would empower researchers and clinicians to unravel the intricate interplay between non-coding regulatory variants and disease phenotypes, illuminating novel etiological factors that have long evaded detection.


Transformative Diagnostics and Therapeutics 

By decoding the enigmatic language of non-coding regulatory elements and their misregulation in disease, we stand to gain profound insights that will catalyze the development of groundbreaking diagnostics and therapeutics. For instance, personalized genetic screens could assess an individual’s risk of developing specific diseases based on their unique constellation of non-coding variants and enable early intervention. Similarly, tailored therapeutic interventions that precisely modulate the activity of dysregulated enhancers or promoters could restore homeostasis and alleviate disease burden, offering more effective and targeted treatments.


Interdisciplinary Collaboration 

As we embark on this exhilarating journey, fostering collaborations between geneticists, bioinformaticians, clinicians, and pharmaceutical partners will be paramount. By aligning interdisciplinary expertise and leveraging cutting-edge technologies, we can unlock the transformative potential of non-coding regulatory elements, ushering in a new era of precision medicine that promises to redefine our understanding and treatment of human disease.



  1. Hnisz, D., et al. (2013). Super-enhancers in the control of cell identity and disease. Cell, 155(4), 934-947.
  2. Tak, P. P., & Firestein, G. S. (2001). NF-κB: a key role in inflammatory diseases. The Journal of clinical investigation, 107(1), 7-11.
  3. Abifadel, M., et al. (2003). Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nature genetics, 34(2), 154-156.
  4. Huang, F. W., et al. (2013). Highly recurrent TERT promoter mutations in human melanoma. Science, 339(6122), 957-959.
  5. Maurano, M. T., et al. (2012). Systematic localization of common disease-associated variation in regulatory DNA. Science, 337(6099), 1190-1195.

From Genes to Drugs: The Role of Genetics in Modern R&D

By Clinical Genomics

Key Takeaways:


  • Human genetics research can elucidate mechanisms of disease and help identify new drug targets.
  • Studying genetic variants linked to disease risk or drug response helps stratify patients and inform clinical trials.
  • Genomic data enables the development of precision medicines targeted to patients’ genetic profiles.
  • Pharmacogenomics and genetic screening guide optimal drug usage and minimize adverse reactions.
  • Advancements in genetic analysis technologies are enabling more rapid and expansive use of genomic data in drug R&D.


The Value of Human Genetics in Drug R&D

Developing new drugs is a lengthy and expensive process with a high failure rate. On average, it takes 10-15 years and over $1 billion to bring a new drug to market. The pharmaceutical industry is looking to human genetics research to improve R&D efficiency, success rates and the personalized utility of new medicines.


Understanding the genetic factors underlying diseases can point the way to new drug targets. Identifying genetic variants linked to disease risk helps elucidate biological pathways involved. Druggable targets can then be identified to modulate relevant pathways and processes. Genetics also helps establish causal mechanisms to avoid spurious associations.


Pharmacogenomics focuses on how genetic variability affects drug response. It enables matching patients to treatments according to genotype to maximize effectiveness and avoid adverse reactions. Testing for pharmacogenomic biomarkers can guide dosing, or indicate alternate treatments when genetics point to likely non-response.


Genetic screening also aids patient stratification and clinical trial optimization. Enriching trial participant selection for those most likely to respond or exhibit a clinical effect improves statistical power with smaller sample sizes. Genetic variables allow better control for confounding factors. Pharmacogenomic testing of participants also helps explain differential responses.


Studying rare genetic variants with large effects (“genetic supermodels”) provides another window into disease biology. The study of extreme genotypes helps unravel mechanisms and identify new targets.


Once a drug is developed, genetics continues to inform optimal use. Screening programs using pharmacogenomic biomarkers guide treatment choices and minimize risks. Genetics also aids mechanistic understanding of how therapies work, illuminating additional applications and opportunities.


The plummeting costs of genome sequencing and advances in big data analytics are enabling more extensive use of human genetic data. Pete Hulick, lead for molecular biology at Eli Lilly, described human genetics as “intersecting with everything that we do” in drug R&D.


Applications in Discovery Research

Early in the R&D process, human genetic insights can point the way to promising disease targets. Scientists look for associations between genetic variants, such as single nucleotide polymorphisms (SNPs), and disease risk. Genome-wide association studies (GWAS) uncover SNP differences between disease and control cohorts. Significant associations indicate genes and biological pathways involved in the disease that may be amenable to pharmacological intervention.
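As a rough illustration of the association step, the sketch below runs a one-degree-of-freedom chi-square test on a 2x2 table of allele counts for a single SNP. The allele counts are invented for illustration and the function name is hypothetical. Real GWAS pipelines typically use regression models with covariates and test millions of variants, so this is a simplified stand-in rather than a description of any particular study's method.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Assumes all four counts are positive. Returns the chi-square statistic,
    its p-value, and the odds ratio for the alternate allele.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: 1,000 case alleles and 1,000 control alleles.
chi2, p_value, odds_ratio = allelic_association(300, 700, 200, 800)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, OR={odds_ratio:.2f}")
```

Because a GWAS repeats a test like this across the genome, a stringent significance threshold (conventionally 5e-8) is applied before an association is treated as a candidate for follow-up.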

Once potential targets are identified, downstream lab research explores how to modulate them. Developing a drug is an iterative process, but human genetics provides clues on where to start.

Genetics also offers validation when biological hypotheses emerge from other experiments. Confirming that tweaking a gene or pathway affects disease risk strengthens the case for pursuing it as a drug target.


Patient Stratification & Clinical Trials

Patient heterogeneity is a major obstacle in clinical trials. Varied treatment responses lower statistical power and necessitate larger trial sizes. Genetic analysis enables better patient stratification to minimize heterogeneity and identify relevant subgroups.
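To make the link between heterogeneity and trial size concrete, here is a minimal sketch using the standard normal-approximation sample-size formula for comparing two proportions. The response rates and the fraction of genetically responsive patients are assumed values chosen purely for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def diluted_response(p_control, p_responder, responder_fraction):
    """Expected response rate in the treated arm when only a fraction of
    enrolled patients carry a genotype that responds to the drug."""
    return responder_fraction * p_responder + (1 - responder_fraction) * p_control


def n_per_arm(p_control, p_treated, alpha=0.05, power=0.8):
    """Approximate participants per arm for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_treated - p_control) ** 2)


# Assumed values: 20% respond on placebo, 50% of genetically responsive
# patients respond on drug, and 30% of unselected patients are responsive.
p_control = 0.20
p_unselected = diluted_response(p_control, p_responder=0.50, responder_fraction=0.30)
p_enriched = diluted_response(p_control, p_responder=0.50, responder_fraction=1.00)

print("Unselected trial, per arm:", n_per_arm(p_control, p_unselected))
print("Genotype-enriched trial, per arm:", n_per_arm(p_control, p_enriched))
```

Under these assumptions, enrolling only patients with the responsive genotype enlarges the observed treatment effect, so far fewer participants are needed per arm for the same power.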


For example, the cystic fibrosis drug Kalydeco works for patients with a particular CFTR gene mutation. Prescreening patients’ genetics enables targeted trial recruitment. Similar approaches minimize heterogeneity in cancer trials by selecting patients with tumors exhibiting specific mutations.


Genotyping trial participants helps explain differential responses and may uncover additional genotype-specific effects. Genetic associations can also point to new indications for the drug mechanism.


Precision Medicine

The emergence of targeted precision therapies relies directly on human genetics. Cancer treatments like Herceptin and Gleevec target tumors with specific genomic variants. HIV drugs are tailored to individual viral genotypes. Gene therapies introduce corrected genes to compensate for defective inherited genes.


This personalized approach promises greater efficacy for those most likely to respond. By targeting drugs based on genetic profiles, precision medicine seeks to maximize benefit while minimizing unnecessary treatment.


Pharmacogenomics for Safety & Optimization

Pharmacogenomic testing assesses how genetic variability affects reactions to drugs. It can identify patients likely to experience adverse events or suboptimal responses. This enables selecting safer treatments, dosage adjustment or more intense monitoring.


The blood thinner warfarin, for example, demonstrates significant pharmacogenomic effects. Genotyping helps guide ideal dosing to balance effectiveness and bleeding risks. The FDA added pharmacogenomic guidance on warfarin labeling in 2007.


Wider adoption of pharmacogenomic testing has the potential to reduce adverse drug events that represent a significant public health burden. More optimal treatment through genetic guidance also contributes to pharmacoeconomic goals.


Looking Ahead

The expanding use of human genetics is transforming every phase of drug R&D. While challenges remain in interpreting and applying genetic findings, the value in accelerating discovery, precision medicine and optimized therapeutics is evident. Advances in high-throughput genomics, big data analytics and machine learning will further incorporate human genetics into tomorrow’s medicines.



  • Relling & Evans, Nature Reviews Drug Discovery 2015
  • Roden & Denny, Annual Review of Medicine 2019
  • Genomics England PanelApp Pharmacogenetics Gene Curation Group, NPJ Genomic Medicine 2020
  • Li et al., Nature Reviews Genetics 2020
  • Manolio et al., JAMA 2020
  • Xu et al., Nature Reviews Drug Discovery 2022

Human Genetics as a Strategic Imperative to Accelerate Drug Discovery: The Alliance for Genomic Discovery

By Clinical Genomics

Key Takeaways:

  • Pharmaceutical development is high-risk and resource-intensive, with a 90% failure rate in clinical trials, often due to inadequate efficacy, toxicity, drug properties, or commercial viability.
  • Incorporating human genetic evidence doubles drug approval rates, paving the way for innovative therapies and new molecular entities.
  • Techniques like GWASs and PheWAS linking genetic data to phenotypic data enhance drug development by identifying associations between rare alleles and diseases.
  • Published human genetic studies, primarily centered on individuals of European descent, hinder our understanding of genetic diversity and impede the development of new therapies suitable for diverse populations; therefore, establishing study cohorts with under-represented populations is crucial for promoting health equality and identifying novel drug targets based on diverse genetic variants.
  • The Alliance for Genomic Discovery (AGD) aims to reshape drug development by sequencing 250,000 diverse samples, providing a powerful resource for pharmaceutical members to correlate genetic variations with clinical outcomes and, in turn, enabling these companies to better serve a global population.


The Struggle to Discover New Therapies

Discovering and developing pharmaceuticals is a resource-intensive and high-risk endeavor, sometimes spanning 15 years with costs exceeding $2 billion for their approval (Hinkson et al., 2020). Shockingly, about nine out of ten potential therapies, upon progressing to clinical trials, fail before approval (Dowden & Munro, 2019; Sun et al., 2022). The four primary contributors to the staggering 90% failure rate in drug development are inadequate clinical efficacy, unmanageable toxicity, suboptimal drug-like properties and a lack of commercial viability (Dowden & Munro, 2019; Harrison, 2016; Sun et al., 2022). To increase the chances of a drug target passing these critical checkpoints, considerable endeavors can be directed towards incorporating human genetic evidence into drug development.


In the drug development pipeline, all compounds must undergo rigorous testing in animal models before entering clinical phases, providing evidence of their potential to treat disease. However, despite promising results in preclinical studies, efficacy and safety in animal models often fail to translate to human clinical trials. Integrating human genetic evidence into the drug development process has recently emerged as a crucial strategy to navigate this challenge. Drugs grounded in such evidence exhibit a twofold increase in approval rates (Nelson et al., 2015), contributing to a higher prevalence of first-in-class therapies and new molecular entities (NMEs) (King et al., 2019). This not only accelerates the approval process but also streamlines the discovery of more effective and targeted treatments. Leveraging human genetic data gives researchers valuable insights into the genetic basis of diseases, facilitating the identification of better drug targets. Human genetic evidence supported two-thirds of the drugs approved by the FDA in 2021 (Ochoa et al., 2022), underscoring its instrumental role in advancing drug discovery and fostering the emergence of innovative pharmaceutical solutions.


Linking Genetics to Clinical Data for Drug Discovery

To incorporate genetics into therapeutic development, researchers can link an individual’s genetic data to their electronic health records (EHRs). Techniques such as genome-wide association studies (GWASs), phenome-wide association studies (PheWASs), Mendelian randomization, and the analysis of loss- or gain-of-function variants can then reveal associations between rare alleles and human disease (Krebs & Milani, 2023). Using these techniques, drugs tailored for Mendelian disorders have achieved notable success in clinical trials and approvals (Heilbron et al., 2021). For instance, the genetic disease autosomal dominant hypercholesterolemia (ADH) confers an increased risk of coronary artery disease (CAD) through elevated plasma levels of low-density lipoprotein (LDL). By linking phenotypic data with genetic data, researchers identified the association of the PCSK9 gene with high LDL levels (Abifadel et al., 2003). This kickstarted a series of studies that culminated in the approval of two monoclonal antibodies that inhibit PCSK9, Repatha (evolocumab) and Praluent (alirocumab) (Krebs & Milani, 2023; Robinson et al., 2015), with treatment reducing the rate of major adverse cardiovascular events by half (Kaddoura et al., 2020). Indeed, therapies derived from these kinds of impactful rare alleles exhibit a 6-7.2 times greater likelihood of receiving approval due to their substantial effect on symptoms (Nelson et al., 2015; King et al., 2019). However, for many prevalent diseases, heritable risk is predominantly associated with numerous common variants, each having a smaller individual effect size. This intricate genetic landscape complicates the identification of therapeutic targets and calls for new strategies.


So far, a disproportionate number of published human genetic studies have centered on individuals of European descent (Fatumo et al., 2022). This narrow focus restricts our understanding to a limited diversity of alleles and genetic disorders, hindering the development of new therapies. To promote health equality, it is crucial to establish study cohorts that include under-represented populations. After all, individuals of European descent represent only a fraction of the total human genetic variation (Heilbron et al., 2021). Diverse cohorts offer unique opportunities for identifying novel drug targets based on genetic variants that are less frequent or even absent in people of European ancestry. Genetic studies will have greater discovery power in populations where a disease is more prevalent and disease cohorts are therefore larger; at the same time, the resulting discoveries will be more relevant and beneficial for these populations.


Founding the Alliance for Genomic Discovery

This need to identify rare genetic variants in diverse patient cohorts has driven the collaboration of NashBio and Illumina Inc. to establish AGD. AGD has eight member organizations: AbbVie, Amgen, AstraZeneca, Bayer, Merck, Bristol Myers Squibb (BMS), GlaxoSmithKline Pharmaceuticals (GSK), and Novo Nordisk (Novo). It aims to expedite therapeutic development through whole-genome sequencing (WGS) of 250,000 samples from Vanderbilt University Medical Center’s (VUMC) biobank repository, BioVU®. As the first phase of AGD, deCODE genetics performed WGS on the first 35,000 VUMC samples, primarily made up of DNA from individuals of African ancestry. Moving forward, deCODE/Amgen will sequence the remaining samples so that Alliance members have access to the resulting data for drug discovery and therapeutic development. The WGS data will then be linked with structured EHR data from NashBio and VUMC, creating a valuable resource for pharmaceutical members to correlate genetic variations with clinical outcomes.



AGD marks a pivotal step in reshaping drug development, offering a solution to the challenges plaguing the pharmaceutical industry. With a staggering 90% failure rate in clinical trials, the incorporation of human genetic evidence into drug development by AGD aims to increase the approval likelihood of drug targets, fostering the discovery of more effective and targeted treatments. AGD also aims to address the limitations of existing genetic resources and studies. The WGS of 250,000 samples, encompassing diverse populations and linked with structured EHR data, provides pharmaceutical members with a powerful resource. This not only accelerates drug discovery but also facilitates the development of tailored therapies. AGD represents a significant step toward healthcare equality, highlighting the importance of diverse genetic studies in progressing drug discovery for the benefit of all people.



Abifadel, M., Varret, M., Rabès, J.-P., Allard, D., Ouguerram, K., Devillers, M., Cruaud, C., Benjannet, S., Wickham, L., Erlich, D., Derré, A., Villéger, L., Farnier, M., Beucler, I., Bruckert, E., Chambaz, J., Chanu, B., Lecerf, J.-M., Luc, G., … Boileau, C. (2003). Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nature Genetics, 34(2), 154–156.

Dowden, H., & Munro, J. (2019). Trends in clinical success rates and therapeutic focus. Nature Reviews. Drug Discovery, 18(7), 495–496.

Fatumo, S., Chikowore, T., Choudhury, A., Ayub, M., Martin, A. R., & Kuchenbaecker, K. (2022). A roadmap to increase diversity in genomic studies. Nature Medicine, 28(2), 243–250.

Harrison, R. K. (2016). Phase II and phase III failures: 2013-2015. Nature Reviews. Drug Discovery, 15(12), 817–818.

Heilbron, K., Mozaffari, S. V, Vacic, V., Yue, P., Wang, W., Shi, J., Jubb, A. M., Pitts, S. J., & Wang, X. (2021). Advancing drug discovery using the power of the human genome. The Journal of Pathology, 254(4), 418–429.

Hinkson, I. V., Madej, B., & Stahlberg, E. A. (2020). Accelerating Therapeutics for Opportunities in Medicine: A Paradigm Shift in Drug Discovery. Frontiers in Pharmacology, 11.

Kaddoura, R., Orabi, B., & Salam, A. M. (2020). PCSK9 Monoclonal Antibodies: An Overview. Heart Views : The Official Journal of the Gulf Heart Association, 21(2), 97–103.

King, E. A., Davis, J. W., & Degner, J. F. (2019). Are drug targets with genetic support twice as likely to be approved? Revised estimates of the impact of genetic support for drug mechanisms on the probability of drug approval. PLOS Genetics, 15(12), e1008489.

Krebs, K., & Milani, L. (2023). Harnessing the Power of Electronic Health Records and Genomics for Drug Discovery. Annual Review of Pharmacology and Toxicology, 63(1), 65–76.

Nelson, M. R., Tipney, H., Painter, J. L., Shen, J., Nicoletti, P., Shen, Y., Floratos, A., Sham, P. C., Li, M. J., Wang, J., Cardon, L. R., Whittaker, J. C., & Sanseau, P. (2015). The support of human genetic evidence for approved drug indications. Nature Genetics, 47(8), 856–860.

Ochoa, D., Karim, M., Ghoussaini, M., Hulcoop, D. G., McDonagh, E. M., & Dunham, I. (2022). Human genetics evidence supports two-thirds of the 2021 FDA-approved drugs. Nature Reviews. Drug Discovery, 21(8), 551.

Robinson, J. G., Farnier, M., Krempf, M., Bergeron, J., Luc, G., Averna, M., Stroes, E. S., Langslet, G., Raal, F. J., El Shahawy, M., Koren, M. J., Lepor, N. E., Lorenzato, C., Pordy, R., Chaudhari, U., & Kastelein, J. J. P. (2015). Efficacy and Safety of Alirocumab in Reducing Lipids and Cardiovascular Events. New England Journal of Medicine, 372(16), 1489–1499.

Sun, D., Gao, W., Hu, H., & Zhou, S. (2022). Why 90% of clinical drug development fails and how to improve it? Acta Pharmaceutica Sinica. B, 12(7), 3049–3062.


The Role of Polygenic Risk Scores in Clinical Genomics

By Clinical Genomics


We were promised the end to genetic diseases. All we needed to do was unlock the human genome. Unfortunately, life has a way of being more complicated than we expect. It turned out that many genetic disorders are the result of the interplay between multiple genetic factors. This set off the need for improved analytical tools to analyze human genetics that could interrogate the associations of many genetic backgrounds and link them to various diseases. One such technique, the Polygenic Risk Score (PRS), emerged as a powerful tool to quantify the cumulative effects of multiple genetic variants on an individual’s predisposition to a specific disease.

The Evolution of Polygenic Risk Scores

The genesis of PRS can be traced back to the early 2000s when researchers sought to comprehend the collective impact of multiple genetic variants on disease susceptibility. Initially viewed through a biological lens, the focus was on enhancing the prediction of diseases by analyzing subtle genomic variations. Studies concentrated on prevalent yet complex diseases such as diabetes, cardiovascular diseases, and cancer, laying the groundwork for a comprehensive understanding of their genetic architecture.


A turning point came when Dr. Sekar Kathiresan’s group showed that the prediction from a PRS could be just as clinically useful as a single high-impact (monogenic) variant (Khera et al., 2018). Instead of comparing the percentage of people with a given PRS among those with and without a disease, the group compared disease risk between the people with the highest and the lowest scores. This made the effect much more obvious: there was a large difference in risk between these two tails of the population.


In the initial stages, PRSs consisted of only the most statistically significant variants from genome-wide association studies. Geneticists often added up the quantity of risk variants without giving them a weight for how much of an impact they had on whether someone would get a disease. Refining these scores led scientists to challenge arbitrary risk cutoffs and advocate for the inclusion of all variants to maximize statistical power (based on the assumption that, on average, variants that have no effect are evenly distributed to appear positively or negatively correlated to the trait). However, proximity of variants on a chromosome presented another challenge. If variants were closer together on a chromosome, they would be less likely to be separated during recombination (Linkage Disequilibrium). This would result in them carrying the signal of something that had a true effect, potentially leading to an overcounting of that signal.
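The difference between simply counting risk alleles and weighting them can be sketched in a few lines. This is a minimal illustration, not any published scoring method: the allele dosages and effect sizes below are made up, and the function names are hypothetical.

```python
def unweighted_score(dosages):
    """Count of risk alleles: each variant contributes equally."""
    return sum(dosages)


def weighted_score(dosages, effect_sizes):
    """Sum of risk-allele dosages, each weighted by its estimated effect size
    (for example, a log odds ratio taken from GWAS summary statistics)."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * beta for d, beta in zip(dosages, effect_sizes))


# Hypothetical effect sizes for three variants.
effect_sizes = [0.05, 0.30, 0.10]

# Risk-allele dosages (0, 1, or 2 copies) for two hypothetical people.
person_a = [2, 0, 1]
person_b = [0, 2, 1]

for name, dosages in [("A", person_a), ("B", person_b)]:
    print(name, unweighted_score(dosages), round(weighted_score(dosages, effect_sizes), 2))
```

Both people carry three risk alleles, so the unweighted count cannot tell them apart, while the weighted score separates them because person B carries two copies of the variant with the largest assumed effect.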


To deal with this, geneticists used tools to remove signals within a specified block unless their correlation with the strongest signal fell below a threshold. One of the first packages, PRSice (Choi & O’Reilly, 2019), used an approach called Pruning and Thresholding. Scientists would choose a block size, say, 200,000 base pairs. A program would go through and slide that block along the genome. If there was more than a single signal in that block, the program would remove (or “prune”) all but the strongest signal unless the variant had a smaller correlation with the strongest signal than the “threshold”. The result was that in a region with many different variants that affected the risk of a disease, but which were still a bit correlated, signal could be lost.
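The pruning step described above can be sketched as a greedy loop. This is a simplified illustration under stated assumptions, not the PRSice implementation: the variant IDs, positions, p-values, and pairwise correlations are invented, the function name is hypothetical, and the window is applied as a distance from each retained variant rather than as a literal sliding block.

```python
def prune_by_ld(variants, r2, window=200_000, r2_threshold=0.1):
    """Greedy LD-based pruning.

    variants: list of (variant_id, position, p_value) tuples.
    r2: dict mapping frozenset({id_a, id_b}) to their squared correlation;
        pairs missing from the dict are treated as uncorrelated.
    Returns the IDs of retained variants, strongest signal first.
    """
    kept = []
    # Visit variants from strongest to weakest signal (smallest p-value first).
    for vid, pos, p_value in sorted(variants, key=lambda v: v[2]):
        tagged_by_stronger = any(
            abs(pos - kept_pos) <= window
            and r2.get(frozenset((vid, kept_id)), 0.0) >= r2_threshold
            for kept_id, kept_pos, _ in kept
        )
        if not tagged_by_stronger:
            kept.append((vid, pos, p_value))
    return [vid for vid, _, _ in kept]


# Invented example data.
variants = [
    ("v1", 100_000, 1e-10),  # strongest signal
    ("v2", 150_000, 1e-6),   # nearby and highly correlated with v1
    ("v3", 180_000, 1e-5),   # nearby but only weakly correlated with v1
    ("v4", 900_000, 1e-4),   # outside the window around v1 and v3
]
r2 = {
    frozenset(("v1", "v2")): 0.80,
    frozenset(("v1", "v3")): 0.05,
    frozenset(("v2", "v3")): 0.02,
    frozenset(("v1", "v4")): 0.90,
}

print(prune_by_ld(variants, r2))
```

In this example v2 is dropped because it sits close to v1 and is strongly correlated with it, which is exactly where a partially independent signal could be lost if v2 also had an effect of its own.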


Criticism from biostatisticians prompted a shift towards a Bayesian approach, reducing over-counting while better accounting for partially independent signals. Implementation was challenged by the extensive computational resources needed to update the signal at each genetic location based on linkage disequilibrium of the surrounding SNPs. One program, called PRS-CS (Ge et al., 2019), implemented a method that could apply changes to a whole linkage block at once, addressing both the geneticist demand for a good system that can provide results using the computation tools we have and the biostatistician demand for accuracy and retained information.


Despite these advancements, accuracy challenges persisted, particularly when applying scoring systems across populations with different genetic ancestries. It turned out Linkage Disequilibrium was a pervasive problem. The patterns of Linkage Disequilibrium are different in people with different genetic ancestries. In fact, even statistics about the patterns themselves, like how big an average block size is, are different. Recognizing the need for improvement, ongoing efforts in refining PRSs aim to address these challenges, paving the way for more accurate and reliable applications. As researchers delve deeper into these complexities, the evolving landscape of PRSs continues to shape the future of clinical research.

Polygenic Risk Scores in Clinical Research Settings

To harness the full potential of PRS in clinical practice, a crucial shift is needed—from population-level insights to personalized predictions for individual patients. This transformation involves converting relative risks, which compare individuals across the PRS spectrum with a baseline group, into absolute risks for the specific disease (Lewis & Vassos, 2020). The current emphasis is on identifying individuals with a high genetic predisposition to disease, forming the foundation for effective risk stratification. This information guides decisions related to participation in screening programs, lifestyle modifications, or preventive treatments when deemed suitable.
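One simple way to picture the conversion from relative to absolute risk is to anchor the relative risks to a known population-average risk. The sketch below assumes that each PRS stratum has a risk ratio relative to a reference stratum and that the population-average lifetime risk is known. All numbers are illustrative, the function name is hypothetical, and real risk models would also account for age, sex, competing risks, and other factors.

```python
def absolute_risks(relative_risks, stratum_weights, population_risk):
    """Convert per-stratum relative risks into absolute risks.

    relative_risks: risk ratio of each PRS stratum versus the reference stratum.
    stratum_weights: fraction of the population in each stratum (sums to 1).
    population_risk: average absolute risk across the whole population.
    """
    if len(relative_risks) != len(stratum_weights):
        raise ValueError("relative_risks and stratum_weights must match in length")
    if abs(sum(stratum_weights) - 1.0) > 1e-9:
        raise ValueError("stratum_weights must sum to 1")

    # The population risk is the weighted average of the stratum risks, so the
    # reference-stratum risk is population_risk divided by the average RR.
    mean_rr = sum(w * rr for w, rr in zip(stratum_weights, relative_risks))
    reference_risk = population_risk / mean_rr
    return [reference_risk * rr for rr in relative_risks]


# Illustrative inputs: PRS quintiles with risk ratios relative to the lowest
# quintile, and an assumed 10% average lifetime risk.
quintile_rr = [1.0, 1.4, 1.8, 2.2, 3.0]
quintile_weights = [0.2] * 5

for label, risk in zip(["Q1", "Q2", "Q3", "Q4", "Q5"],
                       absolute_risks(quintile_rr, quintile_weights, 0.10)):
    print(f"{label}: {risk:.1%}")
```

Framing the result as an absolute lifetime risk per stratum, rather than a ratio against a baseline group, is what allows a score to be compared against clinical decision thresholds.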


In practical applications, PRS demonstrates promise in patient populations with a high likelihood of disease. Consider a recent study in an East Asian population, where researchers developed a PRS for Coronary Artery Disease (CAD) using 540 genetic variants (Lu et al., 2022). Tested on 41,271 individuals, the top 20% had a three-fold higher risk of CAD compared to the bottom 20%, with lifetime risks of 15.9% and 5.8%, respectively. Adding PRS to clinical risk assessment slightly improved accuracy. Notably, individuals with intermediate clinical risk and high PRS reached risk levels similar to high clinical risk individuals with intermediate PRS, indicating the potential of PRS to refine risk assessment and identify those requiring targeted interventions for CAD.


Another application of PRS lies in improving screening for individuals with major disease risk alleles (Roberts et al., 2023). A recent breast cancer risk assessment study explored pathogenic variants in high- and moderate-risk genes (Gao et al., 2021). Over 95% of BRCA1, BRCA2, and PALB2 carriers had a lifetime breast cancer risk exceeding 20%. In contrast, integrating PRS placed over 30% of CHEK2 carriers and almost half of ATM carriers below the 20% threshold.


This trend extends to other diseases, such as prostate cancer, where a separate investigation focused on men with elevated levels of prostate-specific antigen (PSA) (Shi et al., 2023). Through the application of PRS, researchers pinpointed over 100 genetic variations linked to increased PSA levels. Ordinarily, such elevated PSA levels would prompt prostate biopsies to assess potential prostate cancer. By incorporating PRS into the screening process, doctors could account for the natural variation in PSA levels and prevent unnecessary escalation of clinical care. These two studies suggest that PRS integration into health screening enhances accuracy, preventing unnecessary tests and enabling more personalized risk management.


In the realm of pharmacogenetics, efforts to optimize treatment responses continue. While progress has been made in identifying rare high-risk variants linked to adverse drug events, predicting treatment effectiveness remains challenging. The evolving role of PRS in treatment response is particularly evident in statin use for reducing initial coronary events. In a real-world cohort without prior myocardial infarction, an investigation revealed that statin effectiveness varied based on CHD PRSs, with the highest impact in the high-risk group, intermediate in the intermediate-risk group, and the smallest effect in the low-risk group (Oni-Orisan et al., 2022). Post-hoc analyses like this for therapeutics could potentially allow for more targeted enrollment for clinical trial design, substantially reducing the number of participants needed to demonstrate trial efficacy (Fahed et al., 2022).


As the field of genetics continues to advance, PRSs emerge as a potent tool with the potential to aid clinical research. Validated PRSs show promise in enhancing the design and execution of clinical trials, refining disease screening, and developing personalized treatment strategies to improve the overall health and well-being of patients. However, it’s crucial to acknowledge that the majority of PRS studies heavily rely on biased datasets of European ancestry. To refine and improve PRS, a comprehensive understanding of population genetic traits for people of all backgrounds, such as linkage disequilibrium, is essential. Moving forward, the integration of PRS into clinical applications must prioritize datasets with diverse ancestry to ensure equitable and effective utilization across all patient backgrounds. As research in this field progresses, the incorporation of PRS is poised to become an indispensable tool for expediting the development of safer and more efficacious therapeutics.



Choi, S. W., & O’Reilly, P. F. (2019). PRSice-2: Polygenic Risk Score software for biobank-scale data. GigaScience, 8(7).


Fahed, A. C., Philippakis, A. A., & Khera, A. V. (2022). The potential of polygenic scores to improve cost and efficiency of clinical trials. Nature Communications, 13(1), 2922.


Gao, C., Polley, E. C., Hart, S. N., Huang, H., Hu, C., Gnanaolivu, R., Lilyquist, J., Boddicker, N. J., Na, J., Ambrosone, C. B., Auer, P. L., Bernstein, L., Burnside, E. S., Eliassen, A. H., Gaudet, M. M., Haiman, C., Hunter, D. J., Jacobs, E. J., John, E. M., … Kraft, P. (2021). Risk of Breast Cancer Among Carriers of Pathogenic Variants in Breast Cancer Predisposition Genes Varies by Polygenic Risk Score. Journal of Clinical Oncology : Official Journal of the American Society of Clinical Oncology, 39(23), 2564–2573.


Ge, T., Chen, C.-Y., Ni, Y., Feng, Y.-C. A., & Smoller, J. W. (2019). Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nature Communications, 10(1), 1776.


Khera, A. V., Chaffin, M., Aragam, K. G., Haas, M. E., Roselli, C., Choi, S. H., Natarajan, P., Lander, E. S., Lubitz, S. A., Ellinor, P. T., & Kathiresan, S. (2018). Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nature Genetics, 50(9), 1219–1224.


Lewis, C. M., & Vassos, E. (2020). Polygenic risk scores: from research tools to clinical instruments. Genome Medicine, 12(1), 44.


Lu, X., Liu, Z., Cui, Q., Liu, F., Li, J., Niu, X., Shen, C., Hu, D., Huang, K., Chen, J., Xing, X., Zhao, Y., Lu, F., Liu, X., Cao, J., Chen, S., Ma, H., Yu, L., Wu, X., … Gu, D. (2022). A polygenic risk score improves risk stratification of coronary artery disease: a large-scale prospective Chinese cohort study. European Heart Journal, 43(18), 1702–1711.


Oni-Orisan, A., Haldar, T., Cayabyab, M. A. S., Ranatunga, D. K., Hoffmann, T. J., Iribarren, C., Krauss, R. M., & Risch, N. (2022). Polygenic Risk Score and Statin Relative Risk Reduction for Primary Prevention of Myocardial Infarction in a Real-World Population. Clinical Pharmacology and Therapeutics, 112(5), 1070–1078.


Roberts, E., Howell, S., & Evans, D. G. (2023). Polygenic risk scores and breast cancer risk prediction. Breast (Edinburgh, Scotland), 67, 71–77.


Shi, M., Shelley, J. P., Schaffer, K. R., Tosoian, J. J., Bagheri, M., Witte, J. S., Kachuri, L., & Mosley, J. D. (2023). Clinical consequences of a genetic predisposition toward higher benign prostate-specific antigen levels. EBioMedicine, 97, 104838.