Statins remain one of the most effective tools for reducing cardiovascular risk, yet real-world adherence is shaped by patient-level barriers that are difficult to identify and study at scale. Intolerance, contraindications, and patient deferral are often documented only in the free-text narrative of clinical notes, where critical context about side effects, patient preferences, and nuanced clinical reasoning is missed by systematic analyses that lack advanced informatics solutions.

A new study from Vanderbilt University Medical Center (VUMC), published February 9, 2026, showcases exactly how artificial intelligence can help unlock these hidden signals. Researchers developed a novel three-part AI framework that analyzed 197,761 clinical notes from 47,192 adult patients seen at Vanderbilt Health over a single month. The results were striking: the model identified documented statin intolerance in 6.4% of patients, contraindications in 0.7%, and patient deferral of statin therapy in 2.9%.

The framework combined a highly sensitive rule-based natural language processing (NLP) filter (100% sensitivity), an LLM-based refinement filter using OpenAI’s ChatGPT-4o (97.3% specificity), and a final LLM-based multicategory classifier. Computation costs were minimal, making the approach scalable for real-time clinical decision support.
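To make the three-stage design concrete, here is a minimal Python sketch of how such a cascade could be wired together. It is an illustration under stated assumptions, not the study's implementation: the keyword patterns, the function names (rule_filter, classify_note), and the two stub functions standing in for the LLM calls are all hypothetical, and no real LLM API is invoked.

```python
import re
from typing import Callable, Optional

# Hypothetical trigger patterns; the study's actual rules are not described here.
STATIN_PATTERN = re.compile(
    r"\b(statin|atorvastatin|rosuvastatin|simvastatin|pravastatin)\b", re.I
)
CUE_PATTERN = re.compile(
    r"\b(intoleran\w*|myalgia\w*|contraindicat\w*|declin\w*|defer\w*|refus\w*)\b", re.I
)

CATEGORIES = ("intolerance", "contraindication", "patient_deferral")


def rule_filter(note: str) -> bool:
    """Stage 1: broad, recall-oriented screen (statin mention plus any cue word)."""
    return bool(STATIN_PATTERN.search(note) and CUE_PATTERN.search(note))


def classify_note(
    note: str,
    refine: Callable[[str], bool],
    classify: Callable[[str], str],
) -> Optional[str]:
    """Run one note through the cascade; return a category or None."""
    if not rule_filter(note):   # Stage 1: cheap rules discard most notes
        return None
    if not refine(note):        # Stage 2: precision-oriented yes/no check
        return None
    label = classify(note)      # Stage 3: multicategory label
    return label if label in CATEGORIES else None


# Keyword stubs standing in for the two LLM calls (assumptions for illustration).
def stub_refine(note: str) -> bool:
    return "no statin intolerance" not in note.lower()


def stub_classify(note: str) -> str:
    text = note.lower()
    if "contraindicat" in text:
        return "contraindication"
    if any(cue in text for cue in ("declin", "defer", "refus")):
        return "patient_deferral"
    return "intolerance"


if __name__ == "__main__":
    notes = [
        "Patient reports myalgia on atorvastatin; statin intolerance documented.",
        "Patient declined statin therapy after discussion.",
        "No statin intolerance reported; continue simvastatin.",
        "Follow-up for knee pain.",
    ]
    for n in notes:
        print(classify_note(n, stub_refine, stub_classify), "<-", n)
```

The ordering reflects the cost logic described above: inexpensive rules run on every note, and the more expensive model calls are reserved for the small share of notes that pass the first screen.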

Led by Siru Liu, PhD, and Adam Wright, PhD, in the Department of Biomedical Informatics, the study highlights a critical truth in precision medicine: the richest insights into treatment response often reside in unstructured text. As the study authors note, this information is often not captured in structured clinical data fields and may remain inaccessible without analysis of clinical notes.

At NashBio, a wholly owned subsidiary of VUMC, we are uniquely positioned to turn these kinds of breakthroughs into scalable, partner-ready solutions. With direct access to VUMC’s vast de-identified clinical notes repository – part of a synthetic derivative EHR covering 4.1 million unique patients and a median 4.4 years of longitudinal data – NashBio can systematically extract and analyze note-based insights to identify treatment response patterns that structured data alone does not capture.

NashBio’s Clinical Data and Analytic Capabilities: From Unstructured Text to Structured Insights

NashBio’s Clinical Data product is the backbone of our offerings. It includes structured elements (diagnoses, medications, labs, procedures) and features derived from unstructured clinical notes to deliver high-value, context-rich information. Particularly relevant to cardiovascular research, our extractions include:

  • Key cardiovascular feature extracts such as New York Heart Association (NYHA) Functional Classification scores for heart failure, ejection fraction classification, echocardiogram findings, and electrocardiogram results.
  • Detailed documentation of medication intolerance, side effects, patient preferences, contraindications, and reasons for treatment discontinuation or deferral — critical signals frequently found only in free-text clinical notes.
  • Objective findings from diagnostic reports that refine phenotypic understanding and support deeper insights into treatment effectiveness and safety.

NashBio uses AI-enabled advanced analytics to convert unstructured notes into structured, analysis-ready data. We help clients generate insights across a range of research use cases, including drug efficacy, safety, and treatment response.

Use Case Spotlight: NashBio used structured clinical data plus unstructured notes to develop an algorithm for detecting bullous pemphigoid, a condition where diagnostic confirmation may depend on pathology results, specialist documentation, and other details found in clinical notes. The unstructured notes were essential for validating true cases and training the note-based algorithm, which achieved 94% sensitivity and improved overall performance (as measured by the kappa coefficient) over prior methods based on ICD coding alone.
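For readers less familiar with the two metrics named in this use case, the short Python sketch below shows how sensitivity and Cohen's kappa are computed from a 2x2 confusion matrix. The counts are invented purely for illustration and are not the bullous pemphigoid results; the function names are our own.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of true cases that the algorithm flags: TP / (TP + FN)."""
    return tp / (tp + fn)


def cohens_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Agreement between algorithm and reference labels, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the 2x2 table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to keep the arithmetic easy to follow.
    tp, fp, fn, tn = 40, 5, 10, 45
    print(f"sensitivity = {sensitivity(tp, fn):.2f}")           # 0.80
    print(f"kappa       = {cohens_kappa(tp, fp, fn, tn):.2f}")  # 0.70
```

Sensitivity alone says nothing about false positives, which is why a chance-corrected agreement measure such as kappa is commonly reported alongside it.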

Our blog series details these capabilities further.

Clinical Specialty Products and Treatment Response Analysis

Through our real-world data (RWD), genetic data, and advanced analytics offerings, partners can discover gene-treatment interactions at scale. With 353,000 unique DNA samples and over 250,000 whole genomes sequenced in BioVU®, linked to longitudinal EHR data that includes clinical notes, we enable studies that connect phenotypic nuances to genotypic drivers of response or intolerance.

In statin research, for example, intolerance signals extracted from clinical notes can be integrated with genomic data to explore biomarkers that may not be apparent in structured data alone. This approach may help researchers better understand adverse events, refine patient stratification, or generate hypotheses for therapeutic development.
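As a rough illustration of this kind of linkage, the Python sketch below joins a note-derived intolerance flag to a genotype flag on a shared de-identified patient ID and compares intolerance rates between groups. Everything here is hypothetical: the patient IDs, the "carrier" flag for an unnamed variant, and the function name are assumptions, and a real analysis would need appropriate statistical testing and adjustment for confounders.

```python
from collections import Counter
from typing import Dict


def intolerance_rate_by_carrier(
    note_labels: Dict[str, bool],     # patient_id -> intolerance documented in notes
    carrier_status: Dict[str, bool],  # patient_id -> carries a (hypothetical) variant
) -> Dict[str, float]:
    """Link the two sources on patient ID and return the intolerance rate per group."""
    totals: Counter = Counter()
    intolerant: Counter = Counter()
    # Only patients present in both sources can be analyzed.
    for pid in note_labels.keys() & carrier_status.keys():
        group = "carrier" if carrier_status[pid] else "non_carrier"
        totals[group] += 1
        intolerant[group] += int(note_labels[pid])
    return {group: intolerant[group] / totals[group] for group in totals}


if __name__ == "__main__":
    # Invented toy records for illustration only.
    notes = {"p1": True, "p2": False, "p3": False, "p4": False, "p5": True}
    genotypes = {"p1": True, "p2": True, "p3": False, "p4": False, "p6": True}
    print(intolerance_rate_by_carrier(notes, genotypes))
```

Patients missing from either source (p5 and p6 in the toy data) drop out of the comparison, which is one reason the size of the linked cohort matters for this kind of study.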

Why NashBio? Unmatched Access + Clinical and Technical Expertise

NashBio combines robust data assets with clinical and technical expertise. Our team includes clinicians, biostatisticians, data scientists, and other domain experts who support cohort design, data curation, and interpretation informed by real-world clinical practice.

Data can be delivered in flexible formats, including the OMOP common data model or curated flat files. We also work with partners to assemble fit-for-purpose, multimodal datasets optimized for their specific use cases—whether training AI models for clinical decision support, building synthetic control arms, or advancing precision medicine programs.

The Broader Impact: Accelerating Discovery and Improving Patient Care

VUMC’s statin study demonstrates the value of accessing information embedded in clinical notes. NashBio scales this capability across large, longitudinal patient populations to support research partners in biopharma, diagnostics, and AI development. By extracting relevant signals from unstructured clinical text, we help clients:

  • Optimize clinical trial design and patient selection
  • Identify novel drug targets and safety signals earlier
  • Develop more accurate predictive algorithms
  • Generate real-world evidence that supports regulatory and payer decisions

With NashBio’s multimodal real-world data curated from longitudinal EHRs and validated by a 20-year legacy with more than 700 research publications, we turn the hidden signals in clinical notes into actionable intelligence.

Partner with NashBio to Unlock Your Next Insight

If your organization is exploring statin optimization, cardiovascular outcomes, or any therapeutic area where unstructured clinical data holds the key, NashBio is ready to support you. Our Clinical Specialty products, such as CS Cardio, together with advanced analytic capabilities, use VUMC-derived data assets and note-derived insights to support deeper clinical research.

Visit www.nashbio.com to learn more about our Clinical Data, Genetic Data, and Solutions offerings, or contact our team to discuss how we can customize a dataset for your needs.