Collaboratory presents opportunities, challenges for EHR phenotyping efforts

September 6, 2013 - The new NIH Health Care Systems Research Collaboratory represents a novel opportunity to conduct pragmatic clinical trials within the context of large health care systems. Among other things, the Collaboratory allows investigators to use electronic health records (EHRs) to identify clinical conditions that meet the needs of research planning and protocols.

These possibilities are the subject of a recent article by Duke’s Rachel Richesson, PhD, MPH; Shelley Rusincovitch; Cynthia Kluchar from the DCRI; the DTMI’s Ed Hammond, PhD; Meredith Nahm, PhD; Douglas Wixted, MMCi; Michelle Smerek; and Robert Califf, MD; and colleagues from Group Health Research Institute, Kaiser Permanente Northwest Center for Health Research, the University of Iowa, and the University of Pennsylvania. The article appears in the online version of the Journal of the American Medical Informatics Association.

The use of EHR data to describe clinical characteristics, events, and service patterns for specific patient populations is known as EHR-based phenotyping. In the context of clinical trials, EHR phenotyping uses data captured in the delivery of health care to identify individuals or populations with conditions or events relevant to interventional, observational, prospective, or retrospective studies. To locate these individuals and populations, researchers must use explicitly defined queries that can be applied across different data sets. However, because health care providers do not collect patient information in EHRs in a standardized way, designing phenotype definitions is often a difficult and time-intensive process.
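To make the idea of an explicitly defined query concrete, here is a minimal Python sketch of one phenotype rule applied unchanged to records from two sites. Everything in it is an illustrative assumption rather than something taken from the article: the Record fields, the function names, the sample patients, and the rule itself (a diagnosis code beginning with "250" or an HbA1c value of at least 6.5 percent, standing in for a diabetes phenotype).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified view of one patient's EHR data."""
    patient_id: str
    diagnosis_codes: Tuple[str, ...]
    hba1c_percent: Optional[float] = None  # most recent lab value, if any


def meets_phenotype(record: Record) -> bool:
    """Illustrative rule: a diagnosis code starting with '250' OR HbA1c >= 6.5."""
    has_code = any(code.startswith("250") for code in record.diagnosis_codes)
    has_lab = record.hba1c_percent is not None and record.hba1c_percent >= 6.5
    return has_code or has_lab


def find_cohort(records: List[Record]) -> List[str]:
    """Apply the same explicit definition to any data set; return sorted patient IDs."""
    return sorted({r.patient_id for r in records if meets_phenotype(r)})


# Two made-up data sets standing in for different health systems.
site_a = [Record("A1", ("250.00",)), Record("A2", ("401.9",), 5.4)]
site_b = [Record("B1", (), 7.1), Record("B2", ("V70.0",))]

print(find_cohort(site_a))
print(find_cohort(site_b))
```

In this toy case, site A's patient qualifies through a coded diagnosis while site B's qualifies only through a lab value. That difference hints at why non-standardized data collection makes a single definition hard to design: the same condition can be recorded in different places at different sites.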

The Collaboratory’s Phenotype, Data Standards, and Data Quality (PSQ) Core was established to address these challenges and identify best practices for using EHR phenotypes in research applications. These applications include identifying patients for recruitment into prospective trials; describing patient cohorts for analysis of existing data in comparative effectiveness or health services research; presenting baseline characteristics to describe research populations by demographics, clinical features, and comorbidities for clinical trials; presenting primary outcomes to test the trial hypothesis; and implementing tools for health care providers that are embedded within EHR systems.

As it moves forward with its work, the PSQ Core will have to consider a number of issues, including data quality, completeness, and accuracy. Its biggest challenge will be ensuring consistent collection of relevant data by identifying defined study populations and their features (genetic, biological, psychological, and social), characterizing disease in terms of progression and patient impact, and unambiguously identifying procedures and treatments.

The Collaboratory’s efforts in confronting these challenges, the researchers conclude, will play a vital role in advancing EHR phenotyping in the world of pragmatic clinical trials.