NIH Collaboratory Cores and Working Groups


Phenotypes, Data Standards, and Data Quality

Co-Chairs: W. Ed Hammond, Jr., Meredith Nahm Zozus, Rachel Richesson

NIH Representative: Jerry Sheehan

Members: Monique Anderson, Alan Bauck, Denise Cifelli, Lesley Curtis, John Dickerson, Pedro Gozalo, Beverly Green, Chris Helker, Beverly Khan, Michael Kahn, Cindy Kluchar, Reesa Laws, Melissa Leventhal, John Lynch, Rosemary Madigan, Vincent Mor, George "Holt" Oliver, Jon Puro, Alee Rowley, Shelley Rusincovitch, Greg Simon, Kari Stephens, Erik Van Eaton

Project Manager: Michelle Smerek

The secondary use of electronic health record (EHR) data for clinical research requires not only an understanding of data standards, interoperability, and the influence of workflows, but also the development and implementation of valid approaches for identifying cohorts with clinical conditions. This involves collaboration among clinicians, EHR experts, and informaticians to develop algorithms, or computable phenotypes, for identifying patients with clinical conditions being studied by researchers. There are many ways to identify patients who have been diagnosed with a specific condition, and understanding the pros and cons of the various approaches is essential for using EHRs effectively in pragmatic clinical trials.
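As an illustration of what a computable phenotype can look like, the following is a minimal Python sketch. It is not a Collaboratory phenotype definition. The record layout (`PatientRecord`), the code prefix, the medication list, and the laboratory threshold are all hypothetical values chosen for the example. It combines three kinds of EHR evidence (diagnosis codes, medications, and laboratory results) and lets the caller choose how many must agree, which shows how different approaches to the same condition can yield different cohorts.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR data."""
    patient_id: str
    diagnosis_codes: list = field(default_factory=list)
    medications: list = field(default_factory=list)
    hba1c_results: list = field(default_factory=list)


# Illustrative criteria only; a validated definition would be far more specific.
DIAGNOSIS_CODE_PREFIX = "250."            # assumed diabetes-related code family
DIABETES_MEDICATIONS = {"metformin", "glipizide"}
HBA1C_THRESHOLD = 6.5                     # percent


def has_diagnosis(record):
    """True if any diagnosis code falls in the assumed code family."""
    return any(code.startswith(DIAGNOSIS_CODE_PREFIX)
               for code in record.diagnosis_codes)


def has_medication(record):
    """True if any listed medication is in the illustrative medication set."""
    return any(name.lower() in DIABETES_MEDICATIONS
               for name in record.medications)


def has_elevated_lab(record):
    """True if any HbA1c result meets or exceeds the threshold."""
    return any(value >= HBA1C_THRESHOLD for value in record.hba1c_results)


def matches_phenotype(record, min_criteria=2):
    """Count how many evidence types are present and compare to the minimum."""
    criteria_met = sum([
        has_diagnosis(record),
        has_medication(record),
        has_elevated_lab(record),
    ])
    return criteria_met >= min_criteria


def identify_cohort(records, min_criteria=2):
    """Return the IDs of patients who satisfy the phenotype."""
    return [r.patient_id for r in records
            if matches_phenotype(r, min_criteria)]
```

Requiring only one criterion tends to capture more patients at the cost of more false positives, while requiring two or more tends to do the reverse. Weighing that trade-off against the needs of a particular study is the kind of judgment the paragraph above describes.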

Furthermore, comprehensive data characterization and data quality assessment enable investigators to match a research question with data of appropriate quality. The Phenotypes, Data Standards, and Data Quality Core supports these efforts across the Collaboratory and makes tools available to the wider research community.
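A rough sketch of what a basic data quality check might involve is shown below in Python. It is not drawn from the Core's tools. The function names, the row format (a list of dictionaries), and the plausibility limits are assumptions made for illustration. The sketch computes two simple measures for a single field: completeness (the share of rows with a value present) and plausibility (the share of present values that fall within an expected range).

```python
def _is_present(value):
    """Treat None and empty strings as missing."""
    return value is not None and value != ""


def completeness(rows, field_name):
    """Fraction of rows in which field_name has a non-missing value."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if _is_present(row.get(field_name)))
    return present / len(rows)


def plausibility(rows, field_name, low, high):
    """Fraction of non-missing values of field_name within [low, high]."""
    values = [row[field_name] for row in rows
              if _is_present(row.get(field_name))]
    if not values:
        return 0.0
    in_range = sum(1 for value in values if low <= value <= high)
    return in_range / len(values)
```

An investigator could compare such measures against thresholds appropriate to the research question before deciding whether a data source is fit for use. The thresholds themselves depend on the study and are not specified here.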

The Core’s activities include the following:

Develop phenotype definitions

  • Work with the Demonstration Project teams, the NIH, and the investigator community to identify phenotypes of interest

  • Develop and test phenotype algorithms for use within and across projects

  • Employ standard data elements from public repositories that are most likely to be collected in healthcare settings

    • The Common Data Model for the Collaboratory’s Distributed Research Network will be expanded to include required data elements, as necessary

Identify data validation best practices

  • Identify best practices for use of EHR data and disseminate them to participating clinical systems

  • Collect input from clinicians, data experts, and informaticians to identify data capture issues, relationships between data capture and use, and within-system variation

  • Leverage ongoing work on information quality assessment and statistical approaches

  • Consult with each project regarding best practices for identifying and addressing data quality issues before they become a barrier to valid research inferences

Store generalizable definitions and best practices in an accessible format

  • Compile and maintain a library of computable phenotype definitions and algorithms for common and important conditions or characteristics

  • Where data elements are not yet standardized, advance the process and make them available in public registries

Use standards organizations to move these measures into practice

  • Work with standards organizations to improve data collection in health systems and contribute to a learning healthcare system

  • Develop a suite of standards that encompasses what is required and appropriate for a collaborating center

  • Formalize standards through existing, accredited standards-developing organizations

  • Produce implementation guides that define what standards are to be used, what data elements will be exchanged, and what format and coding systems will be used in those exchanges

Recent Presentation


Rachel Richesson, PhD, MPH, Duke University School of Nursing, and Kin Wah Fung, MD, National Library of Medicine, discuss the ICD-10 transition and its implications for pragmatic trials.

Products and Publications

Type 2 Diabetes Mellitus Phenotype Definition Resources and Recommendations 

Race/Ethnicity Data Standard 

Phenotypes Environmental Scan 

Assessing Data Quality for Healthcare Systems Data Used in Clinical Research (Version 1.0) 

Electronic Health Records-Based Phenotyping Living Textbook Chapter 

Sex Data Standard 

Phenotype Literature Search Suggestions 

Richesson RL, et al. J Am Med Inform Assoc 2013 

Richesson RL, et al. J Am Med Inform Assoc 2013


8/14/2015: Grand Rounds Presentation: ICD-10 Transition: Implications for Pragmatic Trials (Video; Slides)

8/20/2014: Data Quality Assessment Presentation at Steering Committee Meeting 

8/19/2014: Phenotypes, Data Standards, and Data Quality Core Presentation at Steering Committee Meeting 

6/27/2014: Grand Rounds Presentation: What Is a Computable Phenotype and Why Do I Care? (Video; Slides)

4/8/2014: AMIA Conference Presentation: Standardized Representation for Electronic Health Record-Driven Phenotypes 

2/25/2014: Phenotypes, Data Standards, and Data Quality Core Presentation at Steering Committee Meeting 

2/24/2014: Table 1 Presentation at Steering Committee Meeting 

12/6/2013: Grand Rounds Presentation: Data Elements: Bridging Clinical and Research Data (Video; Slides)

11/15/2013: Grand Rounds Presentation: Practical Development and Implementation of EHR Phenotypes (Video; Slides)

3/22/2013: Grand Rounds Presentation: Phenotypes, Quality, and Data Elements (Video; Slides)

2/1/2013: Grand Rounds Presentation: Enhancing EHR Data for Research and Learning Healthcare (Video; Slides)


11/11/2013: Hammond Makes Presentation to Health Care Standards Conference 

10/25/2013: Standardizing EHR Research Queries Across Health Systems 

9/6/2013: Collaboratory’s First Official Publication Discusses Opportunities, Challenges for EHR Phenotyping Efforts


6/5/2015: Dr. W. Ed Hammond Discusses the Phenotypes, Data Standards, and Data Quality Core

9/22/2014: Dr. W. Ed Hammond Discusses the Phenotypes, Data Standards, and Data Quality Core 

10/23/2012: Dr. W. Ed Hammond Discusses the Collaboratory and Electronic Phenotypes
