Widespread sharing of data from electronic health records and patient-reported outcomes can strengthen the national capacity for conducting cost-effective clinical trials and allow research to be embedded within routine care delivery. While pragmatic clinical trials (PCTs) have been performed for decades, they can now draw on rich sources of clinical and operational data that are continuously fed back to inform research and practice. The Health Care Systems Collaboratory program, initiated by the NIH Common Fund in 2012, engages healthcare systems as partners in discussing and promoting activities, tools, and strategies for supporting active participation in PCTs. The NIH Collaboratory consists of seven demonstration projects and seven problem-specific working group 'Cores', aimed at leveraging the data captured in heterogeneous 'real-world' environments for research, thereby improving the efficiency, relevance, and generalizability of trials. Here, we introduce the Collaboratory, focusing on its Phenotype, Data Standards, and Data Quality Core, and present early observations from researchers implementing PCTs within large healthcare systems. We also identify gaps in knowledge and present an informatics research agenda that includes identifying methods for the definition and appropriate application of phenotypes in diverse healthcare settings, and methods for validating both the definition and execution of electronic health record-based phenotypes.
Standards-based, computable knowledge representations for eligibility criteria are increasingly needed to provide computer-based decision support for automated research participant screening, clinical evidence application, and clinical research knowledge management. We surveyed the literature and identified five aspects of eligibility criteria knowledge representations that contribute to the various research and clinical applications: the intended use of computable eligibility criteria, the classification of eligibility criteria, the expression language for representing eligibility rules, the encoding of eligibility concepts, and the modeling of patient data. We consider three of them (expression language, codification of eligibility concepts, and patient data modeling) to be essential constructs of a formal knowledge representation for eligibility criteria. The requirements for each of the three knowledge constructs vary for different use cases, which therefore should inform the development and choice of the constructs toward cost-effective knowledge representation efforts. We discuss the implications of our findings for standardization efforts toward sharable knowledge representation of eligibility criteria.
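As a rough illustration of how the three essential constructs named above could fit together, the following minimal sketch pairs a tiny expression language (a comparator plus a threshold), coded eligibility concepts, and a simple patient data model. Every name here (`Concept`, `Criterion`, `is_eligible`, the `demo` terminology and its codes) is hypothetical and is not drawn from the surveyed representations; treating missing data as "criterion not met" is likewise an assumption made only for this example.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Concept:
    """Encoded eligibility concept: a terminology name plus a code (both hypothetical)."""
    system: str
    code: str


@dataclass(frozen=True)
class Criterion:
    """One rule in a toy expression language: concept, comparator, threshold."""
    concept: Concept
    comparator: str          # must be a key of COMPARATORS
    threshold: float
    inclusion: bool = True   # False marks an exclusion criterion


# The whole "expression language" of this sketch: five numeric comparators.
COMPARATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}

# Patient data model: each coded concept maps to its most recent numeric value.
Patient = Dict[Concept, float]


def satisfies(patient: Patient, criterion: Criterion) -> bool:
    """Return True if the patient's value meets the rule.

    Missing data counts as "not met" (an assumption of this sketch).
    """
    value = patient.get(criterion.concept)
    if value is None:
        return False
    return COMPARATORS[criterion.comparator](value, criterion.threshold)


def is_eligible(patient: Patient, criteria: List[Criterion]) -> bool:
    """Eligible when every inclusion rule is met and no exclusion rule is met."""
    return all(satisfies(patient, c) == c.inclusion for c in criteria)


AGE = Concept("demo", "AGE_YEARS")
HBA1C = Concept("demo", "HBA1C_PERCENT")

criteria = [
    Criterion(AGE, ">=", 18),                      # inclusion: adult
    Criterion(HBA1C, ">", 10, inclusion=False),    # exclusion: very high HbA1c
]

print(is_eligible({AGE: 54, HBA1C: 7.2}, criteria))
```

Keeping the three constructs in separate types makes it possible to change one of them, for instance swapping the terminology behind `Concept`, without touching the other two, which is one way to read the point that requirements for each construct vary by use case.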
Executive Summary: A movement to create a federated global patient registry containing core data and using a standardized vocabulary for as many as 7,000 rare diseases was launched at a workshop,
Further research focused on defining the clinical characteristics of standard diabetes cohorts is important to identify appropriate phenotype definitions for health, policy, and research.
Current efforts to define and implement health data standards are driven by issues related to the quality, cost and continuity of care, patient safety concerns, and desires to speed clinical research findings to the bedside. The President's goal for national adoption of electronic medical records in the next decade, coupled with the current emphasis on translational research, underscores the urgent need for data standards in clinical research. This paper reviews the motivations and requirements for standardized clinical research data, and the current state of standards development and adoption, including gaps and overlaps, in relevant areas. Unresolved issues and informatics challenges related to the adoption of clinical research data and terminology standards are mentioned, as are the collaborations and activities the authors perceive as most likely to address them.
Background: The national mandate for health systems to transition from ICD-9-CM to ICD-10-CM in October 2015 has an impact on research activities. Clinical phenotypes defined by ICD-9-CM codes need to be converted to ICD-10-CM, which has nearly four times more codes and a very different structure than ICD-9-CM. Methods: We used the Centers for Medicare & Medicaid Services (CMS) General Equivalent Maps (GEMs) to translate, using four different methods, condition-specific ICD-9-CM code sets used for pragmatic trials (n=32) into ICD-10-CM. We calculated the recall, precision, and F score of each method. We also used the ICD-9-CM and ICD-10-CM value sets defined for electronic quality measures as an additional evaluation of the mapping methods. Results: The forward-backward mapping (FBM) method had higher precision, recall, and F-score metrics than simple forward mapping (SFM). The more aggressive secondary (SM) and tertiary mapping (TM) methods resulted in higher recall but lower precision. For clinical phenotype definition, FBM was the best (F=0.67), but was close to SM (F=0.62) and TM (F=0.60), judging on the F-scores alone. The overall difference between the four methods was statistically significant (one-way ANOVA, F=5.749, p=0.001). However, pairwise comparisons between FBM, SM, and TM did not reach statistical significance. A similar trend was found for the quality measure value sets. Discussion: The optimal method for using the GEMs depends on the relative importance of recall versus precision for a given use case. It appears that for clinically distinct and homogeneous conditions, the recall of FBM is sufficient. The performance of all mapping methods was lower for heterogeneous conditions. Since code sets used for phenotype definition and quality measurement can be very similar, there is a possibility of cross-fertilization between the two activities. Conclusion: Different mapping approaches yield different collections of ICD-10-CM codes. All methods require some level of human validation.
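The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the codes and map entries are invented placeholders rather than real GEM rows, the function names are hypothetical, and forward-backward mapping is interpreted here as the union of the forward map's targets with any ICD-10-CM code whose backward-map entry points at a source code. The secondary and tertiary methods are not sketched because the abstract does not define them.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy maps; the codes are placeholders, NOT real GEM entries.
FORWARD_GEM: Dict[str, Set[str]] = {   # ICD-9-CM code -> ICD-10-CM targets
    "9A": {"10A"},
    "9B": {"10B", "10C"},
}
BACKWARD_GEM: Dict[str, Set[str]] = {  # ICD-10-CM code -> ICD-9-CM targets
    "10A": {"9A"},
    "10D": {"9A"},
    "10E": {"9Z"},
}


def simple_forward_map(icd9: Set[str], forward: Dict[str, Set[str]]) -> Set[str]:
    """Collect every ICD-10-CM target listed for the source codes in the forward map."""
    mapped: Set[str] = set()
    for code in icd9:
        mapped |= forward.get(code, set())
    return mapped


def forward_backward_map(
    icd9: Set[str],
    forward: Dict[str, Set[str]],
    backward: Dict[str, Set[str]],
) -> Set[str]:
    """Forward targets plus ICD-10-CM codes whose backward entry hits a source code.

    This reading of "forward-backward mapping" is an assumption of the sketch.
    """
    mapped = simple_forward_map(icd9, forward)
    for icd10, sources in backward.items():
        if sources & icd9:
            mapped.add(icd10)
    return mapped


def precision_recall_f(mapped: Set[str], reference: Set[str]) -> Tuple[float, float, float]:
    """Score a mapped code set against a manually curated reference set."""
    true_positives = len(mapped & reference)
    precision = true_positives / len(mapped) if mapped else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


source_codes = {"9A", "9B"}
reference_set = {"10A", "10B", "10D", "10F"}   # hypothetical gold standard

sfm = simple_forward_map(source_codes, FORWARD_GEM)
fbm = forward_backward_map(source_codes, FORWARD_GEM, BACKWARD_GEM)

for name, mapped in (("SFM", sfm), ("FBM", fbm)):
    p, r, f = precision_recall_f(mapped, reference_set)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this toy case the backward map contributes one extra correct code, so both recall and precision rise for FBM; with real code sets the direction and size of the change would depend on the condition, which is consistent with the need for human validation noted in the conclusion.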
There is little semantic agreement in coding of clinical research data items across coders from 3 professional coding services, even using a very liberal definition of agreement.
Patient registries are essential tools for public health surveillance and research inquiry, and are a particularly important resource for understanding rare diseases. Registries provide consistent data for defined populations and can support the study of the distribution and determinants of various diseases. One advantage of registries is the ability to observe caseload and population characteristics over time, which might facilitate the evaluation of disease incidence, disease etiology, planning, operation and evaluation of services, evaluation of treatment patterns, and diagnostic classification. Any registry program must collect high quality data to be useful for its stated purpose. Registries can be developed for many different needs, and caution should be taken in interpreting registry data, which has inherent biases. We describe the methodological issues, limitations, and ideal features of registries to support various rare disease purposes. The future impact of registries on our understanding of, and interventions for, rare diseases will depend upon technological and political solutions for global cooperation to achieve consistent data (via standards) and regulations for various registry applications.