2016
DOI: 10.1016/j.ijmedinf.2016.07.009

Validating the extract, transform, load process used to populate a large clinical research database

Abstract: Background: Informaticians at any institution that are developing clinical research support infrastructure are tasked with populating research databases with data extracted and transformed from their institution’s operational databases, such as electronic health records (EHRs). These data must be properly extracted from these source systems, transformed into a standard data structure, and then loaded into the data warehouse while maintaining the integrity of these data. We validated the correctness of the extra…

Cited by 54 publications (30 citation statements)
References 6 publications
“…In an effort to validate coding, our institution performs an annual audit for code selection as part of each reviewed encounter and uses a baseline of ≥80 % as the minimal threshold. Furthermore, we have validated the correctness of our institutional data base elements to our EMRs and found a concordance rate of >96 %, and after correcting for the apparent discordances, the database was found to be 100 % accurate [30]. Labor management guidelines that promote vaginal delivery are intended to reduce the number of unnecessary CDs.…”
Section: Discussion (mentioning)
confidence: 95%
“…Unless the data are validated for research, the quality of studies generated from EHRs may be debatable. 6 9 Furthermore, the validity of different disease definitions is not always the same in a given dataset. Some diseases (such as asthma) might be coded using less specific symptoms, whereas the validity of diagnoses with very specific symptoms (such as tension pneumothorax) is likely to be better.…”
Section: Introduction (mentioning)
confidence: 99%
“…The complexity of clinical data structure and diversity of medical operations requires implementation of complex ETL before the data are loaded into CDW storage. Different medical departments require various tools to connect different data sources and deal with a variety of data formats produced to apply ETLs [10,23]. To establish and implement successful CDWs, many approaches are available such as requirements on users, information, regularity and ethics.…”
Section: CDW Characteristics (mentioning)
confidence: 99%