2019
DOI: 10.5334/egems.289

A Data Element-Function Conceptual Model for Data Quality Checks

Abstract: Introduction: In aggregate, existing data quality (DQ) checks are currently represented in heterogeneous formats, making it difficult to compare, categorize, and index checks. This study contributes a data element-function conceptual model to facilitate the categorization and indexing of DQ checks and explores the feasibility of leveraging natural language processing (NLP) for scalable acquisition of knowledge of common data elements and functions from DQ check narratives. Methods: …
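The abstract's central idea is that a DQ check can be decomposed into a data element (what is checked) and a function (how it is checked). A minimal sketch of that decomposition, where the class name, the example elements (`birth_date`, `systolic_bp`), and the predicate functions are all illustrative assumptions rather than taken from the paper:

```python
# Sketch of the element-function decomposition of DQ checks.
# Element names, function names, and predicates below are illustrative,
# not drawn from the paper's 751 elements / 24 functions.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class DQCheck:
    """A data quality check expressed as a (data element, function) pair."""
    element: str                     # the data element being checked
    function_name: str               # the check function, by name
    predicate: Callable[[Any], bool] # executable form of the function

# Two illustrative checks applied to one record
checks = [
    DQCheck("birth_date", "not_null", lambda v: v is not None),
    DQCheck("systolic_bp", "in_range",
            lambda v: v is not None and 40 <= v <= 300),
]

record = {"birth_date": "1980-05-01", "systolic_bp": 520}
results = {f"{c.function_name}({c.element})": c.predicate(record.get(c.element))
           for c in checks}
# results -> {'not_null(birth_date)': True, 'in_range(systolic_bp)': False}
```

Because each check is indexed by its element-function pair, a catalog of checks can be compared and deduplicated on those two keys regardless of how the original check was phrased.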

Cited by 6 publications (7 citation statements) · References 36 publications (45 reference statements)
“…In 2018, there followed a contribution by Gold et al, on the challenges of concept value sets and their possible reuse; indicative of the acuity of the challenge is that not all co-authors could sign up to every view expressed in the paper 51 ! Rogers et al, 52 then analyzed data element–function combinations in checks from two environments, ultimately identifying 751 unique elements and 24 unique functions, supporting their systematic approach to DQ check definition. Most recently, Chunhua Weng has offered a lifecycle perspective, indeed a philosophy, for clinical DQ for research 53 , while Seneviratne, Kahn, and Hernandez-Boussard have provided an overview of challenges for the merging of heterogeneous data sets, with an eye both on integration across institutions (where adherence to standards may be sufficient) and across “modalities”, the latter term interpreted in its widest possible sense, encompassing genomics, imaging, and patient-reported data from wearables 54 .…”
Section: Results
confidence: 94%
“…In 2018, there followed a contribution by Gold et al, on the challenges of concept value sets and their possible reuse; indicative of the acuity of the challenge is that not all co-authors could sign up to every view expressed in the paper [51]! Rogers et al, [52] then analyzed data element-function combinations in checks from two environments, ultimately identifying 751 unique elements and 24 unique functions, supporting their systematic approach to DQ check definition. Most recently, Chunhua…”
Section: Common Data Models, Data Quality, and Standards
confidence: 95%
“…These were used as substrate for an inductive thematic analysis 46,47 ; since our intent was to identify stable themes in DQ literature rather than assess conformance to a previously defined set of concepts, we adopted the reflexive approach described by Braun and Clarke 48 . An initial set of labeled codes was developed by the authors based on detailed examination of 10 publications describing broad‐based DQ frameworks 1,15,16,19,20,27,28,34,42,49,50 . These codes were consolidated, and an additional 20 publications were reviewed to augment the initial list until saturation was reached.…”
Section: Methods
confidence: 99%
“…To address the problem of describing data quality, researchers and other stakeholders have developed standardized terminologies and consensus‐driven frameworks to facilitate evaluation and communication. 12‐27 For example, the harmonized consensus‐derived DQ terminology developed by Kahn et al is based on the terms conformance , completeness , and plausibility across the axes verification and validation . Weiskopf et al 27 propose a 3 x 3 DQA to evaluate DQ along the context‐specific constructs time , variables , and patients .…”
Section: Introduction
confidence: 99%
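The excerpt above cites Kahn et al.'s harmonized DQ terminology (conformance, completeness, plausibility). A hedged sketch of how checks named by function could be mapped onto those categories; the specific category assignments below are assumptions for illustration, not taken from the cited terminology paper:

```python
# Illustrative mapping of check functions onto Kahn et al.'s harmonized
# DQ categories. The assignments are this sketch's assumptions, not the
# cited paper's definitions.
KAHN_CATEGORY = {
    "matches_regex": "conformance",   # value format agrees with the spec
    "not_null": "completeness",       # value is present at all
    "in_range": "plausibility",       # value is believable
}

def categorize(function_name: str) -> str:
    """Return the DQ category for a check function, by name."""
    return KAHN_CATEGORY.get(function_name, "uncategorized")

print(categorize("in_range"))  # -> plausibility
```

Such a mapping would let a catalog of element-function checks be summarized along the harmonized terminology's dimensions without re-reading each check's narrative.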
“…Like previous related studies [10,11,52], we adopted an inductive and iterative approach in abstracting and codifying features and relevant considerations identified from the articles. An expanded literature review was also conducted to help refine specified features; in addition to the articles selected from the systematic search above, other articles discussing aspects relevant to developing or implementing DQA programs were reviewed, including materials such as DQ checks (rules) from large scale implementations [50,53], DQ frameworks and published best-practices [29][30][31][54][55][56][57], including those designed especially for EHR data [10,11,32,43,52,[58][59][60][61][62][63]. These additional materials were identified using Google Scholar web searches and manual searches of references in included studies.…”
Section: Data Extraction and Analysis
confidence: 99%