2022
DOI: 10.21203/rs.3.rs-1862739/v1
Preprint

Assessment of the Consistency of Categorical Features within the DZHK Biobanking Basic Set

Abstract: Data quality in health research encompasses a broad range of aspects and indicators. While some indicators are generic and can be calculated without domain knowledge, others require information about a specific data element. Even more complex are indicators addressing contradictions that stem from implausible combinations of multiple data elements. In this paper, we investigate how contradictions within interdependent categorical data can be identified and whether they give additional information about possible qu…
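
To make the idea of a contradiction among interdependent categorical data elements concrete, the following is a minimal sketch in Python. It is not the authors' implementation and not the API of any existing tool. The field names, category values, and rules are hypothetical and chosen only for illustration; they are not taken from the DZHK Biobanking Basic Set. Each rule lists a combination of categorical values that should not occur together in one record, and a record is flagged when every field named in the rule takes one of the listed values.

```python
from typing import Dict, List, Set, Tuple

Record = Dict[str, str]
Rule = Dict[str, Set[str]]

# Hypothetical rules (assumptions for illustration only): each rule maps
# field names to the category values that are implausible in combination.
RULES: Dict[str, Rule] = {
    # two-way contradiction
    "male_and_pregnant": {"sex": {"male"}, "pregnant": {"yes"}},
    # three-way contradiction
    "no_diabetes_but_insulin_and_type": {
        "diabetes": {"no"},
        "insulin": {"yes"},
        "diabetes_type": {"type 1", "type 2"},
    },
}


def violates(record: Record, rule: Rule) -> bool:
    """Return True if every field in the rule holds one of its listed values.

    A missing field never matches, so incomplete records are not flagged.
    """
    return all(record.get(field) in values for field, values in rule.items())


def find_contradictions(records: List[Record]) -> List[Tuple[int, str]]:
    """Return (record index, rule name) for every rule a record violates."""
    return [
        (index, name)
        for index, record in enumerate(records)
        for name, rule in RULES.items()
        if violates(record, rule)
    ]


# Hypothetical example records.
records = [
    {"sex": "male", "pregnant": "yes"},
    {"sex": "female", "pregnant": "yes"},
    {"diabetes": "no", "insulin": "yes", "diabetes_type": "type 1"},
]
findings = find_contradictions(records)
```

Keeping the rules as data rather than as code is one possible design choice: it lets a data manager add or change rules without touching the checking logic, and it extends to four-way or larger combinations by adding more fields to a rule.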

Cited by 1 publication (2 citation statements)
References 10 publications
“…The logical rules implemented in this study revealed another interdependency dimension in the consistency framework which differs from the three- and four-dimensional checks from our own previous experience in the Biobank domain [12] or the item-wise (one-dimensional) checks of Schmidt et al and Johnson et al [3, 4]. In one instance, this interdependency dimension requires the comparison of multiple sets of data items (severity indicators) and their dependent items (e.g. each class of COVID-19 severity has different sets of indicators as demonstrated in the section “Consistency of COVID-19 Severity and Its Indicators”).…”
Section: Discussion (mentioning)
confidence: 51%
“…In a recent study, a data quality assessment workflow specific to the Biobank domain introduced another dimension of interdependency among data items where three-way and four-way multi-item contradictions were demonstrated [12]. This framework was implemented as a SmartR plugin for the open source analysis platform tranSMART [13]. In the current work, we have adapted our design to the existing con_contradictions framework by Schmidt et al [4] to ease the use of the tool by data managers and transfer sites.…”
Section: Methods (mentioning)
confidence: 99%