Background Provenance supports the understanding of data genesis, and it is a key factor to ensure the trustworthiness of digital objects containing (sensitive) scientific data. Provenance information contributes to a better understanding of scientific results and fosters collaboration on existing data as well as data sharing. This encompasses defining comprehensive concepts and standards for transparency and traceability, reproducibility, validity, and quality assurance during clinical and scientific data workflows and research. Objective The aim of this scoping review is to investigate existing evidence regarding approaches and criteria for provenance tracking as well as disclosing current knowledge gaps in the biomedical domain. This review covers modeling aspects as well as metadata frameworks for meaningful and usable provenance information during creation, collection, and processing of (sensitive) scientific biomedical data. This review also covers the examination of quality aspects of provenance criteria. Methods This scoping review will follow the methodological framework by Arksey and O'Malley. Relevant publications will be obtained by querying PubMed and Web of Science. All papers in English language will be included, published between January 1, 2006 and March 23, 2021. Data retrieval will be accompanied by manual search for grey literature. Potential publications will then be exported into a reference management software, and duplicates will be removed. Afterwards, the obtained set of papers will be transferred into a systematic review management tool. All publications will be screened, extracted, and analyzed: title and abstract screening will be carried out by 4 independent reviewers. Majority vote is required for consent to eligibility of papers based on the defined inclusion and exclusion criteria. Full-text reading will be performed independently by 2 reviewers and in the last step, key information will be extracted on a pretested template. If agreement cannot be reached, the conflict will be resolved by a domain expert. Charted data will be analyzed by categorizing and summarizing the individual data items based on the research questions. Tabular or graphical overviews will be given, if applicable. Results The reporting follows the extension of the Preferred Reporting Items for Systematic reviews and Meta-Analyses statements for Scoping Reviews. Electronic database searches in PubMed and Web of Science resulted in 469 matches after deduplication. As of September 2021, the scoping review is in the full-text screening stage. The data extraction using the pretested charting template will follow the full-text screening stage. We expect the scoping review report to be completed by February 2022. Conclusions Information about the origin of healthcare data has a major impact on the quality and the reusability of scientific results as well as follow-up activities. This protocol outlines plans for a scoping review that will provide information about current approaches, challenges, or knowledge gaps with provenance tracking in biomedical sciences. International Registered Report Identifier (IRRID) DERR1-10.2196/31750
Background The anonymization of Common Data Model (CDM)–converted EHR data is essential to ensure the data privacy in the use of harmonized health care data. However, applying data anonymization techniques can significantly affect many properties of the resulting data sets and thus biases research results. Few studies have reviewed these applications with a reflection of approaches to manage data utility and quality concerns in the context of CDM-formatted health care data. Objective Our intended scoping review aims to identify and describe (1) how formal anonymization methods are carried out with CDM-converted health care data, (2) how data quality and utility concerns are considered, and (3) how the various CDMs differ in terms of their suitability for recording anonymized data. Methods The planned scoping review is based on the framework of Arksey and O'Malley. By using this, only articles published in English will be included. The retrieval of literature items should be based on a literature search string combining keywords related to data anonymization, CDM standards, and data quality assessment. The proposed literature search query should be validated by a librarian, accompanied by manual searches to include further informal sources. Eligible articles will first undergo a deduplication step, followed by the screening of titles. Second, a full-text reading will allow the 2 reviewers involved to reach the final decision about article selection, while a domain expert will support the resolution of citation selection conflicts. Additionally, key information will be extracted, categorized, summarized, and analyzed by using a proposed template into an iterative process. Tabular and graphical analyses should be addressed in alignment with the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) checklist. We also performed some tentative searches on Web of Science for estimating the feasibility of reaching eligible articles. Results Tentative searches on Web of Science resulted in 507 nonduplicated matches, suggesting the availability of (potential) relevant articles. Further analysis and selection steps will allow us to derive a final literature set. Furthermore, the completion of this scoping review study is expected by the end of the fourth quarter of 2023. Conclusions Outlining the approaches of applying formal anonymization methods on CDM-formatted health care data while taking into account data quality and utility concerns should provide useful insights to understand the existing approaches and future research direction based on identified gaps. This protocol describes a schedule to perform a scoping review, which should support the conduction of follow-up investigations. International Registered Report Identifier (IRRID) PRR1-10.2196/46471
The need to harness large amounts of data, possibly within a short period of time, became apparent during the Covid-19 pandemic outbreak. In 2022, the Corona Data Exchange Platform (CODEX), which had been developed within the German Network University Medicine (NUM), was extended by a number of common components, including a section on FAIR science. The FAIR principles enable research networks to evaluate how well they comply with current standards in open and reproducible science. To be more transparent, but also to guide scientists on how to improve data and software reusability, we disseminated an online survey within the NUM. Here we present the outcomes and lessons learnt.
BACKGROUND Provenance supports the understanding of data genesis and it is a key factor to ensure the trustworthiness of the digital objects containing (sensitive) scientific data. Provenance information contributes to a better understanding of scientific results and fosters collaboration on existing data as well as data-sharing. This encompasses defining comprehensive concepts and standards for transparency and traceability, reproducibility, validity and quality assurance during clinical and scientific data workflows and/or research. OBJECTIVE The aim of this scoping review is to investigate approaches and challenges for provenance tracking as well as disclosing current knowledge gaps in the area. The review covers modeling aspects as well as metadata frameworks for capturing meaningful and usable provenance information during creation, collection and processing of (sensitive) scientific biomedical data. The objective of the review also includes the examination of quality aspects of provenance criteria. METHODS The scoping review will follow the methodological framework by Arksey and O'Malley. Relevant publications will be obtained by querying PubMed and Web of Science. All articles in English language will be included, within the time period between 2006 and 23-March 2021. Database retrieval will be accompanied by manual search for grey literature. Potential publications will then be exported into a reference management software, and duplicates will be removed. Afterwards, the obtained set of papers will be transferred into a systematic review management tool. All publications will be screened, extracted and analyzed: title and abstract screening will be carried out by 4 independent reviewers. Majority vote is required for consent to eligibility of articles based on defined inclusion and exclusion criteria. Full-text reading will be performed independently by 2 reviewers and in the last step key information will be extracted on a template which has been evaluated by the reviewers beforehand. If agreement cannot be reached, the conflict will be resolved by a domain expert. Charted data will be analyzed by categorizing and summarizing the individual data items based on the research questions. Tabular or graphical overviews will be given, if applicable. RESULTS The reporting follows the extension of the PRISMA statements for scoping reviews (PRISMA-ScR). Electronic database searches in PubMed and Web of Science resulted in 469 matches after deduplication. As of June 2021, the scoping review is in the full text screening stage. The data extraction using the pretested charting template will follow the full text screening stage. We expect the scoping review report to be completed by the end of 2021. CONCLUSIONS Information about the origin of healthcare data has a major impact on the quality and the reusability of scientific results as well as follow-up activities. This scoping review will provide information about current approaches, challenges or knowledge gaps with provenance tracking in biomedical sciences.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.