Background: Metadata are created to describe the corresponding data in a detailed and unambiguous way and are used for various applications in different research areas, for example, data identification and classification. A clear definition of metadata is therefore crucial for further use. Unfortunately, extensive experience with the processing and management of metadata has shown that the term “metadata” and its use are not always unambiguous. Objective: This study aimed to clarify the definition of metadata and the challenges resulting from metadata reuse. Methods: A systematic literature search was performed following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for reporting on systematic reviews. Five research questions were identified to streamline the review process, addressing metadata characteristics, metadata standards, use cases, and problems encountered. The review was preceded by a harmonization process to achieve a shared understanding of the terms used. Results: The harmonization process resulted in a clear set of definitions for metadata processing, focusing on data integration. The subsequent literature review was conducted by 10 reviewers with different backgrounds using the harmonized definitions. The study included 81 peer-reviewed papers from the last decade after applying various filtering steps to identify the most relevant papers. The 5 research questions could be answered, resulting in a broad overview of the standards, use cases, problems, and corresponding solutions for the application of metadata in different research areas. Conclusions: Metadata can be a powerful tool for identifying, describing, and processing information, but its meaningful creation is costly and challenging. The review process uncovered many standards, use cases, problems, and solutions for dealing with metadata.
The presented harmonized definitions and the new schema have the potential to improve the classification and generation of metadata by creating a shared understanding of metadata and its context.
Background: Retrospective research on real-world data provides the ability to gain evidence on specific topics, especially when conducted across different sites in research networks. Such research networks have become increasingly relevant in recent years, not least due to the special situation caused by the COVID-19 pandemic. An important requirement for these networks is data harmonization to ensure semantic interoperability. Aims: In this paper we demonstrate (1) how to use digital infrastructures to run a retrospective study in a research network spanning university and non-university hospital sites, and (2) how to answer a medical question on COVID-19-related changes in diagnosis counts for diabetes-related eye diseases. Materials and methods: The study is retrospective and non-interventional and runs on medical case data documented in routine care at the participating sites. The technical infrastructure consists of the OMOP CDM and other OHDSI tools, provided in a transferable format. An ETL process was used to transfer and harmonize the data to the OMOP CDM. Cohort definitions for each year under observation were created centrally, applied locally against the medical case data of all participating sites, and analyzed with descriptive statistics. Results: The analyses showed an expected drop in the total number of diagnoses and in diagnoses for diabetes in general, whereas the number of diagnoses for diabetes-related eye diseases surprisingly decreased more strongly than for non-eye diseases. Differences in the relative changes of diagnosis counts between sites show an urgent need to conduct multi-center rather than single-site studies to reduce bias in the data. Conclusions: This study demonstrated the ability to take an existing portable and standardized infrastructure and ETL process from a university hospital setting and transfer it to non-university sites.
From a medical perspective, further work is needed to evaluate the quality of the real-world data documented in routine care and to investigate the eligibility of this data for research.
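The core analysis step described above — counting diagnoses per site, year, and diagnosis group, then comparing relative changes between years — can be sketched roughly as follows. The record layout and function names are illustrative assumptions for this sketch, not the study's actual OHDSI tooling:

```python
from collections import defaultdict

# Hypothetical case records as (site, year, diagnosis_group) tuples;
# names and values are illustrative, not the study's real data model.
cases = [
    ("site_a", 2019, "diabetes_eye"), ("site_a", 2019, "diabetes_other"),
    ("site_a", 2019, "diabetes_other"), ("site_a", 2020, "diabetes_other"),
    ("site_b", 2019, "diabetes_eye"), ("site_b", 2020, "diabetes_eye"),
    ("site_b", 2019, "diabetes_other"), ("site_b", 2020, "diabetes_other"),
]

def count_diagnoses(records):
    """Count diagnoses per (site, year, diagnosis group)."""
    counts = defaultdict(int)
    for site, year, group in records:
        counts[(site, year, group)] += 1
    return counts

def relative_change(counts, site, group, baseline_year, target_year):
    """Relative change in diagnosis counts between two years at one site."""
    before = counts[(site, baseline_year, group)]
    after = counts[(site, target_year, group)]
    return (after - before) / before if before else None

counts = count_diagnoses(cases)
# Comparing these per-site relative changes is what reveals between-site
# differences and motivates multi-center rather than single-site studies.
print(relative_change(counts, "site_a", "diabetes_eye", 2019, 2020))
print(relative_change(counts, "site_b", "diabetes_eye", 2019, 2020))
```

In the real study, the counts come from centrally defined cohort definitions executed locally against each site's OMOP CDM instance, so only aggregate numbers need to leave a site.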
Background: In most research projects, budget, staff, and IT infrastructure are limiting resources. Especially for small-scale registries and cohort studies, professional IT support and commercial electronic data capture systems are too expensive. Consequently, these projects use simple local approaches (e.g. Excel) for data capture instead of central data management with web-based data capture and proper research databases. This leads to manual processes to merge, analyze and, if possible, pseudonymize research data from different study sites. Results: To support multi-site data capture, storage, and analysis in small-scale research projects, the corresponding requirements were analyzed within the MOSAIC project. Based on the identified requirements, the Toolbox for Research was developed as a flexible software solution for various research scenarios. Additionally, the Toolbox facilitates the integration of research data as well as metadata by performing the necessary procedures automatically. Toolbox modules also allow the integration of device data. Moreover, the separation of personally identifiable information from medical data, by using only pseudonyms for storing medical data, ensures compliance with data protection regulations. The pseudonymized data can then be exported in SPSS format to enable scientists to prepare reports and analyses. Conclusions: The Toolbox for Research was successfully piloted in the German Burn Registry in 2016, facilitating the documentation of 4350 burn cases at 54 study sites. The Toolbox for Research can be downloaded free of charge from the project website and installed automatically thanks to the use of Docker technology.
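The separation of personally identifiable information (PII) from medical data via pseudonyms can be illustrated with a minimal sketch. The hash-based pseudonym derivation and all names below are assumptions for illustration, not the Toolbox's actual implementation:

```python
import hashlib
import secrets

# Two logically (in practice, physically) separated stores: identifying data
# stays with a trusted party, medical data is keyed only by pseudonym.
identity_store = {}   # pseudonym -> personally identifiable information
medical_store = {}    # pseudonym -> medical data only

def make_pseudonym(patient_id: str, secret: bytes) -> str:
    """Derive a stable, non-reversible pseudonym from a patient identifier.
    (Illustrative keyed-hash scheme; real systems may manage pseudonyms
    differently, e.g. via a dedicated pseudonymization service.)"""
    return hashlib.sha256(secret + patient_id.encode()).hexdigest()[:16]

def register_case(patient_id, pii, medical_data, secret):
    """Store PII and medical data separately under a shared pseudonym."""
    psn = make_pseudonym(patient_id, secret)
    identity_store[psn] = pii           # kept apart from research exports
    medical_store[psn] = medical_data   # exportable for analysis without PII
    return psn

secret = secrets.token_bytes(32)
psn = register_case("12345", {"name": "Jane Doe"}, {"tbsa_percent": 18}, secret)
assert "name" not in medical_store[psn]  # the medical export carries no PII
```

The key property is that analyses run only on `medical_store`, while re-identification requires both the secret and access to `identity_store`.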
Objectives: The TMF (Technology, Methods, and Infrastructure for Networked Medical Research) Data Protection Guide (TMF-DP) makes path-breaking recommendations on data protection in research projects. It includes comprehensive requirements for applications such as patient lists, pseudonymisation services, and consent management services. Nevertheless, it lacks a structured, categorised list of requirements for simplified application in research projects and systematic evaluation. The DFG-funded 3LGM2IHE project ("Three-layer Graph-based meta model – Integrating the Healthcare Enterprise (IHE)") aims to define modeling paradigms and implement modeling tools for planning healthcare information systems. In addition, one of its goals is to create and publish 3LGM² information system architecture design patterns (short: "design patterns") for the community as design models in terms of a framework. A structured list of data protection-related requirements extracted from the TMF-DP is a precondition for integrating functions (3LGM² Domain Layer) and building blocks (3LGM² Logical Tool Layer) into 3LGM² design patterns. Methods: To structure the continuous text of the TMF-DP, requirement types were defined in a first step. In a second step, dependencies and delineations of the definitions were identified. In a third step, the requirements from the TMF-DP were systematically extracted. Based on the identified lists of requirements, a fourth step compared the identified requirements with exemplary open source tools as provided by the "Independent Trusted Third Party of the University Medicine Greifswald" (TTP tools). Results: Four lists of requirements were created, containing requirements for the 'patient list', the 'pseudonymisation service', and 'consent management', as well as cross-component requirements from TMF-DP chapter 6, in a structured form.
In addition to the requirements (A), possible variants (B) of implementations (to fulfill a single requirement) and recommendations (C) were identified. A comparison of the requirements lists with the functional scope of the open source tools E-PIX (record linkage), gPAS (pseudonym management), and gICS (consent management) showed that these fulfill more than 80% of the requirements. Conclusions: A structured set of data protection-related requirements facilitates a systematic evaluation of implementations with respect to fulfillment of the TMF-DP guidelines. These reusable lists provide a decision aid for the selection of suitable tools for new research projects. Finally, these lists form the basis for the development of data protection-related 3LGM² design patterns as part of the 3LGM2IHE project.