Meeting the conflicting goals of protecting and maintaining control over sensitive data while also allowing access by third parties constitutes a significant challenge. Secure data infrastructures support data visiting in a highly controlled and monitored environment and, if properly set up and operated, provide high security guarantees through a combination of technical, legal and procedural mechanisms. To ease the deployment of such a secure data infrastructure, we present detailed documentation of its architecture and processes and provide a pre-configured reference implementation, based entirely on open source software, that can be flexibly configured to meet differing security requirements and deployment scenarios. We combine mechanisms for data visiting on secured infrastructure components with optional components for data anonymization and fingerprinting, covered by extensive logging and monitoring functions and embedded in defined processes and contractual frameworks. The setup is based on the experience of operating such a secure infrastructure in the medical domain for almost ten years and addresses the emerging need to make such a solution available to a larger set of stakeholders. We show that our system significantly enhances data visiting and offers a higher level of data isolation, and we present our open source reference implementation.
Data curation is a complex, multi-faceted task. While dedicated data stewards are starting to take care of these activities in close collaboration with researchers for many types of (usually file-based) data in many institutions, this is still rarely the case for data held in relational databases. Beyond large-scale infrastructures hosting, for example, climate or genome data, researchers usually have to create, build and maintain their own database, take care of security patches, and feed data into it in order to use it in their research. Data curation, if it happens at all, usually takes place only after a project is finished, when data may be exported into file repository systems for digital preservation. We present DBRepo, a semantic digital repository for relational databases in a private cloud setting designed to (1) host research data stored in relational databases right from the beginning of a research project, (2) provide separation of concerns, allowing researchers to focus on the domain aspects of the data and their work while bringing in experts to handle classic data management tasks, (3) improve findability, accessibility and reusability by offering semantic mapping of metadata attributes, and (4) support reproducibility in dynamically evolving data through versioning and precise identification and citability of arbitrary subsets of data.
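Point (4) can be illustrated with a minimal sketch of one common way to make a subset of evolving data citable: keep every row version with a validity interval, and store each cited query together with its timestamp and a hash of its result. This is not DBRepo's actual implementation. The table names (`measurements`, `query_store`), the integer timestamps and all function names are hypothetical, and the filter is concatenated into the SQL text purely for brevity, which would not be acceptable for untrusted input.

```python
import hashlib
import sqlite3


def open_store():
    """Create an in-memory database with a versioned table and a query store."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE measurements (
            id INTEGER,
            value REAL,
            valid_from INTEGER NOT NULL,
            valid_to INTEGER  -- NULL while this row version is current
        );
        CREATE TABLE query_store (
            pid TEXT PRIMARY KEY,
            where_sql TEXT,
            as_of INTEGER,
            result_hash TEXT
        );
    """)
    return conn


def upsert(conn, row_id, value, now):
    """Close the current version of a row (if any) and insert a new one."""
    conn.execute(
        "UPDATE measurements SET valid_to = ? WHERE id = ? AND valid_to IS NULL",
        (now, row_id),
    )
    conn.execute(
        "INSERT INTO measurements VALUES (?, ?, ?, NULL)", (row_id, value, now)
    )


def run_as_of(conn, where_sql, as_of):
    """Run a filter against the table state that was valid at time as_of."""
    sql = (
        "SELECT id, value FROM measurements "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?) "
        "AND (" + where_sql + ") ORDER BY id"
    )
    return conn.execute(sql, (as_of, as_of)).fetchall()


def digest(rows):
    """Hash a result set so a later re-execution can be verified."""
    return hashlib.sha256(repr(rows).encode()).hexdigest()


def cite_subset(conn, where_sql, as_of):
    """Store the query, its timestamp and result hash; return an identifier."""
    rows = run_as_of(conn, where_sql, as_of)
    pid = hashlib.sha256(f"{where_sql}|{as_of}".encode()).hexdigest()[:16]
    conn.execute(
        "INSERT OR IGNORE INTO query_store VALUES (?, ?, ?, ?)",
        (pid, where_sql, as_of, digest(rows)),
    )
    return pid


def resolve(conn, pid):
    """Re-execute a cited query and check it still yields the same subset."""
    where_sql, as_of, stored_hash = conn.execute(
        "SELECT where_sql, as_of, result_hash FROM query_store WHERE pid = ?",
        (pid,),
    ).fetchone()
    rows = run_as_of(conn, where_sql, as_of)
    if digest(rows) != stored_hash:
        raise ValueError("re-executed subset does not match the stored hash")
    return rows
```

In this sketch, a subset cited before a later update still resolves to the rows as they were at citation time, while a fresh query with a later timestamp sees the updated values.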