Background: Understanding the genome, with all of its components and intrinsic relationships, is a great challenge. Conceptual modeling techniques have been used as a means to face this challenge. Because genomic use cases are heterogeneous and idiosyncratic, these techniques tend to produce conceptual schemes that focus on overly specific scenarios (i.e., they are species-specific conceptual schemes). Our research group developed two different conceptual schemes. The first one is the Conceptual Schema of the Human Genome, which is intended to improve Precision Medicine and genetic diagnosis. The second one is the Conceptual Schema of the Citrus Genome, which is intended to identify the genetic cause of relevant phenotypes in the agri-food field. Methods: Our two conceptual schemes have been ontologically compared to identify their similarities and differences. Based on this comparison, several changes have been made to the Conceptual Schema of the Human Genome in order to obtain the first version of a species-independent Conceptual Schema of the Genome. Identifying the different genome information items used in each genomic case study has been essential in achieving our goal. The changes needed to provide an expanded, more generic version of the Conceptual Schema of the Human Genome are analyzed and discussed. Results: This work presents a new conceptual schema, the Conceptual Schema of the Genome, that is ready to be adapted to any specific genome-based working context (i.e., it is species-independent). Conclusion: The generated Conceptual Schema of the Genome works as a global, generic element from which conceptual views can be created in order to work with any specific species. This first working version can be used in the human use case, in the citrus use case, and, potentially, in use cases for other species.
The ability to sequence the human genome is a scientific and historical breakthrough. Although the human genome mapping is available to all scientists, information about it can be difficult to share. The Conceptual Schema of the Human Genome represents the concepts required to holistically understand the human genome. We report on our continued efforts to ensure that the human genome can be meaningfully shared by conducting an ontological analysis and enrichment of the conceptual model to facilitate domain understanding and data exchange among heterogeneous systems. The analysis and enrichment process is supported by the ontology-driven conceptual modeling language OntoUML, which is used to gain ontological clarity, and is demonstrated on a relevant section of the Pathways view of the schema. Consistent with the overall objective of designing a sound genomics information system, the results lead to major modeling implications for the characterization of biological entities, changes in biological entities over time, and the representation of chemical compounds. Our research shows that the inclusion of a strong ontological foundation in a conceptual model contributes to the design of complex systems.
The human genome is traditionally represented as a DNA sequence of three billion base pairs. However, its intricacies are captured by many more complex signals, representing DNA variations, the expression of gene activity, or DNA's structural rearrangements; a rich set of data formats is used to represent such signals. Different conceptual models explain such elaborate structure and behavior. Among them, the Conceptual Schema of the Human Genome (CSG) provides a concept-oriented, top-down representation of the genome behavior, independent of data formats. The Genomic Conceptual Model (GCM) instead provides a data-oriented, bottom-up representation, targeting a well-organized, unified description of these formats. We hereby propose to join these two approaches to achieve a more complete vision, linking (1) a concepts layer, describing genome elements and their conceptual connections, with (2) a data layer, describing datasets derived from genome sequencing with specific technologies. The link is established when specific genomic data types are chosen in the data layer, thereby triggering the selection of a view in the concepts layer. The benefit is mutual, as data records can be semantically described by high-level concepts and exploit their links. In turn, the continuously evolving abstract model can be extended thanks to the input provided by real datasets. As a result, it will be possible to express queries that employ a holistic conceptual perspective on the genome, directly translated into data-oriented terms and organization. The approach is here exemplified using the DNA variation data type but is applicable to all genomic information.
Background: Precision medicine is a promising approach that has revolutionized disease prevention and individualized treatment. The DELFOS oracle is a model-driven genomics platform that aids clinicians in identifying relevant variations that are associated with diseases. In its previous version, the DELFOS oracle did not consider the high degree of variability of genomics data over time. However, changes in genomics data have had a profound impact on clinicians’ work and create the need to revise past, present, and future clinical actions. Therefore, our objective in this work is to consider changes in genomics data over time in the DELFOS oracle. Methods: Our objective has been achieved through three steps. First, we studied the characteristics of each database from which the DELFOS oracle extracts data. Second, we characterized which genomics concepts of the conceptual schema that supports the DELFOS oracle change over time. Third, we updated the DELFOS oracle so that it can manage the temporal dimension. To validate our approach, we carried out a use case to illustrate how the new version of the DELFOS oracle handles the temporal dimension. Results: Three events can change genomics data, namely, the addition of a new variation, the addition of a new link between a variation and a phenotype, and the update of a link between a variation and a phenotype. These events have been linked to the entities of the conceptual model that are affected by them. Finally, a new version of the DELFOS oracle that can deal with the temporal dimension has been implemented. Conclusion: Huge amounts of genomics data associated with diseases change over time, impacting patients’ diagnosis and treatment. Including this information in the DELFOS oracle added an extra layer of complexity, but using a model-driven approach mitigated the cost of implementing the needed changes. The new version handles the temporal dimension appropriately and eases clinicians’ work.
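As an illustration only, the three change events named in the abstract above (a new variation, a new variation-phenotype link, and an updated link) could be recorded as a dated log and replayed up to a chosen date. This is a minimal sketch and not the DELFOS oracle's implementation: the names `ChangeEvent`, `Change`, and `links_as_of`, the `significance` field, and all identifiers in the sample log are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ChangeEvent(Enum):
    # The three kinds of change listed in the abstract.
    NEW_VARIATION = "new variation added"
    NEW_LINK = "new variation-phenotype link added"
    UPDATED_LINK = "variation-phenotype link updated"


@dataclass(frozen=True)
class Change:
    when: date
    event: ChangeEvent
    variation: str                       # hypothetical identifier
    phenotype: Optional[str] = None      # set only for link events
    significance: Optional[str] = None   # hypothetical link attribute


def links_as_of(changes, cutoff):
    """Replay changes dated on or before `cutoff`.

    Returns {(variation, phenotype): significance} as it stood on that date.
    """
    links = {}
    for change in sorted(changes, key=lambda c: c.when):
        if change.when > cutoff:
            break
        if change.event in (ChangeEvent.NEW_LINK, ChangeEvent.UPDATED_LINK):
            links[(change.variation, change.phenotype)] = change.significance
    return links


# Made-up sample log: one variation, one link, one later update of that link.
log = [
    Change(date(2020, 1, 10), ChangeEvent.NEW_VARIATION, "var-1"),
    Change(date(2020, 3, 5), ChangeEvent.NEW_LINK, "var-1", "phenotype-A", "uncertain"),
    Change(date(2021, 7, 20), ChangeEvent.UPDATED_LINK, "var-1", "phenotype-A", "pathogenic"),
]

print(links_as_of(log, date(2020, 6, 1)))
print(links_as_of(log, date(2022, 1, 1)))
```

Keeping the events as an append-only log, rather than overwriting the current state, is one way to answer what was known at an earlier date, which matches the concern about revisiting past clinical actions.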