2016
DOI: 10.1109/tcbb.2015.2453944

Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction

Abstract: Understanding complex biological phenomena involves answering complex biomedical questions on multiple biomolecular information simultaneously, which are expressed through multiple genomic and proteomic semantic annotations scattered in many distributed and heterogeneous data sources; such heterogeneity and dispersion hamper the biologists' ability of asking global queries and performing global evaluations. To overcome this problem, we developed a software architecture to create and maintain a Genomic and Prot…

Cited by 23 publications (10 citation statements)
References 36 publications

“…The data management backend has been produced in the last 2 years, capitalizing on several previous years of experience in the use of bio-ontologies for specific research projects (e.g. SOS-GeM (41) and GPKB (42)). The repository currently integrates about 40 million metadata items from five sources, described by 39 attributes over eight connected tables of the core schema and enriched with terms from eight different ontologies, which have been reduced to the same knowledge schema.…”
Section: Results
mentioning
confidence: 99%
“…A common approach in integrated data management is data warehousing, consisting of a-priori integration and reconciliation of data extracted from multiple sources, such as in EnsMart/BioMart [13,34]. Along this direction, [22] describes a warehouse for integrating genomic and proteomic information using generalization hierarchies and a modular, multilevel global schema to overcome differences among data sources. ER modeling (and UML class diagrams) were used in [5]; models describe protein structures and genomic sequences, with rather complex concepts aiming at completely representing the underlying biology.…”
Section: Related Work
mentioning
confidence: 99%
“…We ran our prediction tests on the datasets of Gene Ontology annotations of Homo sapiens genes available in the Genomic and Proteomic Data Warehouse (GPDW) [29], [30], an integrated data resource publicly and freely available from Politecnico di Milano at http://www.bioinformatics.deib.polimi.it/GPKB/ which includes multiple versions of annotation datasets. We applied our prediction pipeline on the annotations of the July 2009 GPDW version and then validated the predicted annotations by looking for them in the March 2013 GPDW version [31].…”
Section: Prediction Pipeline and Datasets
mentioning
confidence: 99%
“…We applied our prediction pipeline on the annotations of the July 2009 GPDW version and then validated the predicted annotations by looking for them in the March 2013 GPDW version [31]. Despite the March 2013 not being the most updated GPDW version, we used it because it is one of the most stable and accurate versions recently delivered [29]. We chose the Homo sapiens gene annotations to the three GO sub-ontologies (Biological Process, Molecular Function, Cellular Component) because they include representative numbers of genes and GO terms.…”
Section: Prediction Pipeline and Datasets
mentioning
confidence: 99%
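
The validation step described in the two citation statements above (predict annotations from an older GPDW version, then look for them in a newer version) can be illustrated with a minimal Python sketch. This is not the citing authors' pipeline: the function names, the exact-match comparison on (gene, GO term) pairs, the filtering of already-known annotations, and the toy identifiers are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: an annotation is a (gene id, GO term id) pair.
Annotation = Tuple[str, str]


def validate_predictions(
    predicted: Set[Annotation],
    old_version: Set[Annotation],
    new_version: Set[Annotation],
) -> Tuple[Set[Annotation], Set[Annotation]]:
    """Split predictions into those found in the newer version and those not found.

    Assumption: predictions already present in the older version are not
    counted, since they carry no new information.
    """
    novel = predicted - old_version
    confirmed = novel & new_version
    unconfirmed = novel - new_version
    return confirmed, unconfirmed


def confirmation_rate(confirmed: Set[Annotation], unconfirmed: Set[Annotation]) -> float:
    """Fraction of novel predictions that appear in the newer version."""
    total = len(confirmed) + len(unconfirmed)
    return len(confirmed) / total if total else 0.0


# Toy data with placeholder identifiers (not real genes or GO terms).
old_annotations = {("geneA", "GO:term1"), ("geneB", "GO:term2")}
new_annotations = old_annotations | {("geneA", "GO:term3")}
predictions = {("geneA", "GO:term3"), ("geneB", "GO:term4"), ("geneA", "GO:term1")}

confirmed, unconfirmed = validate_predictions(
    predictions, old_annotations, new_annotations
)
rate = confirmation_rate(confirmed, unconfirmed)
```

A prediction missing from the newer version is only unconfirmed, not necessarily wrong, because annotation datasets are incomplete; a rate computed this way is therefore best read as a lower bound on prediction quality.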