2010
DOI: 10.1515/jib-2010-119
Quality controls in integrative approaches to detect errors and inconsistencies in biological databases

Abstract (summary): Numerous biomolecular data are available, but they are scattered across many databases and only some of them are curated by experts. Most available data are computationally derived and include errors and inconsistencies. Effective use of the available data to derive new knowledge therefore requires data integration and quality improvement. Many approaches for data integration have been proposed. Data warehousing seems to be the most adequate when comprehensive analysis of the integrated data is required. This …

Cited by 10 publications
(7 citation statements)
References 0 publications
“…We import them from several well known public databases, including Entrez Gene, UniProt, IntAct, MINT, BioCyc, KEGG, Reactome, GO, GOA and OMIM. The numerous integrated data, which regard biomolecular entities (mainly genes and proteins) and their biomedical features and associations, are all checked for correctness and consistency (Ghisalberti et al, 2010). By leveraging the imported similarity and historical evaluation data available, we identify different IDs from different data sources as representing the same entity.…”
Section: Methods
confidence: 99%
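The ID reconciliation described in the statement above — treating pairwise evidence such as cross-references, similarity data, and historical ID replacements as "same entity" links and grouping the connected IDs — can be sketched with a union-find structure. This is a hypothetical illustration of the general technique, not the cited system's actual implementation; all identifiers and pairings below are made up for the example.

```python
# Hypothetical sketch: grouping identifiers from different databases that
# refer to the same biomolecular entity, given pairwise "same as" evidence
# (e.g. cross-references, similarity data, historical ID replacements).
# The IDs and pairs are illustrative, not the paper's actual data.
from collections import defaultdict

def find(parent, x):
    # Union-find lookup with path compression: follow parents to the root.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def reconcile(ids, same_as_pairs):
    """Cluster IDs connected by 'same entity' evidence."""
    parent = {i: i for i in ids}
    for a, b in same_as_pairs:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb  # merge the two clusters
    clusters = defaultdict(set)
    for i in ids:
        clusters[find(parent, i)].add(i)
    return list(clusters.values())

# Toy example: an Entrez Gene ID, a current UniProt accession and a
# superseded accession all end up in one cluster; the KEGG ID stays alone.
ids = ["EntrezGene:7157", "UniProt:P04637", "UniProt:Q9NZD0", "KEGG:hsa:1017"]
pairs = [("EntrezGene:7157", "UniProt:P04637"),  # cross-reference
         ("UniProt:Q9NZD0", "UniProt:P04637")]   # historical replacement
print(sorted(len(c) for c in reconcile(ids, pairs)))  # → [1, 3]
```

Transitive merging is the point of using union-find here: two IDs never directly linked (the Entrez and retired UniProt IDs above) still land in the same cluster through a shared neighbor.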
“…In [Chen et al, 2007] the authors propose an ontology-based framework to detect inconsistencies in biological databases. This task is approached as a quality-control problem in [Ghisalberti et al, 2010]. In [Park et al, 2011], as many as 27 databases are used to check GO; besides syntactic errors, semantic inconsistencies are checked concerning redundancy and the use of species-specific definitions.…”
Section: Biological Ontologies
confidence: 99%
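Two of the semantic checks mentioned above — redundancy (distinct term IDs carrying the same definition) and species-specific wording in definitions — can be sketched as simple scans over term records. The term data, species list, and function names below are invented for illustration and are not taken from the cited studies.

```python
# Hypothetical sketch of two semantic consistency checks on ontology terms:
# (1) redundant terms: distinct IDs with identical definitions;
# (2) species-specific definitions: wording tied to one organism.
# All terms and the species word list are illustrative examples.

SPECIES_WORDS = {"human", "mouse", "yeast", "drosophila"}  # assumed list

def check_terms(terms):
    """terms: list of (term_id, name, definition) tuples. Returns issues."""
    issues = []
    by_def = {}
    for tid, name, definition in terms:
        key = definition.strip().lower()
        if key in by_def:
            issues.append(("redundant", tid, by_def[key]))
        else:
            by_def[key] = tid
        words = definition.lower().replace(".", "").split()
        if any(w in SPECIES_WORDS for w in words):
            issues.append(("species-specific", tid, None))
    return issues

terms = [
    ("GO:0001", "apoptotic process", "Programmed cell death."),
    ("GO:0002", "apoptosis", "Programmed cell death."),           # redundant
    ("GO:0003", "liver development", "Development of the human liver."),
]
for issue in check_terms(terms):
    print(issue)
```

Real ontology QC pipelines work on parsed OBO/OWL structures and use far richer criteria; this sketch only shows the shape of such rule-based checks.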
“…In order to ease maintenance and extension of the defined integrated data schema, each feature module is internally organized in two levels: an import tier and an aggregation tier. The import tier structures and co-locates the originally distributed data, while thoroughly checking their consistency and quality [22], identifying the feature they refer to and their main attributes, and associating each feature entry with a unique OID. The import tier is composed of separate subschemas, one for each considered data source that provides data for that feature, individually structured as in the original data source, i.e.…”
Section: Table 1 Example of Gene Feature Entries and Main Attributes
confidence: 99%
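The two-level organization described in the statement above — per-source import subschemas that preserve each source's original structure, plus a unique OID assigned to every feature entry — can be sketched as follows. The class, source names, and record layouts are assumptions for illustration, not the actual warehouse schema.

```python
# Hypothetical sketch of a feature module's import tier: each data source
# keeps its own subschema, structured as in the original source, and every
# feature entry receives a unique OID for later aggregation. Source names
# and record layouts are illustrative.
import itertools

class FeatureModule:
    def __init__(self, feature_name):
        self.feature = feature_name
        self.import_tier = {}      # source name -> raw records, as imported
        self.oids = {}             # (source, source-local id) -> OID
        self._next_oid = itertools.count(1)

    def import_source(self, source, records):
        # Keep the source's data in its own subschema, unmodified in shape.
        self.import_tier[source] = records
        for rec in records:
            key = (source, rec["id"])
            if key not in self.oids:   # exactly one OID per feature entry
                self.oids[key] = next(self._next_oid)

genes = FeatureModule("gene")
genes.import_source("EntrezGene", [{"id": "7157", "symbol": "TP53"}])
genes.import_source("KEGG", [{"id": "hsa:7157", "name": "TP53"}])
print(genes.oids)
# → {('EntrezGene', '7157'): 1, ('KEGG', 'hsa:7157'): 2}
```

Separating the as-imported subschemas from the OID mapping mirrors the maintenance benefit the statement describes: adding or updating one source touches only its own subschema, while the aggregation tier can later relate OIDs that denote the same entity.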