2015
DOI: 10.1007/978-3-319-22849-5_17

Uncertain Groupings: Probabilistic Combination of Grouping Data

Abstract: A bioinformatician has a large number of homology data sources to choose from. These data sources need to be combined before a query can be posed over the combined data. We propose a generic probabilistic approach to combining grouping data from multiple sources. Our approach incorporates an iteratively evolving view on trust, allowing the bioinformatician to express his fine-grained view on how much the data in the sources can be trusted. We evaluate our approach by combining 3 real-world biological databases…

Cited by 7 publications (7 citation statements)
References 15 publications
“…The latter presents itself both in the number of partitionings as well as in the size of the descriptive sentences. From our experience with a bio-informatics use case [6], the number of partitionings can easily grow into the thousands in real-world applications. The size of the descriptive sentences is determined by the complexity of the dependencies between assertions, its low-level representation, and allowed expressiveness.…”
Section: Optimizations (mentioning)
confidence: 99%
“…In our research we actively apply this technology for soft computing data processing tasks such as indeterministic deduplication [4], probabilistic XML data integration [5], and probabilistic integration of data about groupings [6]. Based on these experiences, we find that there are still important open problems in dealing with uncertain data and that the available systems are inadequate on certain aspects.…”
Section: Introduction (mentioning)
confidence: 99%
“…Throughout the paper we use an information extraction scenario as running example: the "Paris Hilton example". Although this scenario is from the Natural Language Processing (NLP) domain, note that it is equally applicable to other data integration scenarios such as semantic duplicates [7], entity resolution, uncertain groupings [8], etc.…”
Section: Running Example (mentioning)
confidence: 99%
“…For details on the first phase, we refer to [2,3], as well as [7][8][9] for techniques on specific extraction and integration problems (merging semantic duplicates, merging grouping data, and information extraction from natural language text, respectively). This paper focuses on the second phase of this process, namely on the problem of how to incorporate evidence of users in the probabilistically integrated data with the purpose to continuously improve its quality as more evidence is gathered.…”
Section: Introduction (mentioning)
confidence: 99%