2013
DOI: 10.1007/s00778-013-0324-z

Large-scale linked data integration using probabilistic reasoning and crowdsourcing

Abstract: We tackle the problems of semiautomatically matching linked data sets and of linking large collections of Web pages to linked data. Our system, ZenCrowd, (1) uses a three-stage blocking technique in order to obtain the best possible instance matches while minimizing both computational complexity and latency, and (2) identifies entities from natural language text using state-of-the-art techniques and automatically connects them to the linked open data cloud. First, we use structured inverted indices to quickly …
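The inverted-index blocking mentioned in the abstract can be sketched as follows. This is a minimal, token-based illustration of the general technique, not the paper's three-stage implementation; all names and structure here are assumptions:

```python
from collections import defaultdict

def build_inverted_index(records):
    """Map each token to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for rid, text in records.items():
        for token in text.lower().split():
            index[token].add(rid)
    return index

def candidate_pairs(records):
    """Blocking: only records sharing at least one token become candidate
    matches, avoiding the quadratic all-pairs comparison."""
    index = build_inverted_index(records)
    pairs = set()
    for ids in index.values():
        ids = sorted(ids)
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                pairs.add((ids[i], ids[j]))
    return pairs
```

Records that share no token are never compared, which is what keeps the candidate set (and thus crowd cost and latency) small.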

Cited by 78 publications (38 citation statements)
References 39 publications (61 reference statements)
“…The other approach is to use a machine to narrow down the possible options and then employ the crowd to validate or choose the best-matching one. As an example, the work presented in [23] employs a machine-based algorithm to classify entities while also computing a confidence score. The authors suggest that crowdsourced labeling is required only for entities with low confidence scores produced by the classifier.…”
Section: Entity Annotation and Classification
Confidence: 99%
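The routing strategy described in this citation statement — accept high-confidence machine labels, crowdsource only the rest — can be sketched as below. The threshold value and function names are illustrative assumptions, not from the cited work:

```python
def route_entities(entities, classify, threshold=0.8):
    """Split classifier outputs: keep high-confidence labels automatically,
    and queue low-confidence ones for crowd verification.

    `classify` returns a (label, confidence) pair for each entity."""
    auto, crowd = [], []
    for entity in entities:
        label, conf = classify(entity)
        target = auto if conf >= threshold else crowd
        target.append((entity, label, conf))
    return auto, crowd
```

Only the `crowd` list incurs human-labeling cost, so the threshold directly trades accuracy against budget.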
“…We used this level for all three workflows. Aggregation: for the T2 tasks we used the default option (aggregation='agg'), as the task is to choose from a set of pre-defined options. For T1, we looked at the first three answers (aggregation='agg_3') based on 11 judgments.…”
Section: Quality Control
Confidence: 99%
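Aggregating redundant crowd judgments by frequency, as this statement describes (one winning answer for T2, the top three answers for T1), can be sketched as a simple vote count. This is a generic illustration under assumed names, not the cited platform's aggregation code:

```python
from collections import Counter

def aggregate(judgments, top_k=1):
    """Count identical crowd judgments and return the top_k most frequent
    answers, most common first."""
    counts = Counter(judgments)
    return [answer for answer, _ in counts.most_common(top_k)]
```

With `top_k=1` this is plain majority voting; `top_k=3` over 11 judgments mirrors the 'first three answers' setting mentioned above.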
“…ZenCrowd [9] identifies matching pairs of instances in linked data, using two levels of blocking to select candidate pairs for confirmation by the crowd. A probabilistic factor graph accumulates evidence from different sources, from which a probability that a candidate pair is correct is derived.…”
Section: Related Work
Confidence: 99%
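The evidence accumulation described here can be illustrated with a naive log-odds combination of independent per-source match probabilities. This is only a stand-in for intuition; ZenCrowd's actual model is a full probabilistic factor graph with inference over workers and links, which this sketch does not reproduce:

```python
import math

def combine_evidence(probs, prior=0.5):
    """Combine independent evidence probabilities that a candidate pair is a
    correct match by summing log-odds on top of a prior, then mapping back
    to a probability with the logistic function."""
    logit = math.log(prior / (1 - prior))
    for p in probs:
        logit += math.log(p / (1 - p))
    return 1 / (1 + math.exp(-logit))
```

Two moderately confident, agreeing sources (e.g. 0.9 and 0.8) already push the combined probability above either individual one, which is the qualitative behavior the factor graph exploits.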
“…Many projects have already demonstrated substantial success in applying this idea to crowdsourcing settings; this applies most prominently to games-with-a-purpose (GWAPs) [27], which build a game narrative around human computation tasks such as image labeling [26], protein folding, or language translation. Similarly to the concerns raised in the context of external rewards and incentivisation [18], gamification has been seen, in some contexts, to undermine intrinsic benefits by subjugating and trivialising contributions into simple game goals and achievements. This effect has been called overjustification and has been the subject of various studies with intriguing results; while some negative effects of overjustification have been recurrently reproduced, current research acknowledges that its prevalence seems to be highly context-dependent and that, in most cases, extrinsic rewards complement rather than hamper intrinsic motivations for participating [5,22].…”
Section: Theories of External Reward and Incentivisation
Confidence: 99%