2016
DOI: 10.1609/hcomp.v3i1.13266

Linguistic Wisdom from the Crowd

Abstract: Crowdsourcing for linguistic data typically aims to replicate expert annotations using simplified tasks. But an alternative goal — one that is especially relevant for research in the domains of language meaning and use — is to tap into people's rich experience as everyday users of language. Research in these areas has the potential to tell us a great deal about how language works, but designing annotation frameworks for crowdsourcing of this kind poses special challenges. In this paper we define and exemplify …

Cited by 9 publications (6 citation statements) · References 0 publications
“…We relied on a crowd (or ensemble) scoring strategy [19], where scores were averaged across evaluators for each exchange studied. This method is used when there is no ground truth in the outcome being studied, and the evaluated outcomes themselves are inherently subjective (e.g., judging figure skating, National Institutes of Health grants, concept discovery).…”
Section: Key Points (mentioning, confidence: 99%)
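As a concrete illustration of the ensemble (crowd) scoring idea described in this citing passage, here is a minimal Python sketch, assuming ratings are stored as per-item lists of scores; the function name `ensemble_score` and the sample data are hypothetical, not taken from the cited study.

```python
from statistics import mean

def ensemble_score(ratings_by_item: dict[str, list[float]]) -> dict[str, float]:
    """Average each item's ratings across all evaluators.

    Used when there is no ground truth: the 'score' for an item is
    defined as the mean of the individual (subjective) ratings.
    """
    return {item: mean(scores) for item, scores in ratings_by_item.items()}

# Example: three evaluators rate two exchanges on a 1-5 scale.
ratings = {
    "exchange_1": [4.0, 5.0, 4.0],
    "exchange_2": [2.0, 3.0, 3.0],
}
print(ensemble_score(ratings))  # {'exchange_1': ~4.33, 'exchange_2': ~2.67}
```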
“…Due to the subjectivity of rating accuracy and comprehensiveness, we employed an ensemble (or crowdsourcing) scoring strategy [32, 33], by averaging the ratings of the five reviewers for each response. This is comparable to a panel of judges averaging their scores for a performance.…”
Section: Methods (mentioning, confidence: 99%)
“…These scores were combined for a composite score that equally weighted the risks, benefits, alternatives, and overall impression subscores. We used an ensemble scoring strategy, averaging scores across reviewers for each response, as has been used previously in similar studies …”
Section: Methods (mentioning, confidence: 99%)
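This citing passage adds one extra step: each reviewer's subscores are first combined, with equal weight, into a composite, and only then are composites averaged across reviewers. A minimal sketch of that two-step scheme follows; the subscore names match the passage, but everything else (function names, scale, sample values) is illustrative, not taken from the study.

```python
from statistics import mean

SUBSCORES = ("risks", "benefits", "alternatives", "overall_impression")

def composite(review: dict[str, float]) -> float:
    # Equal weighting: the composite is the plain mean of the four subscores.
    return mean(review[key] for key in SUBSCORES)

def ensemble_composite(reviews: list[dict[str, float]]) -> float:
    # Ensemble step: average the per-reviewer composites for one response.
    return mean(composite(r) for r in reviews)

# Example: two reviewers score one response on a 1-5 scale.
reviews_for_response = [
    {"risks": 4, "benefits": 5, "alternatives": 3, "overall_impression": 4},
    {"risks": 3, "benefits": 4, "alternatives": 4, "overall_impression": 4},
]
print(ensemble_composite(reviews_for_response))  # 3.875
```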
“…We used an ensemble scoring strategy, averaging scores across reviewers for each response, as has been used previously in similar studies [24, 34]. A multidisciplinary group of surgeons, including acute care surgery, surgical critical care, vascular surgery, orthopedic surgery, and surgical oncology, reviewed the LLM-based chatbot- and surgeon-generated procedure-specific RBAs for accuracy and completeness using the process described. Reviewers were blinded to the source of the RBA; each response was scored by at least 2 individual reviewers.…”
Section: Measurements of Accuracy and Completeness (mentioning, confidence: 99%)