2021
DOI: 10.48550/arxiv.2102.05884
Preprint

OpinionRank: Extracting Ground Truth Labels from Unreliable Expert Opinions with Graph-Based Spectral Ranking

Abstract: As larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information with which to train sophisticated models. To address this problem, crowdsourcing has emerged as a popular, inexpensive, and efficient data mining solution for performing distributed label collection. However, crowdsourced annotations are inherently untrustworthy, as the labels are provided by anonymous volunteers who may have varyin…

Citations: cited by 2 publications (2 citation statements)
References: 23 publications (32 reference statements)
“…For the MNIST dataset, we use auxiliary deep generative models as our SSL algorithm due to its small parameter footprint [50]; for the SVHN dataset, we chose the FixMatch algorithm as representative of the current state-of-the-art for semi-supervised learning [75]. For both datasets, we use OpinionRank as our learning from crowds algorithm due to its nonparametric nature and fast performance [18], and selected DivideMix for learning from noisy labels due to its state-of-the-art performance on a wide variety of noisy labels tasks [44].…”
Section: Comparisons To State-of-the-art Under Adversarial Label Noise (mentioning)
confidence: 99%
“…For many of these applications, hardware accelerators such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs) are a highly-effective solution, especially when mixed-precision and reduced-precision arithmetic come into play [1]-[6]. As spectral methods become ubiquitous in the large scale graph pipelines of Spectral Clustering [7], Information Retrieval (IR) [8] and ranking [9], such techniques require algorithms that can compute only a subset of the most relevant eigenvalues (i.e. the largest in modulo) and their associated eigenvectors while taking advantage of the sparsity of real-world graphs.…”
Section: Introduction (mentioning)
confidence: 99%