2019
DOI: 10.1007/978-3-030-32251-9_73
Let’s Agree to Disagree: Learning Highly Debatable Multirater Labelling

Abstract: Classification and differentiation of small pathological objects may greatly vary among human raters due to differences in training, expertise and their consistency over time. In a radiological setting, objects commonly have high within-class appearance variability whilst sharing certain characteristics across different classes, making their distinction even more difficult. As an example, markers of cerebral small vessel disease, such as enlarged perivascular spaces (EPVS) and lacunes, can be very varied in th…


Cited by 18 publications (25 citation statements). References 8 publications.
“…We propose and argue for a paradigm shift, which could move away from monolithic, majority-aggregated gold standard datasets, towards the adoption of methods that more comprehensively and inclusively integrate the opinions and perspectives of the human subjects involved in the knowledge representation step of modeling processes. Our proposal comes with important and still-to-investigate implications: first, supervised models equipped with full, non-aggregated annotations have been reported to exhibit better prediction capability [2,14,43], by virtue of a better representation of the phenomena of interest; secondly, new techniques for AI explainability can be devised that describe the classifications of the model in terms of multiple and alternative (if not complementary) perspectives [7,36]; finally, we should consider the ethical implications of the above-mentioned shift and its impact on cognitive computing, as the new generation of models can give voice to, and express, a diversity of perspectives, rather than being a mere reflection of the majority [36,39].…”
Section: Motivations and Background (mentioning; confidence: 99%)
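One way to retain non-aggregated annotations, as the excerpt above advocates, is to train against the empirical label distribution over all raters instead of a majority-vote label. The sketch below is a hypothetical minimal illustration, not the exact method of the cited works; the function and variable names are my own.

```python
import numpy as np

def soft_labels(rater_votes, n_classes):
    """Convert per-item rater votes into an empirical label distribution,
    instead of collapsing them to a single majority-vote label."""
    dist = np.zeros((len(rater_votes), n_classes))
    for i, votes in enumerate(rater_votes):
        for v in votes:
            dist[i, v] += 1.0
        dist[i] /= len(votes)  # normalise to a probability distribution
    return dist

# Three raters agree on the first item but split 1-2 on the second;
# the soft target preserves that disagreement rather than discarding it.
votes = [[0, 0, 0], [0, 1, 1]]
print(soft_labels(votes, 2))
# Row 0 -> [1.0, 0.0]; row 1 -> [1/3, 2/3]
```

These distributions can then serve as targets for a standard cross-entropy loss, making the disagreement itself part of the supervision signal.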
“…On the other hand, we would speak of strong perspectivism whenever the researchers' aim is to collect multiple labels, or multiple data about each class, for a specific object, and keep them all in the subsequent phases of training or benchmarking of the classification models. Doing so certainly impacts model training and evaluation, but can be realized in several ways, of varying complexity [14,43,46]. The simplest and most backward-compatible approach, requiring no ad-hoc implementation, is to replicate each object in the training set as many times as it was assigned a given label by the raters [51]; other methods have also been proposed in the literature, which we describe further in Section 4.…”
Section: Strong and Weak Data Perspectivism (mentioning; confidence: 99%)
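The replication scheme described in the excerpt above can be sketched in a few lines. This is a minimal illustration under my own naming; it is not code from the cited works.

```python
def replicate_by_votes(items, rater_labels):
    """Expand a dataset so each (item, label) pair appears once per rater
    who assigned that label -- the backward-compatible scheme the excerpt
    attributes to [51]."""
    expanded = []
    for item, labels in zip(items, rater_labels):
        for label in labels:  # one training copy per rater vote
            expanded.append((item, label))
    return expanded

# An image labelled EPVS by two raters and lacune by one yields three copies.
data = replicate_by_votes(["img_007"], [["EPVS", "EPVS", "lacune"]])
print(data)
# [('img_007', 'EPVS'), ('img_007', 'EPVS'), ('img_007', 'lacune')]
```

Because the expanded set is an ordinary list of (item, label) pairs, any off-the-shelf training pipeline can consume it unchanged, which is what makes the scheme backward-compatible.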
“…However, with the current trend of acquiring and exploiting very large imaging datasets, the time and resources required to perform this visual QC have become prohibitive. Furthermore, as with other rating tasks, visual QC is subject to inter- and intra-rater variability due to differences in radiological training, rater competence, and sample appearance [13]. Some artefacts, such as those caused by motion, can also be difficult to detect, as their identification requires careful examination of every slice in a volume.…”
Section: Introduction (mentioning; confidence: 99%)
“…However, in many cases there are large variations among annotators, for reasons including human factors and image quality. For variations caused by human factors such as differences in annotators' training, expertise and consistency over time, [12] and [14] present methods to train DNNs to learn the behaviour of individual annotators as well as their consensus. As such, the resulting performance is much better than what can be achieved by learning from one annotator alone.…”
Section: Introduction (mentioning; confidence: 99%)
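One common architectural pattern for learning individual annotator behaviour alongside a consensus, as discussed in the excerpt above, is a shared feature extractor with one output head per rater. The forward pass below is a hypothetical, untrained sketch in plain NumPy (random weights, consensus taken as the mean over heads); it illustrates the shape of such a model, not the specific method of [12] or [14].

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Shared feature extractor (a random linear map, for illustration only)
n_features, n_hidden, n_classes, n_raters = 8, 4, 2, 3
W_shared = rng.normal(size=(n_features, n_hidden))
# One output head per annotator, so each head can absorb that rater's bias
W_heads = rng.normal(size=(n_raters, n_hidden, n_classes))

def forward(x):
    h = np.tanh(x @ W_shared)          # shared representation
    per_rater = softmax(h @ W_heads)   # shape: (n_raters, batch, n_classes)
    consensus = per_rater.mean(axis=0)  # simple average as a consensus proxy
    return per_rater, consensus

x = rng.normal(size=(5, n_features))
per_rater, consensus = forward(x)
print(per_rater.shape, consensus.shape)  # (3, 5, 2) (5, 2)
```

At training time each head would be supervised with its own rater's labels, so the model captures rater-specific tendencies while the shared trunk and consensus output benefit from all annotations.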