Dissonance Between Human and Machine Understanding

Zhang, Zijian; Singh, Jaspreet; Gadiraju, Ujwal; Anand, Avishek

doi:10.1145/3359158

Cited by 58 publications

(41 citation statements)

References 53 publications

(52 reference statements)

“…Another limitation is that for computing human MEPIs in the case of Crop and Combined, the void image begins with a central pixel, which may be far from the relevant region for classification. Refinements of the human MEPI framework would be interesting to explore in the future, for example to replace Crop with a more higher-level component-based analysis [31].…”

Section: Discussion (mentioning)

confidence: 99%

“…It would further be interesting to explore how models trained in such a manner perform in more traditional classification metrics (error rates, precision, etc.) on the original input images, as well as whether or not they might help to address the observed lack of robustness of state-of-the-art DNNs in the presence of noisy [2,3,14,22] or incomplete information [12,19,24,27,28,31], or their lack of generalisation [7], or their bias towards texture [6]. It is important to note that the entropy of an image may increase as certain distortions of practical interest are intensified: more suitable general measures of robustness in such settings are left to be explored.…”

Section: Discussion (mentioning)

confidence: 99%

“…They showed that fragile recognition images are abundant and can occur at different sizes. Zhang et al [31] compare human and machine performance for image classification over segments of an image based on object boundaries (rather than rectangular regions). Both human and machine classifiers are used to identify key segments for classification, where, interestingly, humans are found to be better at classifying images using segments selected by machine models versus those selected by humans.…”

Section: Related Work (mentioning)

confidence: 99%

“…As discussed by Russakovsky et al [22], the 1,000 detailed classes of ILSVRC (e.g., coucal, sealyham terrier) require expert training for humans to perform adequately at classification, not only to visually distinguish the objects, but also to retrieve their label from the 1,000 options. Thus, along similar lines to experiments conducted by Geirhos et al [7] and Zhang et al [31], we frame the task for the following twenty high-level classes: bear, bird, cat, dog, fish, flower, fox, fruit, fungus, hippopotamus, insect, lion, monkey, reptile, shark, spider, tiger, vegetable, vehicle, wolf. We select these classes as they should be generally recognisable to humans without prior training; furthermore, we select mostly plants and animals to provide a more challenging classification task, with visually similar classes, such as lion/tiger, fruit/vegetable, dog/wolf, insect/spider, etc., providing non-trivial cases to visually distinguish.…”

Section: Data (mentioning)

confidence: 99%

“…Works on adversarial examples [1,20,26], for instance, establish that human and machine perception diverges greatly for specifically constructed images. Other works have presented bespoke experiments comparing human and machine performance beyond classification errors, presenting evidence for a lack of robustness in the presence of noisy [2,3,22] or incomplete information [12,19,24,27,28,31], a sensitivity to spatial [4,5,10,30] or colour [13,14] transformations, a lack of generalisation [7], a bias towards texture [6], etc., in the machine classifiers studied. By transforming test images prior to classification, these works provide insights into the differing types of information that humans and machines rely on for image classification.…”

Section: Introduction (mentioning)

confidence: 99%

Laconic Image Classification: Human vs. Machine Performance

Carrasco

Hogan

Pérez

2020

Proceedings of the 29th ACM International Conference on Information & Knowledge Management

We propose laconic classification as a novel way to understand and compare the performance of diverse image classifiers. The goal in this setting is to minimise the amount of information (aka. entropy) required in individual test images to maintain correct classification. Given a classifier and a test image, we compute an approximate minimal-entropy positive image for which the classifier provides a correct classification, becoming incorrect upon any further reduction. The notion of entropy offers a unifying metric that allows to combine and compare the effects of various types of reductions (e.g., crop, colour reduction, resolution reduction) on classification performance, in turn generalising similar methods explored in previous works. Proposing two complementary frameworks for computing the minimal-entropy positive images of both human and machine classifiers, in experiments over the ILSVRC test-set, we find that machine classifiers are more sensitive entropy-wise to reduced resolution (versus cropping or reduced colour for machines, as well as reduced resolution for humans), supporting recent results suggesting a texture bias in the ILSVRC-trained models used. We also find, in the evaluated setting, that humans classify the minimal-entropy positive images of machine models with higher precision than machines classify those of humans.

CCS Concepts: • Information systems → Multimedia information systems; • Computing methodologies → Neural networks.
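The abstract above describes reducing a test image until the classifier stops being correct, with entropy as the common measure. The abstract does not give the paper's entropy definition or search procedure, so the following is only a toy sketch under stated assumptions: entropy is estimated as pixel count times histogram Shannon entropy, the only reduction is colour-depth quantisation, and the names `shannon_entropy_bits`, `quantise`, `laconic_colour_reduction` and the `classify` callback are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List

# Assumption: a grayscale image as rows of 8-bit integers (0..255).
Image = List[List[int]]


def shannon_entropy_bits(image: Image) -> float:
    """Estimate total entropy as pixel count times histogram entropy.

    This is one common estimate, not necessarily the paper's measure.
    """
    pixels = [p for row in image for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    per_pixel = sum((c / n) * math.log2(n / c) for c in counts.values())
    return n * per_pixel


def quantise(image: Image, bits: int) -> Image:
    """Keep only the `bits` most significant bits of each 8-bit pixel."""
    shift = 8 - bits
    return [[(p >> shift) << shift for p in row] for row in image]


def laconic_colour_reduction(
    image: Image, label: str, classify: Callable[[Image], str]
) -> Image:
    """Greedily lower colour depth while `classify` still returns `label`.

    Returns the last reduced image that was still classified correctly,
    or the original image if no reduction keeps the label.
    """
    best = image
    for bits in range(7, 0, -1):
        candidate = quantise(image, bits)
        if classify(candidate) != label:
            break  # any further reduction is assumed to stay incorrect
        best = candidate
    return best
```

A stand-in classifier (for example, one that thresholds mean brightness) can be passed as `classify` to try the loop; a real experiment would substitute an actual model and would also cover the cropping and resolution reductions that the abstract mentions.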

Just the Right Mood for HIT!

Qiu

Gadiraju

Bozzon

2020

Lecture Notes in Computer Science

Self Cite

Conversational agents are playing an increasingly important role in providing users with natural communication environments, improving outcomes in a variety of domains in human-computer interaction. Crowdsourcing marketplaces are simultaneously flourishing, and it has never been easier to acquire large-scale human input from online workers. Recent works have revealed the potential of conversational interfaces in improving worker engagement and satisfaction. At the same time, worker moods have been shown to have significant effects on quality related outcomes. Little is known about the role of worker moods in shaping work in conversational microtask crowdsourcing. In this paper, we conducted a crowdsourcing study addressing 600 unique online workers, to investigate the role that worker moods play in conversational microtask crowdsourcing. We also explore whether suitable conversational styles of the agent can affect the performance of workers in different moods. Our results show that workers in a pleasant mood tend to produce significantly higher quality results (over 20%), exhibit greater engagement (an increase by around 19%) and report a lower cognitive load (by over 12%), and a suitable conversational style can have a significant impact on workers in different moods. Our findings advance the current understanding of conversational microtask crowdsourcing and have important implications on designing future conversational crowdsourcing systems.