2021
DOI: 10.1109/tmm.2020.3007321

Understanding More About Human and Machine Attention in Deep Neural Networks

Cited by 49 publications (26 citation statements)
References 67 publications
“…In the block, SA layer is utilized to make the important information more distinguishable by mimicking the human vision system (HVS) [15], which is composed of two group convolutional layers, one ReLU activation and one Sigmoid activation. Figure 3 shows the design of SA layer.…”
Section: B. Residual SR Block (mentioning)
confidence: 99%
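The SA layer quoted above (two group convolutional layers, a ReLU, and a Sigmoid that gates the input) can be sketched as follows. This is a hedged NumPy illustration, not the cited paper's implementation: it assumes 1×1 group convolutions with weights passed in as plain arrays and a default of two groups, since the excerpt does not specify kernel sizes or group counts.

```python
import numpy as np

def group_conv1x1(x, w, groups):
    """1x1 group convolution: channels are split into groups and
    mixed only within each group. x: (C_in, H, W); w: (C_out, C_in // groups)."""
    c_in, height, width = x.shape
    c_out = w.shape[0]
    gin, gout = c_in // groups, c_out // groups
    out = np.zeros((c_out, height, width))
    for g in range(groups):
        xs = x[g * gin:(g + 1) * gin]          # this group's input channels
        ws = w[g * gout:(g + 1) * gout]        # this group's filters
        # Mix the group's channels at every spatial position.
        out[g * gout:(g + 1) * gout] = np.tensordot(ws, xs, axes=([1], [0]))
    return out

def spatial_attention(x, w1, w2, groups=2):
    """SA-layer sketch: group conv -> ReLU -> group conv -> Sigmoid,
    then rescale the input by the resulting attention mask."""
    h = np.maximum(group_conv1x1(x, w1, groups), 0.0)            # ReLU
    mask = 1.0 / (1.0 + np.exp(-group_conv1x1(h, w2, groups)))   # Sigmoid, in (0, 1)
    return x * mask                                              # gate the features
```

Because the Sigmoid mask lies in (0, 1), the layer can only attenuate features, making the positions it leaves near 1.0 stand out — the "more distinguishable" important information the quote refers to.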
“…The key issue of NR-IQA is to build a metric that in consistence with the human vision system (HVS). According to the HVS, different areas of the images hold different importance for visual perception [15]. However, recent NR-IQA methods usually neglect to distinguish the visual sensitive information in the image, which restricts the effectiveness of prediction.…”
Section: Introduction (mentioning)
confidence: 99%
“…are more likely to be followed by the human gaze. Inspired by a biological mechanism known as human attention [17], the UVOS system should have remarkable motion perception capabilities to quickly orient gaze to moving objects in dynamic scenes. We argue that the primary object(s) in a video should be (i) the most distinguishable in a single frame, (ii) repeatedly appearing throughout the video sequence, and (iii) moving objects in the video.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, a natural question is whether ANNs select information in the same way, and in particular whether they attend to the same visual regions as humans when extracting information for visual object recognition and localization. While prior work has developed ANNs trained explicitly to predict human visual gaze [14], and even incorporated simulated foveated systems into the model design [15], comparatively little work comparing human attention to computational attention [16,17,18] has attempted a comprehensive examination of how ANNs compare to humans using a variety of human visual selectivity measures as well as the wide range of interpretability techniques that are currently available to probe what visual information ANNs use.…”
Section: Introduction (mentioning)
confidence: 99%