Attention mechanisms have become a popular component in deep neural networks, yet there has been little examination of how different influencing factors and methods for computing attention from these factors affect performance. Toward a better general understanding of attention mechanisms, we present an empirical study that ablates various spatial attention elements within a generalized attention formulation, encompassing the dominant Transformer attention as well as the prevalent deformable convolution and dynamic convolution modules. Conducted on a variety of applications, the study yields significant findings about spatial attention in deep networks, some of which run counter to conventional understanding. For example, we find that the comparison of query and key content in Transformer attention is negligible for self-attention, but vital for encoder-decoder attention. On the other hand, a proper combination of deformable convolution with key content saliency achieves the best accuracy-efficiency tradeoff in self-attention. Our results suggest that there exists much room for improvement in the design of attention mechanisms.
Accurate detection and tracking of objects is vital for effective video understanding. In previous work, the two tasks have been combined in such a way that tracking relies heavily on detection, while detection benefits only marginally from tracking. To increase synergy, we propose to integrate the tasks more tightly by conditioning the object detection in the current frame on tracklets computed in prior frames. With this approach, the object detection results not only have high detection responses, but also improved coherence with the existing tracklets. This greater coherence leads to estimated object trajectories that are smoother and more stable than the jittered paths obtained without tracklet-conditioned detection. Over extensive experiments, this approach is shown to achieve state-of-the-art performance in terms of both detection and tracking accuracy, as well as noticeable improvements in tracking stability.
Background: Mathematical expressions mainly include arithmetic expressions (such as 8 − (1 + 3)) and algebraic expressions (such as a − (b + c)). Previous studies have shown that both algebraic processing and arithmetic involve the bilateral parietal brain regions. Although behavioral and neuropsychological studies have revealed a dissociation between algebra and arithmetic, how algebraic processing is dissociated from arithmetic in brain networks is still unclear. Methods: Using functional magnetic resonance imaging (fMRI), this study scanned 30 undergraduates and directly compared brain activation during algebra and arithmetic. Brain activations, single-trial (item-wise) interindividual correlations, and mean-trial interindividual correlations related to algebraic processing were compared with those related to arithmetic. Results: Brain activation analyses showed that algebra elicited greater activation in the angular gyrus, whereas arithmetic elicited greater activation in the bilateral supplementary motor area, left insula, and left inferior parietal lobule. For algebra, interindividual single-trial analyses revealed significant brain-behavior correlations in the semantic network, including the middle temporal gyri, inferior frontal gyri, dorsomedial prefrontal cortices, and left angular gyrus. For arithmetic, the significant brain-behavior correlations were located in the phonological network, including the precentral gyrus and supplementary motor area, and in the visuospatial network, including the bilateral superior parietal lobules. Conclusion: These findings suggest that algebra relies on the semantic network and arithmetic relies on the phonological and visuospatial networks.