Attend and Guide (AG-Net): A Keypoints-Driven Attention-Based Deep Network for Image Recognition

Bera, Asish; Wharton, Zachary; Liu, Yonghuai; Bessis, Nik; Behera, Ardhendu

doi:10.1109/tip.2021.3064256

Cited by 32 publications

(15 citation statements)

References 58 publications

“…The computer vision-based sports action recognition systems can provide rapid postmatch analysis and real-time objective feedback before the next race for coaches and players. Bera et al. [12] pointed out that the fundamental points of athletes' actions could be captured via three-dimensional video shooting techniques. Yu and Chin [13] recorded athletes' time-spatial action images via radio frequency technology in IoT, which turned out to be blurred and could not be seen clearly.…”

Section: Introduction (mentioning)

confidence: 99%

Physical Education Teaching Strategy under Internet of Things Data Computing Intelligence Analysis

Zhang

Hou

2022

Computational Intelligence and Neuroscience


Racket sports such as tennis are amongst the most popular recreational sports activities. Optimizing tennis teaching methods and improving teaching modes can effectively improve the teaching quality of tennis. In this study, a video and image action recognition system based on image processing techniques and Internet of things is developed to overcome the shortcomings of the traditional tennis teaching methods. To validate its performance, the students of tennis courses are divided into experimental group and control group, respectively. The control group is taught by using the traditional tennis teaching method whereas the experimental group is taught by using the IoT video and image recognition teaching system. Three factors of students including service throwing height, arm elbow angle, and knee bending angles of both groups are measured and compared with those of world elite tennis players. The results show that the students’ serving abilities in the experimental group are significantly improved using the video and image recognition system based on IoT, and they are better than those of the students in the control group. The proposed video and image processing technique can be applied in students’ physical education and can be employed to provide the basis for the innovation of tennis teaching strategies in physical education.


“…[9] presented an ensemble of four CNN models to handle different parts of the driver, including the face, hands, and body, to recognize driver activity. [38] proposed an attend and guide network to classify driver behavior by obtaining the spatial structures of images through the identification of semantic regions and their spatial distributions. [39] concatenated three CNN models to construct a hybrid framework for detecting distracted driver behavior.…”

Section: Related Work (mentioning)

confidence: 99%

Driver Anomaly Quantification for Intelligent Vehicles: A Contrastive Learning Approach With Representation Clustering

Xing

Gu

et al. 2023

IEEE Trans. Intell. Veh.


Driver anomaly quantification is a fundamental capability to support human-centric driving systems of intelligent vehicles. Existing studies usually treat it as a classification task and obtain discrete levels for abnormalities. Meanwhile, the existing data-driven approaches depend on the quality of dataset and provide limited recognition capability for unknown activities. To overcome these challenges, this paper proposes a contrastive learning approach with the aim of building a model that can quantify driver anomalies with a continuous variable. In addition, a novel clustering supervised contrastive loss is proposed to optimize the distribution of the extracted representation vectors to improve the model performance. Compared with the typical contrastive loss, the proposed loss can better cluster normal representations while separating abnormal ones. The abnormality of driver activity can be quantified by calculating the distance to a set of representations of normal activities rather than being produced as the direct output of the model. The experiment results with datasets under different modes demonstrate that the proposed approach is more accurate and robust than existing ones in terms of recognition and quantification of unknown abnormal activities.


“…Moreover, part-based methods limit both scalability and practicality of real-world FGVC applications. Thus, many recent methods have used image-level labels to guide their models in identifying the key object parts to discriminate the sub-categories by exploring attention mechanisms in the image space or feature space [7]-[10] to automatically mine discriminative features.…”

Section: Introduction (mentioning)

confidence: 99%

SR-GNN: Spatial Relation-Aware Graph Neural Network for Fine-Grained Image Categorization

Bera

Wharton

Liu

et al. 2022

IEEE Trans. on Image Process.

Self Cite


Over the past few years, a significant progress has been made in deep convolutional neural networks (CNNs)-based image recognition. This is mainly due to the strong ability of such networks in mining discriminative object pose and parts information from texture and shape. This is often inappropriate for fine-grained visual classification (FGVC) since it exhibits high intra-class and low inter-class variances due to occlusions, deformation, illuminations, etc. Thus, an expressive feature representation describing global structural information is a key to characterize an object/scene. To this end, we propose a method that effectively captures subtle changes by aggregating context-aware features from most relevant image-regions and their importance in discriminating fine-grained categories avoiding the bounding-box and/or distinguishable part annotations. Our approach is inspired by the recent advancement in self-attention and graph neural networks (GNNs) approaches to include a simple yet effective relation-aware feature transformation and its refinement using a context-aware attention mechanism to boost the discriminability of the transformed feature in an end-to-end learning process. Our model is evaluated on eight benchmark datasets consisting of fine-grained objects and human-object interactions. It outperforms the state-of-the-art approaches by a significant margin in recognition accuracy.
