Onfocus detection: identifying individual-camera eye contact from unconstrained images

Zhang, Dingwen; Wang, Bo; Wang, Gerong; Zhang, Qiang; Zhang, Jiajia; Han, Jungong; You, Zheng

doi:10.1007/s11432-020-3181-9

Cited by 15 publications

(5 citation statements)

References 38 publications

(62 reference statements)

“…3) Routing layers: To study the performance of routing layers in one block, we compare our ResCaps and one modified versions, i.e., ResCaps-3L. Specifically, ResCaps and ResCaps-3L consist of two and three ResP layer(s) in one block, respectively. From Table V, we find that our ResCaps consisting of two ResP layers can achieve promising performance, compared to ResCaps-3L that employs three ResP layers in one block.…”

Section: B Ablation Analysis (mentioning)

confidence: 99%

“…They can recognize the image by detecting the existence of a specific entity, i.e., invariance. However, an unsophisticated perturbation on the image can fool a well-trained network to fail in recognition [1]-[4]. More worryingly, natural and non-adversarial pose changes of familiar objects in the real world are enough to trick deep networks [5], [6].…”

Section: Introduction (mentioning)

confidence: 99%

Capsule Networks With Residual Pose Routing

Liu, Cheng, Zhang et al. 2024

IEEE Trans. Neural Netw. Learning Syst.

“…detection becomes increasingly important in understanding the intention of surrounding pedestrians in the autonomous driving environment [5], [6], [7]. Many studies on eyes contact detection use vision images taken close to a person with a clear facial appearance [8], [9], [10], [11], [12]. However, pedestrian eye detection in the wild uses distant images or videos from vehicle sensors which brings great challenges to the problem given the image quality, unconstrained surroundings and illuminations [6], [7].…”

Section: Introduction (mentioning)

confidence: 99%

SA-BiGCN: Bi-Stream Graph Convolution Networks With Spatial Attentions for the Eye Contact Detection in the Wild

Ling, Ma, Xie et al. 2024

IEEE Trans. Intell. Transport. Syst.

Eye contact is essential in transmitting information and intention in the wild environment (e.g., urban streets or parking lots) with mixed vehicles and pedestrians. Compared with the vision image data, the human skeleton data are deemed to be robust to unconstrained surroundings and illumination. However, the skeleton graph-based approaches are mainly used for the action recognition. It is challenging to directly apply them to the eye detection task, which is momentary and dynamic given the complex wild environment. This paper proposes a Bi-stream Spatial Attention Graph Convolution Network (SA-BiGCN) for eye contact detection in the wild. We design a directed, nose-centric skeleton graph to capture relevant and hierarchical information and their interactions. We also propose a Bi-stream graph convolution network model with spatial attention to dynamically extract and fuse skeleton joints and bones information. The model was validated by comparing with state-of-art models on three large-scale public datasets, including JAAD, PIE, and LOOK. The results highlight the accuracy and generalization performance of the proposed SA-BiGCN model in detecting the eye contact in the wild environment. The ablation analysis validates the importance of the skeleton graph design, the spatial attention mechanism in the feature fusion process, as well as the model robustness against noisy skeleton data in terms of part occlusions, block occlusions, random occlusions, and random deviations.

“…However, they usually assume that the training and testing data is balanced. In practice, training or testing data appears to be long-tailed, e.g., there exist few samples for rare diseases in medical diagnosis [19,35,36,39,42,45] or endangered animals in species classification [5,31,37]. As mentioned by [32], the case becomes even worse in weakly and semi-supervised learning scenarios [10,20,27,31,33,34,38,43,44].…”

Section: Introduction (mentioning)

confidence: 99%

Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics

Fang, Zhang, Wen et al. 2023

Preprint

Recently, long-tailed image classification harvests lots of research attention, since the data distribution is long-tailed in many real-world situations. Piles of algorithms are devised to address the data imbalance problem by biasing the training process towards less frequent classes. However, they usually evaluate the performance on a balanced testing set or multiple independent testing sets having distinct distributions with the training data. Considering the testing data may have arbitrary distributions, existing evaluation strategies are unable to reflect the actual classification performance objectively. We set up novel evaluation benchmarks based on a series of testing sets with evolving distributions. A corpus of metrics are designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution. Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets, which is valuable for guiding the selection of data rebalancing techniques. We also revisit existing methods and categorize them into four types including data balancing, feature balancing, loss balancing, and prediction balancing, according the focused procedure during the training pipeline.
