2020
DOI: 10.1109/tip.2020.2977457
Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition


Cited by 66 publications (22 citation statements)
References 64 publications
“…They evaluated their model using realistic datasets such as HMDB51 [23] and UCF101 and showed that the HC-MTL method outperformed other methods for both action grouping and recognition. In addition, Min et al [24] proposed a Multi-Objective Matrix Normalization (MOMN) method for fine-grained visual recognition. Their proposed method simultaneously normalizes a bilinear representation in terms of square-root, low-rank, and sparsity.…”
Section: Literature Review
confidence: 99%
“…iSQRT-COV [101] and the improved B-CNN [146] used the Newton-Schulz iteration to approximate matrix square-root normalization with only matrix multiplication to decrease training time. Recently, MOMN [106] was proposed to simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity all within a multi-objective optimization framework.…”
Section: Performing High-Order Feature Interactions
confidence: 99%
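The Newton-Schulz iteration mentioned in the statement above approximates the matrix square root using only matrix multiplications, which is GPU-friendly. A minimal NumPy sketch of the coupled iteration follows; the trace-based pre-scaling and the iteration count are illustrative choices for this sketch, not parameters taken from the cited papers.

```python
import numpy as np

def newton_schulz_sqrt(A, num_iters=20):
    """Approximate the square root of a symmetric positive-definite
    matrix A via coupled Newton-Schulz iterations (matmuls only)."""
    n = A.shape[0]
    I = np.eye(n)
    norm = np.trace(A)      # pre-scale so the iteration converges
    Y = A / norm            # Y_k -> sqrt(A / norm)
    Z = I                   # Z_k -> inverse sqrt(A / norm)
    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y = Y @ T
        Z = T @ Z
    return np.sqrt(norm) * Y   # undo the pre-scaling

# Usage on a random SPD matrix: S @ S should recover A.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)   # shifted Gram matrix, SPD
S = newton_schulz_sqrt(A)
```

Because every step is a matrix multiply, the iteration avoids the eigendecomposition an exact square root would need, which is the training-time advantage the citing paper attributes to iSQRT-COV and the improved B-CNN.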
“…All the neural network models are implemented using the PyTorch framework. We compare various methods, including deep networks (e.g., DenseNet161 [55] and SENet [56]), recently proposed fine-grained recognition methods (e.g., MOMN [57] and PMG [58]), and food recognition methods (e.g., PAR-Net [9]). For the deep networks, we train all the networks with parameters initialized from ImageNet-pretrained weights, using a learning rate of 10^-2 that is divided by 10 after 30 epochs.…”
Section: Implementation Details
confidence: 99%
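The schedule quoted above (start at 10^-2, divide by 10 after 30 epochs) is the kind of step decay that PyTorch expresses with `StepLR`. A minimal sketch follows; the tiny linear model stands in for the DenseNet/SENet backbones and is purely a placeholder, not the authors' setup.

```python
import torch

# Placeholder model standing in for an ImageNet-pretrained backbone.
model = torch.nn.Linear(8, 2)

# SGD at the quoted initial learning rate of 1e-2.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# Divide the learning rate by 10 (gamma=0.1) every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(60):
    # ... one training epoch would run here ...
    scheduler.step()   # lr becomes 1e-3 after epoch 30, 1e-4 after epoch 60
```

`StepLR` computes the rate as `initial_lr * gamma ** (epoch // step_size)`, so the decay happens exactly at the epoch boundaries the quoted setup describes.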