3D Semantic VSLAM of Indoor Environment Based on Mask Scoring RCNN

Tao, Chongben; Jin, Yufeng; Cao, Feng; Zhang, Zufeng; Li, Chunguang; Gao, Hanwen

doi:10.1155/2020/5916205

Cited by 1 publication

(2 citation statements)

References 24 publications


“…A small FCN (Fully Convolutional Network) [40,41] is applied to each RoI to predict the segmented mask in a pixel-to-pixel manner. With its excellent performance, Mask R-CNN is popular in object detection, instance segmentation, and key-point detection tasks [42][43][44]. In this paper, we build a feline object detection model based on Mask R-CNN and then extract the object outline information.…”

Section: Construction of the Outline Model, 3.2.1 Outline Mask R-CNN (mentioning)

confidence: 99%
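The statement above does not say how the outline is derived from the predicted mask. As a minimal sketch, assuming the mask is a binary grid and that "outline" means the mask pixels bordering the background, the step could look like the following. The function name `mask_outline` and the 4-neighbour rule are illustrative assumptions, not the citing paper's method.

```python
def mask_outline(mask):
    """Return (row, col) positions of mask pixels on the object boundary.

    Assumption: `mask` is a list of equal-length rows of 0/1 values, such as
    a thresholded instance mask. A pixel counts as boundary when at least one
    of its 4-neighbours is background or lies outside the grid.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0

    def inside(r, c):
        # True only for in-bounds pixels that belong to the object.
        return 0 <= r < rows and 0 <= c < cols and bool(mask[r][c])

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    outline = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not all(inside(r + dr, c + dc) for dr, dc in neighbours):
                outline.append((r, c))
    return outline


# A 3x3 object centred in a 5x5 grid: every object pixel except the centre
# touches the background, so the centre is the only one left out.
demo = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(5)]
outline = mask_outline(demo)
```

In practice a contour-tracing routine from an image library would likely replace this loop; the sketch only shows the idea of reducing a mask to its border.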

“…We computed the variation in the bending angles in a video sequence instead of a single image or adjacent frame. LSTM [52] is a variant of RNN (Recurrent Neural Networks) [53], which contains multiple LSTM cells. Each cell follows the ingenious gating mechanism (first, the forget gate decides what to discard in the previous cell state; then, the input gate updates information; and finally, the output gate transmits filtered information to the next cell state), which makes LSTMs capable of learning long-term dependencies.…”

Section: Action Identification Based on Skeleton (mentioning)

confidence: 99%
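The gating sequence described in the statement above can be sketched as a single LSTM step. This is a minimal scalar version (hidden size 1) of the standard LSTM formulation, in which the output gate filters the updated cell state into the hidden state passed on to the next step. The parameter layout and the name `lstm_step` are assumptions for illustration, not the citing paper's implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    Assumption: `params` maps each gate name to a tuple
    (input weight, recurrent weight, bias).
    """
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # what to discard from the previous cell state
    i = gate("input", sigmoid)         # how much new information to admit
    g = gate("candidate", math.tanh)   # the new information itself
    o = gate("output", sigmoid)        # how much of the cell state to expose

    c = f * c_prev + i * g             # updated cell state
    h = o * math.tanh(c)               # filtered output (hidden state)
    return h, c


# With all weights and biases at zero, every sigmoid gate evaluates to 0.5
# and the candidate to 0, so the cell state is simply halved.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

Because the cell state is updated additively (`f * c_prev + i * g`), information can persist across many steps when the forget gate stays close to 1, which is the usual explanation for the long-term dependency behaviour the statement mentions.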


Action Recognition Using a Spatial-Temporal Network for Wild Felines

Feng, Zhao, Sun, et al. 2021

Animals


Behavior analysis of wild felines has significance for the protection of a grassland ecological environment. Compared with human action recognition, fewer researchers have focused on feline behavior analysis. This paper proposes a novel two-stream architecture that incorporates spatial and temporal networks for wild feline action recognition. The spatial portion outlines the object region extracted by Mask region-based convolutional neural network (R-CNN) and builds a Tiny Visual Geometry Group (VGG) network for static action recognition. Compared with VGG16, the Tiny VGG network can reduce the number of network parameters and avoid overfitting. The temporal part presents a novel skeleton-based action recognition model based on the bending angle fluctuation amplitude of the knee joints in a video clip. Due to its temporal features, the model can effectively distinguish between different upright actions, such as standing, ambling, and galloping, particularly when the felines are occluded by objects such as plants, fallen trees, and so on. The experimental results showed that the proposed two-stream network model can effectively outline the wild feline targets in captured images and can significantly improve the performance of wild feline action recognition due to its spatial and temporal features.
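The abstract does not define the "bending angle fluctuation amplitude" precisely. As a minimal sketch, assuming each frame provides 2D keypoints for the hip, knee, and ankle of one leg, and taking the amplitude to be the maximum minus the minimum knee angle over the clip, the feature could be computed as follows. The keypoint names, the max-minus-min definition, and both function names are hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident keypoints: angle is undefined")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding drift
    return math.degrees(math.acos(cos_t))


def angle_fluctuation(frames):
    """Max minus min knee angle over a clip.

    Assumption: `frames` is a non-empty sequence of (hip, knee, ankle)
    keypoint triples, one per video frame.
    """
    angles = [joint_angle(hip, knee, ankle) for hip, knee, ankle in frames]
    return max(angles) - min(angles)


# Two frames: a fully extended leg (180 degrees) and a right-angle bend.
clip = [
    ((0, 1), (0, 0), (0, -1)),
    ((0, 1), (0, 0), (1, 0)),
]
amplitude = angle_fluctuation(clip)
```

Under this definition a standing animal would show an amplitude near zero while ambling or galloping would show progressively larger swings, which matches the abstract's claim that the temporal feature separates upright actions.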
