“…We feed this last feature map into Atrous Spatial Pyramid Pooling (ASPP), implemented as a combination of depthwise and pointwise convolution layers to reduce the computational complexity. To control the receptive fields of the depthwise convolution operations, we use different dilation rates (2, 6, 12, 18, 24) in the filters, which helps us extract semantic features at multiple scales. This dilation technique in depthwise convolution filters, also called atrous separable convolution, is a powerful tool that is further described in the semantic high-level feature operations.…”
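The atrous separable convolution described in the quote above can be sketched in plain NumPy. This is a minimal illustration, not the paper's implementation: the function names and shapes are hypothetical, and a real network would use a framework's optimized convolution (e.g. a grouped `Conv2d` with a `dilation` argument) rather than explicit loops.

```python
import numpy as np

def depthwise_dilated_conv(x, kernels, rate):
    """Depthwise 2-D convolution with a dilation (atrous) rate.

    x       : (H, W, C) input feature map
    kernels : (k, k, C) one k x k filter per channel (no channel mixing)
    rate    : dilation rate; the effective kernel span is (k-1)*rate + 1
    """
    k = kernels.shape[0]
    span = (k - 1) * rate + 1                 # dilated receptive field
    H, W, C = x.shape
    out = np.zeros((H - span + 1, W - span + 1, C))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sample the input on a dilated grid, channel by channel
            patch = x[i:i + span:rate, j:j + span:rate, :]   # (k, k, C)
            out[i, j, :] = np.sum(patch * kernels, axis=(0, 1))
    return out

def pointwise_conv(x, weights):
    """1x1 convolution mixing channels: weights has shape (C_in, C_out)."""
    return x @ weights
```

Increasing `rate` widens the receptive field without adding parameters, which is why running the same depthwise filters at several rates (2, 6, 12, 18, 24) yields multi-scale features; the pointwise step then fuses the per-channel responses.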
“…However, the CAMHID model only detected simple camera movements and could not identify complex camera movements. Therefore, Sandula et al. [12] proposed a CNN-based camera motion classification model that classifies 11 camera movement direction patterns using HSI (hue, saturation, intensity) color features. This method achieved an accuracy of 98.37% but did not capture object motion descriptors.…”
Semantic video scene-understanding applications rely on object-camera motion recognition techniques for scene contextual movement representation. While existing machine learning-based methods perform efficiently, their primary limitation is that they analyze motion patterns from normal frames only, neglecting scene transition frames. This causes significant false alarms due to undetected object-camera motion patterns during scene transitions. In this paper, we propose a novel method for recognizing the object and camera motion of two consecutive scenes from their transition frames. First, our method detects cut transitions using principal component analysis (PCA) to segment the video into shots. It also eliminates large text transitions, which are often falsely detected as cut transitions, using structural similarity index measure (SSIM) properties. Second, it selects candidate segments to localize normal and wipe transition frames using slope angle characteristics obtained from linear regression. Third, it extracts dense semantic spatial features at multiple scales using a modified DeepLabv3+ network to segment the selected candidate frames into foreground, background, and wipe pixels. Finally, an optical flow-based temporal trajectory tracking model is applied to each segmented pixel to recognize object, camera pan, zoom-in, and zoom-out motion patterns. We further remove falsely detected non-transition motion frames to improve wipe transition detection. Experimental results are obtained on the benchmark TRECVID and multimedia datasets. Using pixel-level classification and temporal trajectory analysis, the proposed method achieved average accuracy improvements of 9.28% for object-camera motion recognition, 3.75% for cut transition detection, and 3.01% for wipe transition detection.
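The PCA-based cut detection mentioned in the abstract can be sketched as follows. This is a hypothetical illustration under simple assumptions (grayscale frames, a mean-plus-three-sigma threshold); the paper's actual formulation is not reproduced here, only the general idea of projecting frames onto principal components and flagging large jumps between consecutive projections.

```python
import numpy as np

def detect_cuts(frames, n_components=2, thresh=None):
    """Sketch of PCA-based cut transition detection (illustrative only).

    frames : (T, H, W) grayscale video as a float array.
    Projects each flattened frame onto the top principal components and
    flags a cut wherever consecutive projections are unusually far apart.
    """
    T = frames.shape[0]
    X = frames.reshape(T, -1).astype(float)
    X -= X.mean(axis=0)                        # center the frame matrix
    # top principal directions via SVD of the centered frame matrix
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    proj = X @ Vt[:n_components].T             # (T, n_components) scores
    dist = np.linalg.norm(np.diff(proj, axis=0), axis=1)
    if thresh is None:                         # simple adaptive threshold
        thresh = dist.mean() + 3 * dist.std()
    return np.flatnonzero(dist > thresh) + 1   # index of frame after a cut
```

Within a shot, consecutive projections stay close, so the inter-frame distance spikes only at abrupt content changes; gradual effects such as wipes produce smaller, sustained distances, which is why the method handles them separately.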
Anonymous proxies are used by criminals for illegal network activities, such as data theft and cyber attacks, because of the anonymity they provide. Anonymous proxy traffic detection is therefore essential for network security. In recent years, detection based on deep learning has become a hot research topic, since deep learning can automatically extract and select traffic features. To adapt heterogeneous network traffic to the homogeneous input expected by typical deep learning algorithms, a major branch of existing studies converts network traffic into images for detection. However, such studies are commonly limited by the large image representations of network traffic, which incur very large storage and computational overhead. To address this limitation, a novel method for anonymous proxy traffic detection is proposed that reduces storage and computational resource overhead. Specifically, it converts the sequences of sizes and inter-arrival times of the first N packets of a flow into images, and then categorizes the converted images using a one-dimensional convolutional neural network. Both proprietary and public datasets are used to validate the proposed method. The experimental results show that the images produced by the method are at least 90% smaller than those of existing image-based deep learning methods. With substantially smaller image sizes, the method still achieves F1 scores of up to 98.51% in Shadowsocks traffic detection and 99.8% in VPN traffic detection.
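The flow-to-image conversion described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact encoding: the function name, the 2-row layout, the 1500-byte MTU cap, and the 1-second inter-arrival cap are all assumptions made here for demonstration; the paper's own scaling and image shape may differ.

```python
import numpy as np

def flow_to_image(sizes, iats, N=16):
    """Encode the first N packets of a flow as a tiny 2 x N grayscale
    image (row 0: packet sizes, row 1: inter-arrival times), zero-padded
    when the flow has fewer than N packets. Values are clipped and scaled
    to the 0-255 pixel range. Layout and caps are illustrative assumptions.
    """
    img = np.zeros((2, N), dtype=np.uint8)
    n = min(len(sizes), N)
    # packet sizes: clip at an assumed 1500-byte MTU, scale to 0-255
    img[0, :n] = np.clip(np.asarray(sizes[:n]) / 1500.0, 0, 1) * 255
    # inter-arrival times: clip at an assumed 1-second cap, scale to 0-255
    img[1, :n] = np.clip(np.asarray(iats[:n]) / 1.0, 0, 1) * 255
    return img
```

A 2 x N uint8 image for, say, N = 16 occupies 32 bytes per flow, which illustrates why this representation is far smaller than rendering whole packet payloads as images; the rows can then be fed to a one-dimensional CNN as two input channels.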
“…Finally, a time-series feature vector was used to train a support vector machine (SVM) for classification. Sandula et al. (2021) constructed a new camera motion classification framework based on the hue-saturation-intensity (HSI) model for compressed block motion vectors. The framework decodes the inter-frame block motion vectors from the compressed stream, estimates their magnitude and direction, and assigns the motion vector direction to hue and the motion vector magnitude to saturation under a fixed intensity.…”
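The direction-to-hue and magnitude-to-saturation mapping described in the quote can be sketched as below. This is an illustrative reading of the scheme, not the cited implementation: the normalization constant `max_mag` and the fixed intensity value are assumptions introduced here.

```python
import numpy as np

def motion_vectors_to_hsi(mv_x, mv_y, intensity=0.5, max_mag=16.0):
    """Sketch of the HSI encoding of block motion vectors:
    direction -> hue, magnitude -> saturation, intensity held fixed.

    mv_x, mv_y : arrays of motion-vector components (one per block)
    max_mag    : assumed magnitude cap used to normalize saturation
    """
    mv_x = np.asarray(mv_x, dtype=float)
    mv_y = np.asarray(mv_y, dtype=float)
    angle = np.arctan2(mv_y, mv_x)                 # direction in radians
    hue = (np.degrees(angle) + 360.0) % 360.0      # map to [0, 360) degrees
    mag = np.hypot(mv_x, mv_y)
    sat = np.clip(mag / max_mag, 0.0, 1.0)         # magnitude -> [0, 1]
    i = np.full_like(hue, intensity)               # fixed intensity plane
    return hue, sat, i
```

Encoding direction as hue makes each dominant camera movement (pan left, pan right, zoom) appear as a characteristic color pattern in the resulting HSI image, which is what the downstream CNN classifies.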
Section: Introduction
“…After that, a CNN was used for supervised learning to identify 11 camera motion modes, comprising seven pure camera motion modes and four hybrid camera modes. The results showed that the recognition accuracy of this method for the 11 camera modes reached over 98% (Sandula et al., 2021). Rajesh and Muralidhara (2021) designed a new driving-based reconstruction loss and used an implicit multivariate Markov random field regularization method to enhance local details.…”
Sports videos are proliferating on the internet as people's material life is enriched and their pursuit of spiritual life grows. Thus, automatically identifying and detecting useful information in videos has arisen as a relatively novel research direction. Accordingly, the present work proposes a Human Pose Estimation (HPE) model to automatically classify sports videos and detect hot spots in videos, addressing the deficiencies of traditional algorithms. First, Deep Learning (DL) is introduced. Then, large numbers of human motion features are extracted by a Region Proposal Network (RPN). Next, an HPE model is implemented based on a Deep Convolutional Neural Network (DCNN). Finally, the HPE model is applied to motion recognition and video classification in sports videos. The research findings corroborate that an effective and accurate HPE model can be implemented using the DCNN to recognize and classify videos effectively. Meanwhile, Big Data Technology (BDT) is applied to count the play counts of various sports videos. The results indicate that the DCNN-based HPE model can effectively and accurately classify sports videos and thereby provide a basis for subsequent statistics on various sports videos by BDT. Finally, a new outlook is proposed for applying new technology in the entertainment industry.