2021
DOI: 10.1007/s11760-021-01964-9

CNN-based camera motion classification using HSI color model for compressed videos

Cited by 6 publications (7 citation statements)
References 17 publications
“…We feed this last feature map to a combination of depthwise and pointwise convolution layers to reduce the computational complexity, also known as Atrous Spatial Pyramid Pooling (ASPP). To control the receptive fields in the depthwise convolution operations, we use different dilation rates (2, 6, 12, 18, 24) in the filters, which helps us to extract semantic features at multiple scales. This dilation technique in depthwise convolution filters is also called atrous separable convolution, a powerful tool, which is further described in the semantic high-level feature operations.…”
Section: Encoder-MSC: Multi-Scale Contextual Spatial Feature Extrac... (mentioning)
confidence: 99%
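The atrous separable convolution described in this citation statement combines a dilated depthwise convolution with a 1×1 pointwise convolution, applied in parallel at several dilation rates. The PyTorch sketch below illustrates such an ASPP-style module using the quoted dilation rates (2, 6, 12, 18, 24); the channel sizes, class names, and fusion step are illustrative assumptions, not the cited architecture.

```python
# Minimal sketch of atrous separable convolution and an ASPP-style module.
# The dilation rates follow the quoted passage; channel sizes are illustrative.
import torch
import torch.nn as nn


class AtrousSeparableConv(nn.Module):
    """Dilated depthwise conv followed by a 1x1 pointwise conv."""

    def __init__(self, in_ch, out_ch, dilation):
        super().__init__()
        self.depthwise = nn.Conv2d(
            in_ch, in_ch, kernel_size=3, padding=dilation,
            dilation=dilation, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class ASPP(nn.Module):
    """Applies parallel atrous separable convs at several dilation rates
    and fuses the multi-scale features with a 1x1 projection."""

    def __init__(self, in_ch=256, out_ch=256, rates=(2, 6, 12, 18, 24)):
        super().__init__()
        self.branches = nn.ModuleList(
            [AtrousSeparableConv(in_ch, out_ch, r) for r in rates])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]
        return self.project(torch.cat(feats, dim=1))


# Example: a 32x32 feature map with 256 channels from a backbone.
y = ASPP()(torch.randn(1, 256, 32, 32))   # -> shape (1, 256, 32, 32)
```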
“…However, the CAMHID model only detected simple camera movements and did not identify complex camera movements. Therefore, Sandula et al [12] proposed a CNN-based camera motion classification model to classify 11 camera movement direction patterns using HSI (hue, saturation, intensity) color features. This method achieved an accuracy rate of 98.37% but did not capture the object motion descriptors.…”
Section: Introduction (mentioning)
confidence: 99%
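To illustrate the kind of pipeline this citation statement describes, the sketch below defines a small CNN that maps an HSI-encoded motion image to one of 11 camera motion classes. The layer sizes, input resolution, and overall architecture are assumptions for demonstration only and do not reproduce the model of Sandula et al.

```python
# Illustrative sketch: a small CNN classifying 11 camera motion patterns
# from HSI-encoded motion images. Layer sizes and input resolution are
# assumptions, not the architecture of the cited paper.
import torch
import torch.nn as nn


class CameraMotionCNN(nn.Module):
    def __init__(self, num_classes=11):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):  # x: (N, 3, H, W) HSI-encoded motion image
        return self.classifier(self.features(x).flatten(1))


logits = CameraMotionCNN()(torch.randn(8, 3, 64, 64))  # -> (8, 11)
```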
“…CNNs have been widely used in the field of computer vision, such as image recognition [25][26][27] and video analysis [28,29].…”
Section: CNN Model (mentioning)
confidence: 99%
“…Finally, a time-series feature vector was used to train a support vector machine (SVM) for classification. Sandula et al. (2021) constructed a new camera motion classification framework based on the hue-saturation-intensity (HSI) model and compressed-domain block motion vectors. The framework takes as input the inter-frame block motion vectors decoded from the compressed stream, estimates their magnitude and direction, and assigns the motion vector direction to hue and the motion vector magnitude to saturation under a fixed intensity.…”
Section: Introduction (mentioning)
confidence: 99%
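The hue/saturation/intensity mapping described in this statement can be sketched directly: motion-vector direction is assigned to hue, magnitude to saturation, and intensity is held fixed. The NumPy sketch below illustrates the idea; the normalization constant, fixed intensity value, and function name are assumptions rather than the exact mapping of the cited work.

```python
# Sketch of the HSI mapping described above: motion-vector direction -> hue,
# magnitude -> saturation, fixed intensity. Normalization constants and
# array shapes are illustrative assumptions.
import numpy as np


def motion_vectors_to_hsi(mvx, mvy, intensity=0.5, max_mag=16.0):
    """Map per-block motion vectors (two HxW arrays) to an HxWx3 HSI image."""
    hue = (np.arctan2(mvy, mvx) + np.pi) / (2 * np.pi)    # direction in [0, 1)
    magnitude = np.sqrt(mvx ** 2 + mvy ** 2)
    saturation = np.clip(magnitude / max_mag, 0.0, 1.0)   # magnitude in [0, 1]
    i = np.full_like(hue, intensity)                      # fixed intensity plane
    return np.stack([hue, saturation, i], axis=-1)        # (H, W, 3)


# Example with a 4x4 grid of decoded block motion vectors.
mvx = np.random.randn(4, 4)
mvy = np.random.randn(4, 4)
hsi = motion_vectors_to_hsi(mvx, mvy)   # hsi.shape == (4, 4, 3)
```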
“…After that, a CNN was used for supervised learning to identify 11 camera motion modes, comprising seven pure camera motion modes and four hybrid camera modes. The results showed that the recognition accuracy of this method for the 11 camera modes reached over 98% (Sandula et al., 2021). Rajesh and Muralidhara (2021) designed a reconstruction loss based on new driving and used the implicit multivariate Markov random field regularization method to enhance local details.…”
Section: Introduction (mentioning)
confidence: 99%