Deep Discriminative Model for Video Classification

Tavakolian, Mohammad; Hadid, Abdenour

doi:10.1007/978-3-030-01225-0_24

Cited by 8 publications

(5 citation statements)

References 46 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…Accuracy CSO [7] 67.7 SFA [28] 60.0 SOE [8] 43.1 BoSE [8] 77.7 C3D [29] 87.7 st-TCoF [19] 88.4 DDM+SCSP [27] 90.3 LSTF [13] 95.0 DI (4 stream) [2] 92.5 SVMP [35] 90.1 AWSD (ResNet-50) 97.5 AWSD (ResNet-101) 98.1 still images. This technique tackles the problem of tuning huge number of parameters in deep models for videos.…”

Section: Methodsmentioning

confidence: 99%

AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation

Tavakolian

Tavakoli

Hadid³

2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Self Cite

View full text Add to dashboard Cite

We propose an Adaptive Weighted Spatiotemporal Distillation (AWSD) technique for video representation by encoding the appearance and dynamics of the videos into a single RGB image map. This is obtained by adaptively dividing the videos into small segments and comparing two consecutive segments. This allows using pre-trained models on still images for video classification while successfully capturing the spatiotemporal variations in the videos. The adaptive segment selection enables effective encoding of the essential discriminative information of untrimmed videos. Based on Gaussian Scale Mixture, we compute the weights by extracting the mutual information between two consecutive segments. Unlike pooling-based methods, our AWSD gives more importance to the frames that characterize actions or events thanks to its adaptive segment length selection. We conducted extensive experimental analysis to evaluate the effectiveness of our proposed method and compared our results against those of recent state-of-the-art methods on four benchmark datatsets, including UCF101, HMDB51, and Maryland. The obtained results on these benchmark datatsets showed that our method significantly outperforms earlier works and sets the new state-of-the-art performance in video classification. Code is available at the project webpage: https://mohammadt68.github. io/AWSD/

show abstract

Section: Methodsmentioning

confidence: 99%

AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation

Tavakolian

Tavakoli

Hadid³

2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Self Cite

View full text Add to dashboard Cite

show abstract

“…The maximum value of TF-IDF of extracted features (FCall_TF-IDF) from collection of comments Call, denoted by MaxFC, as defined in (9).…”

Section: Mathematical Formula and Proposed Algorithmsmentioning

confidence: 99%

Enhancing multi-class web video categorization model using machine and deep learning approaches

Yafooz

Alsaeedi

Alluhaibi

et al. 2022

IJECE

View full text Add to dashboard Cite

<p><span>With today’s digital revolution, many people communicate and collaborate in cyberspace. Users rely on social media platforms, such as Facebook, YouTube and Twitter, all of which exert a considerable impact on human lives. In particular, watching videos has become more preferable than simply browsing the internet because of many reasons. However, difficulties arise when searching for specific videos accurately in the same domains, such as entertainment, politics, education, video and TV shows. This problem can be solved through web video categorization (WVC) approaches that utilize video textual information, visual features, or audio approaches. However, retrieving or obtaining videos with similar content with high accuracy is challenging. Therefore, this paper proposes a novel mode for enhancing WVC that is based on user comments and weighted features from video descriptions. Specifically, this model uses supervised learning, along with machine learning classifiers (MLCs) and deep learning (DL) models. Two experiments are conducted on the proposed balanced dataset on the basis of the two proposed algorithms based on multi-classes, namely, education, politics, health and sports. The model achieves high accuracy rates of 97% and 99% by using MLCs and DL models that are based on artificial neural network (ANN) and long short-term memory (LSTM), respectively.</span></p>

show abstract

“…Previously, different versions of encoder-decoder networks have been widely used for unsupervised feature learning [23]. After training these networks, the encoder can map the input data into a latent representation.…”

Section: Encoder Network: Video Distillationmentioning

confidence: 99%

“…These challenges drastically degrade the performance of video analysis methods. In the past years, a substantial number of approaches have been introduced to cope with these challenges [23,24,4,32,25]. Preliminary works treated videos as either sequences of still images or volumetric objects, and applied handcrafted local descriptors on a stack of images [29,33].…”

Section: Introductionmentioning

confidence: 99%

AVD: Adversarial Video Distillation

Tavakolian,

Sabokrou,

Hadid

2019

Preprint

Self Cite

View full text Add to dashboard Cite

In this paper, we present a simple yet efficient approach for video representation, called Adversarial Video Distillation (AVD). The key idea is to represent videos by compressing them in the form of realistic images, which can be used in a variety of video-based scene analysis applications. Representing a video as a single image enables us to address the problem of video analysis by image analysis techniques. To this end, we exploit a 3D convolutional encoder-decoder network to encode the input video as an image by minimizing the reconstruction error. Furthermore, weak supervision by an adversarial training procedure is imposed on the output of the encoder to generate semantically realistic images. The encoder learns to extract semantically meaningful representations from a given input video by mapping the 3D input into a 2D latent representation. The obtained representation can be simply used as the input of deep models pre-trained on images for video classification. We evaluated the effectiveness of our proposed method for video-based activity recognition on three standard and challenging benchmark datasets, i.e. UCF101, HMDB51, and Kinetics. The experimental results demonstrate that AVD achieves interesting performance, outperforming the state-of-the-art methods for video classification.

show abstract

Deep Discriminative Model for Video Classification

Cited by 8 publications

References 46 publications

AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation

AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation

Enhancing multi-class web video categorization model using machine and deep learning approaches

AVD: Adversarial Video Distillation

Contact Info

Product

Resources

About