2018
DOI: 10.1007/978-3-030-01225-0_24

Deep Discriminative Model for Video Classification

Abstract: This paper presents a new deep learning approach for video-based scene classification. We design a Heterogeneous Deep Discriminative Model (HDDM) whose parameters are initialized by unsupervised layer-wise pre-training using Gaussian Restricted Boltzmann Machines (GRBM). In order to avoid the redundancy of adjacent frames, we extract spatiotemporal variation patterns within frames and represent them sparsely using Sparse Cubic Symmetrical Pattern (SCSP). Then, a pre-initialized HDDM i…

Cited by 8 publications (5 citation statements)
References 46 publications
“…Accuracy: CSO [7] 67.7; SFA [28] 60.0; SOE [8] 43.1; BoSE [8] 77.7; C3D [29] 87.7; st-TCoF [19] 88.4; DDM+SCSP [27] 90.3; LSTF [13] 95.0; DI (4 stream) [2] 92.5; SVMP [35] 90.1; AWSD (ResNet-50) 97.5; AWSD (ResNet-101) 98.1. … still images. This technique tackles the problem of tuning huge number of parameters in deep models for videos.…”
Section: Methods (mentioning)
confidence: 99%
“…The maximum value of TF-IDF of extracted features (FCall_TF-IDF) from collection of comments Call, denoted by MaxFC, as defined in (9).…”
Section: Mathematical Formula and Proposed Algorithms (mentioning)
confidence: 99%
“…Previously, different versions of encoder-decoder networks have been widely used for unsupervised feature learning [23]. After training these networks, the encoder can map the input data into a latent representation.…”
Section: Encoder Network: Video Distillation (mentioning)
confidence: 99%
“…These challenges drastically degrade the performance of video analysis methods. In the past years, a substantial number of approaches have been introduced to cope with these challenges [23,24,4,32,25]. Preliminary works treated videos as either sequences of still images or volumetric objects, and applied handcrafted local descriptors on a stack of images [29,33].…”
Section: Introduction (mentioning)
confidence: 99%

AVD: Adversarial Video Distillation

Tavakolian, Sabokrou, Hadid. 2019. Preprint. Self Cite.