2017
DOI: 10.1007/s11063-017-9719-y

A Temporal Dependency Based Multi-modal Active Learning Approach for Audiovisual Event Detection

Cited by 9 publications (5 citation statements)
References 50 publications
“…After extracting features from CNNs, [28] applied a three-layer deep neural network to fuse multimodal features. In addition, [29] used an RNN to extract features and proposed a new multi-objective method to focus on specific parts containing the strong emotional information of audio data.…”
Section: Related Studies (mentioning)
confidence: 99%
“…The semi-automatic labels are generated by our data driven active learning approach, presented in [62,63]. The basic assumption of this approach is the sparseness of emotional reactions in the audio and video modalities.…”
Section: Data Annotation (mentioning)
confidence: 99%
“…High technical quality: The technical quality of the data and related signals is also checked and demonstrated via different preliminary classifications conducted on various subsets of the database including: the video data [63], the gesture data [65], the audio data [66], the biophysiological data [67], the speech and the biophysiological data [68], and the multimodal data [69].…”
(mentioning)
confidence: 99%
“…Multi-modal approaches on the other hand, are designed to perform an aggregation of a set of information stemming from multiple and heterogeneous modalities by applying a specific information fusion technique, in order to improve both the performance as well as the robustness of an inference system. Rather than relying on a single channel, an effective and smart combination of complementary information stemming from multiple channels mitigates the drawbacks specific to each single channel, while improving the generalization ability of the optimized inference system in comparison to one based on a single modality (Kächele et al, 2016 ; Bellmann et al, 2018 ; Thiam et al, 2018 ).…”
Section: Introduction (mentioning)
confidence: 99%
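
To make the fusion idea in the statement above concrete, the following minimal Python sketch combines per-segment audio and video class scores by weighted averaging (decision-level fusion). It is only an illustration under assumed inputs: the feature dimensions, the fusion weights, and the score_audio / score_video stand-in classifiers are hypothetical and are not the fusion scheme of the cited works.

# Minimal late-fusion sketch (illustrative only; not the cited method).
import numpy as np

def score_audio(x_audio):
    # Stand-in audio classifier: random linear map + softmax over 3 classes.
    rng = np.random.default_rng(0)
    logits = x_audio @ rng.normal(size=(x_audio.shape[-1], 3))
    return np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)

def score_video(x_video):
    # Stand-in video classifier: random linear map + softmax over 3 classes.
    rng = np.random.default_rng(1)
    logits = x_video @ rng.normal(size=(x_video.shape[-1], 3))
    return np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)

def late_fusion(x_audio, x_video, w_audio=0.5, w_video=0.5):
    # Weighted average of the unimodal class posteriors, then argmax per segment.
    p = w_audio * score_audio(x_audio) + w_video * score_video(x_video)
    return p.argmax(-1)

# Example: 10 segments with 40-dim audio and 128-dim video features (assumed sizes).
x_a = np.random.default_rng(2).normal(size=(10, 40))
x_v = np.random.default_rng(3).normal(size=(10, 128))
print(late_fusion(x_a, x_v))

The weighted-average combination is one simple decision-level choice; the quoted passage notes that the specific information fusion technique (feature-level, decision-level, or learned fusion) is what a given multi-modal system selects to exploit complementary channels.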