2021
DOI: 10.3390/s21062045
|View full text |Cite
|
Sign up to set email alerts
|

Detection of Important Scenes in Baseball Videos via a Time-Lag-Aware Multimodal Variational Autoencoder

Abstract: A new method for the detection of important scenes in baseball videos via a time-lag-aware multimodal variational autoencoder (Tl-MVAE) is presented in this paper. Tl-MVAE estimates latent features calculated from tweet, video, and audio features extracted from tweets and videos. Then, important scenes are detected by estimating the probability of the scene being important from estimated latent features. It should be noted that there exist time-lags between tweets posted by users and videos. To consider the ti… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
4
1

Citation Types

0
7
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
2

Relationship

1
1

Authors

Journals

citations
Cited by 2 publications
(7 citation statements)
references
References 20 publications
0
7
0
Order By: Relevance
“…The encoder, decoder, and important scene predictor consisted of multiple fully connected layers, respectively. Unlike the conventional method [ 14 ] which trains a part of the encoder considering time lags, Tl-LVM can train the entire network considering time lags based on the new loss function. Therefore, this figure does not represent a time-lag aware transformation in the encoder as in the figure of the conventional method.…”
Section: Figurementioning
confidence: 99%
See 4 more Smart Citations
“…The encoder, decoder, and important scene predictor consisted of multiple fully connected layers, respectively. Unlike the conventional method [ 14 ] which trains a part of the encoder considering time lags, Tl-LVM can train the entire network considering time lags based on the new loss function. Therefore, this figure does not represent a time-lag aware transformation in the encoder as in the figure of the conventional method.…”
Section: Figurementioning
confidence: 99%
“…The method in [ 22 ] uses a multimodal variational autoencoder (MVAE) [ 24 ] using tweets and visual information as the latent variable model (LVM) to detect fake news. The effectiveness of using LVM to consider both videos and tweets has also been reported in [ 14 , 16 ]. The MVAE can discover correlations between modalities by learning shared representations.…”
Section: Introductionmentioning
confidence: 98%
See 3 more Smart Citations