2003
DOI: 10.1007/3-540-45113-7_23

Detection of Documentary Scene Changes by Audio-Visual Fusion

Abstract: The concept of a documentary scene was inferred from the audio-visual characteristics of certain documentary videos. It was observed that the amount of information from the visual component alone was not enough to convey a semantic context to most portions of these videos, but a joint observation of the visual component and the audio component conveyed a better semantic context. From the observations that we made on the video data, we generated an audio score and a visual score. We later generated a …

Cited by 10 publications (6 citation statements)
References 9 publications
“…In the cited work, a Bayesian approach to determine biases of experts has been used to guide the fusion of audio and video streams. Fusion of audio and video modalities has been found to be useful for detection of documentary scene changes [26]. In [29], fusion of text and video has been proposed for story segmentation in news video.…”
Section: Multimedia Fusion
mentioning
confidence: 99%
“…This approach treats the features as $m$ modalities, with features in the $t$-th modality ($1 \cdots m$). Most work in image and video retrieval analysis (e.g., [2,13,26,28,31]) employs this approach. For example, the QBIC system [13] supported image queries based on combining distances from the color and texture modalities.…”
mentioning
confidence: 99%
“…For example, the QBIC system [13] supported image queries based on combining distances from the color and texture modalities. Velivelli et al [31] separated video features into audio and visual modalities. IBM video analysis [2] also regarded each media track (visual, audio, textual, etc.)…”
mentioning
confidence: 99%
“…The scene is defined in different ways in the literature, since it is a subjective concept. Velivelli et al. (2003), for example, define a scene as a collection of shots that are temporally unified or that occur in the same location.…”
Section: Video Structure
unclassified
“…Velivelli et al. (2003), for example, define a scene as a collection of shots that are temporally unified or that occur in the same location. However, according to Choi and Lee (2010), the currently accepted definition of a scene is: a set of shots that portrays a single idea, theme, or concept, without limitations of time or space.…”
Section: Video Structure
unclassified