New approaches to audio-visual segmentation of TV news for automatic topic retrieval

Iurgel, U.; Meermeier, Ralf; Eickeler, Stefan; Rigoll, Gerhard

doi:10.1109/icassp.2001.941190

Cited by 8 publications

(9 citation statements)

References 2 publications

(3 reference statements)


“…The internal structure of video has been modeled in literature to facilitate the detection of logical unit boundaries. Iurgel [14] proposed a video model for news shows, part of which is shown in Fig. 9.…”

Section: Segmentation Mechanisms (mentioning)

confidence: 99%

“…They consist of topic units with compilation and continuity cutting. Specific news models have been used to capture this type of structure [14,24]. Training and instructional videos focus on teaching and often an instructor is audibly and visibly addressing the audience of the video or interacting with an audience visible in the video.…”

Section: Data Domain and Logical Unit Type (mentioning)

confidence: 99%

“…A continuation identifier, on the other hand, is exploited in the camera work rule, which states if three or more shots have the same camera work (other than still shot) they belong to the same logical unit. In [14] several approaches were presented for news story segmentation, one of which was based on rules. First an HMM was employed to classify shots into one of six classes: begin, end, newscaster, report, interview, and weather forecast.…”

Section: Segmentation Mechanisms (mentioning)

confidence: 99%
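The camera work rule quoted in the citation statement above can be illustrated with a minimal sketch. It assumes that "three or more shots" means three or more consecutive shots and that each shot already carries a camera-work label. The function name `camera_work_units`, the label strings, and the `min_run` and `still_label` parameters are hypothetical and do not come from the cited survey or from Iurgel et al.

```python
from itertools import groupby


def camera_work_units(camera_work, min_run=3, still_label="still"):
    """Return inclusive (start, end) shot-index ranges grouped by the rule.

    Assumption: a run of at least `min_run` consecutive shots with the same
    camera-work label, other than `still_label`, forms one logical unit.
    """
    units = []
    index = 0
    for label, run in groupby(camera_work):
        length = len(list(run))
        if label != still_label and length >= min_run:
            units.append((index, index + length - 1))
        index += length
    return units


# Hypothetical per-shot labels; only the first run qualifies.
shots = ["pan", "pan", "pan", "still", "still", "still", "zoom", "zoom", "tilt"]
print(camera_work_units(shots))  # [(0, 2)]
```

The run of still shots is skipped even though it is three shots long, because the rule as quoted excludes still shots.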

“…Approaches for news story segmentation usually exploit the fact that an introductory shot, usually an anchor shot, is followed by report shots for each news story. A very detailed model for news is proposed in [14]. It uses six content classes which are begin, end, newscaster, interview, report and weather forecast.…”

Section: Underlying Assumptions (mentioning)

confidence: 99%


<title>Logical unit and scene detection: a comparative survey</title>

Petersohn

2008

SPIE Proceedings


Logical units are semantic video segments above the shot level. Depending on the common semantics within the unit and data domain, different types of logical unit extraction algorithms have been presented in literature. Topic units are typically extracted for documentaries or news broadcasts while scenes are extracted for narrative-driven video such as feature films, sitcoms, or cartoons. Other types of logical units are extracted from home video and sports. Different algorithms in literature used for the extraction of logical units are reviewed in this paper based on the categories unit type, data domain, features used, segmentation method, and thresholds applied. A detailed comparative study is presented for the case of extracting scenes from narrative-driven video. While earlier comparative studies focused on scene segmentation methods only or on complete news-story segmentation algorithms, in this paper various visual features and segmentation methods with their thresholding mechanisms and their combination into complete scene detection algorithms are investigated. The performance of the resulting large set of algorithms is then evaluated on a set of video files including feature films, sitcoms, children's shows, a detective story, and cartoons.


“…This fact can be explained by the ability of the visual part to convey a large share of the latent semantics present in a video, as demonstrated in several works (Fabro and Böszörmenyi, 2013; Coimbra, 2011; Iurgel et al., 2001). …”

Section: Scene Segmentation with Visual Descriptors (unclassified)

Detecção de cenas em segmentos semanticamente complexos

Lopes


Dedication: I dedicate this work to my parents and my fiancée, who have always supported and helped me at every moment.

Acknowledgments: I thank God first of all, for having enlightened me throughout the development of this work, giving me the patience and inspiration needed to complete it. I also thank my advisor for the endless advice and always pertinent guidance. I thank the professors of the courses taken during the master's program, who certainly contributed to this research. I thank my colleagues and friends at the research laboratory, who always supported me and gave me strength in moments of discouragement. I thank CNPq for financial support, grant no. 134245/2011-3. I thank FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) for financial support, grant no. 2011/05238-0. "The opinions, hypotheses, and conclusions or recommendations expressed in this material are the responsibility of the author(s) and do not necessarily reflect the views of FAPESP."

Abstract: Several areas of Computer Science (Content Personalization and Adaptation, Information Retrieval, among others) benefit from segmenting video into smaller units of information. The literature presents many methods and techniques whose goal is to identify these units. One limitation is that such techniques do not handle scene detection in semantically complex segments, defined as video excerpts that present more than one subject or theme and whose latent semantics can hardly be determined using a single medium. These segments are highly relevant because they occur in several video domains, such as movies, news broadcasts, and even commercials. This Master's dissertation proposes a video segmentation technique able to identify scenes in semantically complex segments. To do so, it uses the latent semantics obtained with Bag of Visual Words to group the segments of a video. The grouping is multimodal: the visual and audio features of each video are analyzed and the results are combined through a late-fusion strategy. This work demonstrates the technical feasibility of recognizing scenes in semantically complex segments.


The ALERT system: advanced broadcast speech recognition technology for selective dissemination of multimedia information

Rigoll

IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.

