Discriminative model fusion for semantic concept detection and annotation in video

Iyengar, G.; Nock, Harriet J.

doi:10.1145/957013.957065

Cited by 42 publications

(24 citation statements)

References 9 publications


“…The subset is used in the evaluation of high-level feature extraction task in TRECVID 2006. The names and the identity number (ID) of the semantic concepts are listed in Table 1.…”

Section: Extracting Concept Association (mentioning)

confidence: 99%

“…Thus, N classifiers are trained, one for each concept. Following the terms in [2], each classifier is treated as a basis model which plays a similar role to an eigenvector in the eigenspace, and it maps the low-level feature into one component in model score space. Traditionally, the N basis models are treated equally, and the feature dimension in the MBT fusion will be N. In Figure 1, the association of a target concept with other concepts varies over a large range from one concept to another.…”

Section: Exploiting Concept Association For Indexing (mentioning)

confidence: 99%

“…Suppose we have M types of features and N concepts, M*N classifiers should be trained. These classifiers are treated as the bases to map a training sample into M*N-dimensional model score space [1,2,7]. Based on the model score space representation, another classifier is trained for each concept to reach the final classification decision.…”

Section: Introduction (mentioning)

confidence: 99%

“…In the classification stage, a test sample is first mapped into an M*N-dimensional model space vector, and then the final decision is made using the classifier trained on the model score space. This scheme has been proven successful by all systems developed for multimedia semantic concept detection and search in TRECVID [1,2] (see TRECVID workshop papers for details).…”

Section: Introduction (mentioning)

confidence: 99%
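
The two-stage scheme described in the statements above (M*N basis classifiers map a sample into model score space, then one further classifier per concept makes the final decision) can be sketched roughly as follows. This is only an illustration under stated assumptions: the centroid-based basis scorer, the small logistic-regression fusion classifier, the toy data, and all function names are hypothetical stand-ins, not the classifiers used in the cited papers.

```python
import math
import random

M, N = 2, 3               # assumed sizes: M feature types, N concepts
rng = random.Random(0)    # fixed seed so the toy data is reproducible

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_centroid_scorer(vectors, labels):
    """Hypothetical basis model: scores above 0.5 mean 'closer to the positive centroid'."""
    dim = len(vectors[0])
    def centroid(target):
        rows = [v for v, y in zip(vectors, labels) if y == target]
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
    c_pos, c_neg = centroid(1), centroid(0)
    return lambda x: sigmoid(math.dist(x, c_neg) - math.dist(x, c_pos))

def to_score_space(sample, bases):
    """Map one sample (a list of M feature vectors) to an M*N-dimensional score vector."""
    return [bases[m][n](sample[m]) for m in range(M) for n in range(N)]

def train_logistic(X, y, epochs=100, lr=0.5):
    """Stand-in fusion classifier: logistic regression fitted by plain SGD."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy data: 64 samples, each with N binary concept labels and M noisy feature vectors.
labs = [[(i >> n) & 1 for n in range(N)] for i in range(64)]
feats = [[[y + rng.gauss(0, 0.3) for y in lab] for _ in range(M)] for lab in labs]

# Stage 1: one basis model per (feature type, concept) pair, M*N in total.
bases = [[train_centroid_scorer([f[m] for f in feats], [lab[n] for lab in labs])
          for n in range(N)] for m in range(M)]

# Stage 2: one fusion classifier per concept, trained on the score-space vectors.
score_space = [to_score_space(f, bases) for f in feats]
fusers = [train_logistic(score_space, [lab[n] for lab in labs]) for n in range(N)]

def predict(sample):
    """Return one fused confidence per concept for a new sample."""
    vec = to_score_space(sample, bases)
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, vec)) + b) for w, b in fusers]

print(len(score_space[0]))  # dimension of the model score space, M * N
```

For brevity the sketch trains the fusion classifiers on the same samples as the basis models; a held-out set for the second stage would usually be the safer choice.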


Exploiting Concept Association to Boost Multimedia Semantic Concept Detection

Gao, Zhu, Sun (2007)

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07


In this paper we study the efficiency of semantic concept association in multimedia semantic concept detection. We present an approach to automatically learn from the corpus the association strength between pairwise semantic concepts. We discuss two uses of association strength: 1) applying positive concepts with high association strength to select expressive components in the model-based fusion and 2) applying negative concepts with low association strength as filters. We evaluate its efficiency on the task of semantic concept detection on the large-scale news video dataset from the TRECVID 2005 development set. Our experimental results demonstrate that exploiting positive association reduces the feature dimension in the model-based fusion and significantly improves the ranking performance of the system. The mean average precision is increased to 0.215 on the validation set and 0.206 on the evaluation set. Compared to the traditional model-based fusion, the improvement is about 9.1% and 3.5%, respectively. The average feature dimension is reduced from 312 to 43.
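
The abstract does not say how association strength is computed, so the following is only a rough sketch of the general idea: it assumes the phi coefficient (Pearson correlation of binary labels) as the association measure, and the concept names, toy annotations, and thresholds are all hypothetical. Treating strongly negative correlation as the "negative concept" filter is likewise an assumption of this sketch.

```python
import math

def phi(a, b):
    """Phi coefficient between two equal-length binary label lists (assumed measure)."""
    n = len(a)
    n11 = sum(1 for x, y in zip(a, b) if x and y)   # shots where both concepts occur
    n_a, n_b = sum(a), sum(b)
    denom = math.sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    return 0.0 if denom == 0 else (n * n11 - n_a * n_b) / denom

def select_components(target, labels, pos_thresh=0.6, neg_thresh=-0.6):
    """Split the other concepts into strongly positive and strongly negative associates."""
    positive, negative = [], []
    for name, column in labels.items():
        if name == target:
            continue
        strength = phi(labels[target], column)
        if strength >= pos_thresh:
            positive.append(name)   # candidate component for the fusion feature vector
        elif strength <= neg_thresh:
            negative.append(name)   # candidate filter concept
    return positive, negative

# Hypothetical per-shot annotations (1 = concept present in the shot).
labels = {
    "car":    [1, 1, 1, 0, 0, 0, 1, 0],
    "road":   [1, 1, 1, 0, 0, 0, 0, 0],
    "studio": [0, 0, 0, 1, 1, 1, 0, 1],
    "sky":    [1, 0, 1, 0, 1, 0, 1, 0],
}

positive, negative = select_components("car", labels)
print(positive, negative)
```

Only the concepts in `positive` would then contribute components to the target concept's fusion vector, which is how a selection step of this kind can shrink the feature dimension.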


“…There are many applications that combine audio and video, such as speech recognition [7][8][9][10][11][12][13], speaker recognition [14][15][16], biometric verification [17][18][19][20][21], event detection [22][23][24][25], person or object tracking [26][27][28][29][30][31], active speaker localization and tracking [32,33], music content analysis, emotion recognition, video search, human-machine interaction, voice activity detection, and audio source separation [34][35][36]. Clearly, some applications use images of faces, and sometimes even movements of the whole body, not only the face.…”

Section: Introduction (unclassified)

Analysis of multimodal fusion techniques for audio-visual speech recognition

Ivanko, Kipyatkova, Ronzhin et al. 2016

Naučno-teh. vestn. inf. tehnol. meh. opt.


Harmonium Models for Video Classification

Yang, Yan, Liu et al. 2008

Statistical Analysis


In Wiley InterScience (www.interscience.wiley.com). Abstract: Accurate and efficient video classification demands the fusion of multimodal information and the use of intermediate representations. Combining the two ideas into one framework, we propose a series of probabilistic models for video representation and classification using intermediate semantic representations derived from multimodal features of video. On the basis of a class of bipartite undirected graphical models named harmonium, we propose the dual-wing harmonium (DWH) model, which represents video shots as latent semantic topics derived by jointly modeling the transcript keywords and color-histogram features of the data. Our family-of-harmonium (FoH) and hierarchical harmonium (HH) models extend DWH by introducing variables representing category labels of data, which allows data representation and classification to be performed in the same model. Our models are among the few attempts to use undirected graphical models for representing and classifying video data. Experiments on a benchmark video collection show different semantic interpretations of video data under our models, as well as superior classification performance in comparison with several directed models.

