2004
DOI: 10.1016/j.cag.2004.03.002
Revealing the connoted visual code: a new approach to video classification

Cited by 4 publications (5 citation statements)
References 6 publications
“…Each cluster representative (typically the centroid) is considered as a visual word of the visual dictionary. The K-means clustering algorithm [1,4] is the most common method to create such visual dictionaries, even though other unsupervised methods such as K-median clustering [5], mean-shift clustering [6], hierarchical K-means [7], agglomerative clustering [8], radius-based clustering [6,9], or regular lattice-based strategies [10] have also been used. One of the common features of these unsupervised methods is that they only optimize an objective function fitting the data while ignoring their class information.…”
Section: Introduction
Citation type: mentioning, confidence: 99%
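The dictionary-construction step this excerpt describes (pool local descriptors from training images, cluster them, keep the cluster centres as visual words) can be sketched as follows. This is a minimal illustration, not the cited papers' code: the function name, the scikit-learn K-means implementation, and the dictionary size are all assumptions.

# Minimal sketch of visual-dictionary construction by K-means, as described above.
# Names and parameter values are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def build_visual_dictionary(descriptors: np.ndarray, n_words: int = 500) -> np.ndarray:
    """Cluster pooled local descriptors and return the centroids as visual words.

    descriptors: array of shape (n_descriptors, descriptor_dim), pooled over
                 the whole training set.
    n_words:     dictionary size K (a free design parameter).
    """
    kmeans = KMeans(n_clusters=n_words, n_init=10, random_state=0)
    kmeans.fit(descriptors)
    # Each cluster centre is one visual word. Note that this objective only
    # fits the descriptor distribution and ignores class labels, which is the
    # limitation the excerpt points out.
    return kmeans.cluster_centers_

# Example with synthetic 128-D descriptors (the dimensionality of SIFT):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_descriptors = rng.normal(size=(10_000, 128)).astype(np.float32)
    dictionary = build_visual_dictionary(fake_descriptors, n_words=100)
    print(dictionary.shape)  # (100, 128)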
“…e.g., [17]. Also related is work on video classification that makes use of connoted visual codes [18]. The output of the approach is a class label that can be used as a tag for annotating the video.…”
Section: Tags Derived From Audiovisual Content
Citation type: mentioning, confidence: 99%
“…Generally, cluster centers are considered as visual words. They are usually extracted by K-means-based algorithms [1,3], even though other approaches have been applied, such as k-median clustering [4], mean-shift clustering [5], hierarchical K-means [6], agglomerative clustering [7], randomized trees [8], radius-based clustering [5,9], or regular lattice-based strategies [10]. Thanks to the vector quantization, a given image can then be mapped into this new space of visual words, leading to a bag of visual words, where each word can be weighted either according to its frequency or using more sophisticated techniques.…”
Section: Introduction
Citation type: mentioning, confidence: 99%
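The vector-quantization step this excerpt adds (assign each descriptor of an image to its nearest visual word, then weight words by frequency) can be sketched in the same spirit. This is a hypothetical illustration with plain term-frequency weighting; the "more sophisticated techniques" the authors mention are not reproduced here.

# Minimal sketch of mapping one image's descriptors to a bag-of-visual-words
# histogram via nearest-word (hard) assignment. Names are illustrative assumptions.
import numpy as np

def bag_of_visual_words(image_descriptors: np.ndarray,
                        dictionary: np.ndarray) -> np.ndarray:
    """Map an image's descriptors to a normalized visual-word histogram.

    image_descriptors: (n, d) local descriptors of one image.
    dictionary:        (k, d) visual words, e.g. K-means centroids.
    """
    # Squared Euclidean distance between every descriptor and every visual word.
    dists = ((image_descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)  # hard assignment to the closest word
    hist = np.bincount(nearest, minlength=len(dictionary)).astype(np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist  # frequency weighting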