Simultaneous image classification and annotation

Wang, Chong; Blei, David M.; Li, Feifei

doi:10.1109/cvpr.2009.5206800

Cited by 209 publications

(51 citation statements)

References 19 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Blei et.al [8] proposed LDA, a generative probabilistic topic model, which was recently extended to Supervised LDA (SLDA) [11] which includes the response variable in the model itself. Figure 1 shows the graphical models for LDA and SLDA models as adapted to Latent Facial Topics.…”

Section: Extracting Lfts Using Topic Modelsmentioning

confidence: 99%

“…shape or appearance features, which we call base features) obtained from face images. We used 2 popular probabilistic topic models, Latent Dirichlet Allocation (LDA) [8] which is unsupervised and Supervised Latent Dirichlet Allocation [11] which is supervised, to extract LFTs from these facial documents. SLDA can directly predict continuous values and is used for continuous emotion recognition whereas LDA is unsupervised and we need an extra classifier or regressor to predict discrete or continuous emotions.…”

Section: Extracting Lfts Using Topic Modelsmentioning

confidence: 99%

“…We seek to use probabilistic topic models (LDA in particular) to automatically discover Latent Facial Topics (LFTs) for affect recognition in discrete and continuous spaces. While topic models have traditionally been unsupervised, there have been recent efforts in developing supervised topic models like supervised LDA (SLDA) for regression [11]. We also describe how our method can be incorporated with supervised topic models (supervised LDA) to directly use Latent Facial Topics for predicting continuous emotional responses directly using the model itself.…”

Section: Introductionmentioning

confidence: 97%

See 2 more Smart Citations

Latent Facial Topics for affect analysis

Lade

Balasubramanian

Panchanathan

2013

2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

View full text Add to dashboard Cite

Recent years have seen a growing need in the affective computing community to understand an emotion space beyond the seven basic expressions, leading to explorations of an emotion space continuum spanned by dimensions such as valence and arousal. While there has been substantial research in the identification of facial Action Units as building blocks for the basic expressions, there is a new need to discover fine-grained facial descriptors that can explain the variations in the continuum of emotions. We propose a methodology to extract Latent Facial Topics (LFTs) from facial videos, by adapting Latent Dirichlet Allocation and supervised Latent Dirichlet Allocation topic models for facial affect analysis. In this work, we study the application of topic models to both discrete emotion recognition as well as continuous emotion prediction tasks. We show that meaningful and visualizable LFTs can be extracted and used successfully for emotion recognition. We report our recognition results on the widely known Cohn Kanade Plus and AVEC 2012 FCSC challenge data sets, which have shown promise for both discrete and continuous emotion recognition problems.

show abstract

Section: Extracting Lfts Using Topic Modelsmentioning

confidence: 99%

Section: Extracting Lfts Using Topic Modelsmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 97%

See 1 more Smart Citation

Latent Facial Topics for affect analysis

Lade

Balasubramanian

Panchanathan

2013

2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

View full text Add to dashboard Cite

show abstract

“…Motivated by this, in this paper we propose a histogram of the SIFT codewords as a target appearance model. The SIFT [11] descriptors are robust in illumination and viewpoint changes, successful in various tasks including image classification, matching, and annotation [12,13]. We follow the standard protocol to assign a SIFT codeword to each point in a densely sampled grid.…”

Section: Introductionmentioning

confidence: 99%

Adaptive Bayesian Object Tracking with Histograms of Dense Local Image Descriptors

Kim

2016

IJFIS

View full text Add to dashboard Cite

Dense local image descriptors like SIFT are fruitful for capturing salient information about image, shown to be successful in various image-related tasks when formed in bag-of-words representation (i.e., histograms). In this paper we consider to utilize these dense local descriptors in the object tracking problem. A notable aspect of our tracker is that instead of adopting a point estimate for the target model, we account for uncertainty in data noise and model incompleteness by maintaining a distribution over plausible candidate models within the Bayesian framework. The target model is also updated adaptively by the principled Bayesian posterior inference, which admits a closed form within our Dirichlet prior modeling. With empirical evaluations on some video datasets, the proposed method is shown to yield more accurate tracking than baseline histogram-based trackers with the same types of features, often being superior to the appearance-based (visual) trackers.

show abstract

“…They have been widely used to help people understand and navigate document collections (Blei et al, 2003), multilingual collections (Hu et al, 2014), images (Chong et al, 2009), networks (Chang and Blei, 2009;Yang et al, 2016), etc. Probabilistic topic modeling usually requires computing a posterior distribution over thousands or millions of latent variables, which is often intractable.…”

mentioning

confidence: 99%

Why ADAGRAD Fails for Online Topic Modeling

Lü

Lund

Boyd-Graber

2017

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

View full text Add to dashboard Cite

Online topic modeling, i.e., topic modeling with stochastic variational inference, is a powerful and efficient technique for analyzing large datasets, and ADAGRAD is a widely-used technique for tuning learning rates during online gradient optimization. However, these two techniques do not work well together. We show that this is because ADAGRAD uses accumulation of previous gradients as the learning rates' denominators. For online topic modeling, the magnitude of gradients is very large. It causes learning rates to shrink very quickly, so the parameters cannot fully converge until the training ends.Probabilistic topic models (Blei, 2012) are popular algorithms for uncovering hidden thematic structure in text. They have been widely used to help people understand and navigate document collections (Blei et al., 2003), multilingual collections (Hu et al., 2014), images (Chong et al., 2009), networks (Chang andYang et al., 2016), etc. Probabilistic topic modeling usually requires computing a posterior distribution over thousands or millions of latent variables, which is often intractable. Variational inference (Blei et al., 2016, VI) approximates posterior distributions. Stochastic variational inference (Hoffman et al., 2013, SVI) is its natural online extension and enables the analysis of large datasets.Online topic models (Hoffman et al., 2010;Bryant and Sudderth, 2012;Paisley et al., 2015) optimize the global parameters of interest using stochastic gradient ascent. At each iteration, they sample data points to estimate the gradient. In practice, the sample has only a small percentage of the vocabulary. The resulting sparse gradients hurt performance. ADAGRAD (Duchi et al., 2011) is designed for high dimensional online optimization problems and adjusts learning rates for each dimension, favoring rare features. This makes ADAGRAD well-suited for tasks with sparse gradients such as distributed deep networks (Dean et al., 2012), forward-backward splitting (Duchi and Singer, 2009), and regularized dual averaging methods (Xiao, 2010).Thus, it may seem reasonable to apply ADA-GRAD to optimize online topic models. However, ADAGRAD is not suitable for online topic models (Section 1). This is because to get a topic model, the training algorithm must break the symmetry between parameters of words that are highly related to the topic and words that are not related to the topic. Before the algorithm converges, the magnitude of gradients of the parameters are very large. Since ADAGRAD uses the accumulation of previous gradients as learning rates' denominators, the learning rates shrink very quickly. Thus, the algorithm cannot break the symmetry quickly. We provide solutions for this problem. Two alternative learning rate methods, i.e., ADADELTA (Zeiler, 2012) and ADAM (Kingma and Ba, 2014), can address this incompatibility with online topic models. When the dataset is small enough, e.g., a corpus with only hundreds of documents, ADAGRAD can still work.

show abstract

Simultaneous image classification and annotation

Cited by 209 publications

References 19 publications

Latent Facial Topics for affect analysis

Latent Facial Topics for affect analysis

Adaptive Bayesian Object Tracking with Histograms of Dense Local Image Descriptors

Why ADAGRAD Fails for Online Topic Modeling

Contact Info

Product

Resources

About