In this article, we first propose a novel unsupervised learning method based on a hierarchical Dirichlet process mixture of shifted-scaled Dirichlet (SSD) distributions. We then extend it to a hierarchical Pitman-Yor process mixture of SSD distributions. The goal is to find a model that properly fits complex real-world data. Our models are built on SSD distributions, which are more flexible than the Dirichlet distribution for fitting proportional data. The proposed methods perform data fitting (parameter estimation) and model selection (determination of model complexity) simultaneously. We learn the models with both batch and online variational inference; the online setting allows the models to be fed with large-scale streaming data. The effectiveness of the proposed models is evaluated on four realistic and challenging applications, namely spam email detection, texture clustering, traffic sign detection, and vehicle detection. Experimental results demonstrate the potential of our models to fit proportional data.
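To illustrate how the two nonparametric priors named above differ, the following is a minimal sketch of the standard truncated stick-breaking construction of mixture weights. It is not the authors' implementation: it covers neither the hierarchical structure, the SSD component densities, nor variational inference. The function name `stick_breaking_weights` and its parameters (`num_sticks`, `concentration`, `discount`) are hypothetical names chosen for this illustration. Setting `discount` to 0 gives the Dirichlet process construction, while a value strictly between 0 and 1 gives the Pitman-Yor construction.

```python
import random


def stick_breaking_weights(num_sticks, concentration, discount=0.0, seed=0):
    """Draw truncated stick-breaking mixture weights (illustrative sketch).

    discount == 0     -> Dirichlet process: v_k ~ Beta(1, concentration)
    0 < discount < 1  -> Pitman-Yor process:
                         v_k ~ Beta(1 - discount, concentration + k * discount)
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount or concentration + discount <= 0.0:
        raise ValueError("concentration must be greater than -discount")

    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for k in range(1, num_sticks + 1):
        if k == num_sticks:
            # Truncation: the last component takes all remaining mass,
            # so the weights sum to one.
            v = 1.0
        else:
            v = rng.betavariate(1.0 - discount, concentration + k * discount)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


if __name__ == "__main__":
    dp = stick_breaking_weights(10, concentration=1.0)
    py = stick_breaking_weights(10, concentration=1.0, discount=0.5)
    print("Dirichlet process weights:", [round(w, 3) for w in dp])
    print("Pitman-Yor weights:      ", [round(w, 3) for w in py])
```

In a truncated mixture of this kind, components whose weights shrink toward zero are effectively unused, which is one common way that model complexity is determined together with parameter estimation.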