1997
DOI: 10.1006/jmva.1997.1687

Classification of Binary Vectors by Stochastic Complexity

Abstract: Stochastic complexity is treated as a tool of classification, i.e., of inferring the number of classes, the class descriptions, and the class memberships for a given data set of binary vectors. The stochastic complexity is evaluated with respect to the family of statistical models defined by finite mixtures of multivariate Bernoulli distributions obtained by the principle of maximum entropy. It is shown that stochastic complexity is asymptotically related to the classification maximum likelihood estimate. The …
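
As a rough illustration of the kind of quantity the abstract refers to, the following minimal sketch computes the negative maximized classification log-likelihood (in bits) of a hard partition of binary vectors, assuming each class is modelled by independent Bernoulli components with class proportions estimated from the data. This is not the paper's stochastic complexity formula, which is not reproduced on this page; it is only the classification maximum likelihood term to which the abstract says stochastic complexity is asymptotically related. The function name `classification_code_length` and the helper `_xlogx` are hypothetical.

```python
import math
from collections import defaultdict


def _xlogx(p):
    """Return p * log2(p), using the convention 0 * log2(0) = 0."""
    return p * math.log2(p) if p > 0.0 else 0.0


def classification_code_length(vectors, labels):
    """Negative maximized classification log-likelihood, in bits.

    Assumption (not taken from the paper): each class is an independent
    multivariate Bernoulli model, and both the per-class bit probabilities
    and the class proportions are set to their maximum likelihood values.
    """
    n = len(vectors)
    if n == 0 or n != len(labels):
        raise ValueError("need one label per vector and at least one vector")
    d = len(vectors[0])

    # Group the binary vectors by their assigned class.
    groups = defaultdict(list)
    for v, c in zip(vectors, labels):
        groups[c].append(v)

    bits = 0.0
    for members in groups.values():
        n_c = len(members)
        # Cost of stating the class membership of each vector in this class.
        bits -= n_c * math.log2(n_c / n)
        # Cost of the vectors given the class: one Bernoulli term per column.
        for j in range(d):
            theta = sum(v[j] for v in members) / n_c
            bits -= n_c * (_xlogx(theta) + _xlogx(1.0 - theta))
    return bits


data = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 1)]
two_classes = classification_code_length(data, [0, 0, 1, 1])
one_class = classification_code_length(data, [0, 0, 0, 0])
```

On this toy data the two-class partition costs 4 bits (the vectors within each class are identical, so only the class labels cost anything), while lumping everything into one class costs 12 bits. A full stochastic complexity criterion would add a penalty for the model parameters, which this sketch deliberately omits.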

Citations: cited by 31 publications (18 citation statements)
References: 30 publications
“…Binary data clustering has been widely studied in literature [25,29,33,42]. A unified view of binary data clustering has been provided by examining the connections among various methods including entropy-based methods, distance-based methods (e.g., K-means), mixture models, and matrix decomposition [38,39].…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…The first term in equation (3) describes the complexity of the classification and the second term the complexity of the strains with respect to the classification. Gyllenberg et al (1994b) also showed that minimizing the SC with respect to the model (2) amounts to maximizing the information content of the classification.…”
Section: Description Of Classes
Citation type: mentioning
confidence: 99%
“…Gyllenberg et al (1994b) showed that minimizing SC amounts to maximizing the information content of the classification. Thus increasing SC implies loss of information whereas decreasing SC indicates gain in information content.…”
Section: A Good Classification Should Have An Information
Citation type: mentioning
confidence: 99%