Proceedings DCC 2002. Data Compression Conference
DOI: 10.1109/dcc.2002.999978
A source coding approach to classification by vector quantization and the principle of minimum description length

Abstract: An algorithm for supervised classification using vector quantization and entropy coding is presented. The classification rule is formed from a set of training data {X_i, Y_i}_{i=1}^n, which are independent samples from a joint distribution P_XY. Based on the principle of Minimum Description Length (MDL), a statistical model that approximates the distribution P_XY ought to enable efficient coding of X and Y. On the other hand, we expect a system that encodes (X, Y) efficiently to provide ample information on the distr…
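The MDL criterion sketched in the abstract can be illustrated by choosing a vector-quantizer codebook size k that minimizes a two-part description length: bits to transmit the codebook plus idealized entropy-coding bits for the quantized data. Everything below (the 1-D k-means quantizer, the Gaussian residual code, the 32-bit parameter cost) is an illustrative assumption, not the paper's actual construction.

```python
import math
import random

def kmeans_1d(data, k, iters=25, seed=0):
    """Plain 1-D k-means; a stand-in for a trained vector quantizer."""
    rng = random.Random(seed)
    centers = rng.sample(data, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            j = min(range(k), key=lambda i: abs(x - centers[i]))
            clusters[j].append(x)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

def description_length(data, centers, bits_per_param=32):
    """Two-part MDL: bits to send the codebook, then bits to send the data.

    Data cost = index cost (log2 k per sample) plus an idealized Gaussian
    code for the quantization residuals -- a simple proxy for entropy coding.
    """
    n, k = len(data), len(centers)
    model_bits = k * bits_per_param
    errs = [min((x - c) ** 2 for c in centers) for x in data]
    var = sum(errs) / n + 1e-9
    data_bits = n * math.log2(k) + n * 0.5 * math.log2(2 * math.pi * math.e * var)
    return model_bits + data_bits

# Two well-separated clusters: the MDL sweep should prefer k = 2,
# since extra codewords cost more bits than they save on residuals.
rng = random.Random(1)
data = ([rng.gauss(0, 0.5) for _ in range(100)]
        + [rng.gauss(10, 0.5) for _ in range(100)])
best_k = min(range(1, 7),
             key=lambda k: description_length(data, kmeans_1d(data, k)))
print(best_k)
```

The design choice to make explicit: adding a codeword always shrinks the residual term, so without the model-bits penalty the sweep would degenerate to one codeword per sample; MDL's model cost is what halts that.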

Cited by 6 publications (2 citation statements)
References 17 publications
“…For instance, when detecting a face in an image, features associated with the face often have a low-dimensional structure which is "embedded" as a submanifold in a cloud of essentially random features from the background. Model selection criteria such as the minimum description length (MDL) [28,22] serve as important modifications to MAP for estimating a model across classes of different complexity. MDL selects the model that minimizes the overall coding length of the given (training) data, hence the name "minimum description length" or "minimum coding length" [1].…”
Section: Issues with Learning the Distributions from Training Samples
confidence: 99%
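The coding-length criterion the quoted passage describes can be made concrete with a toy two-part computation (illustrative numbers, not from the cited works): a fitted model is preferred only when its savings on the data outweigh the bits spent describing its parameter.

```python
import math

def code_bits(data, p):
    """Idealized code length, in bits, for a 0/1 sequence under Bernoulli(p)."""
    return sum(-math.log2(p if b else 1.0 - p) for b in data)

data = [1] * 70 + [0] * 30

# Model A: fair coin -- no parameters to transmit.
dl_fair = code_bits(data, 0.5)  # 1 bit per symbol

# Model B: fitted Bernoulli, plus roughly (1/2) log2 n bits to state p_hat
# (the standard precision for one real-valued parameter in two-part MDL).
p_hat = sum(data) / len(data)
dl_fit = code_bits(data, p_hat) + 0.5 * math.log2(len(data))

best = min([("fair", dl_fair), ("fitted", dl_fit)], key=lambda t: t[1])
print(best[0])
```

Here the skewed data make the fitted code shorter even after paying for its parameter, so MDL selects the fitted model; on near-balanced data the fair-coin model would win instead.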
“…However, it is hard to conclude that they are equally interesting. Intuitively, c→ C 0 is the most interesting one, since its antecedent contains only one item and it is favored by the Minimal Description Length principle [8]. The situation for the other four rules is more complicated, and a systematic measure is warranted.…”
Section: Introduction
confidence: 99%