1996
DOI: 10.1109/78.553484

A global optimization technique for statistical classifier design

Abstract: A global optimization method is introduced for the design of statistical classifiers that minimize the rate of misclassification. We first derive the theoretical basis for the method, based on which we develop a novel design algorithm and demonstrate its effectiveness and superior performance in the design of practical classifiers for some of the most popular structures currently in use. The method, grounded in ideas from statistical physics and information theory, extends the deterministic annealing approach for op…

Cited by 74 publications (60 citation statements)
References 51 publications
“…Though previous research proposing solutions for this problem exists (e.g., [31], [32]), it deals with the case of a small number of classes and is not directly applicable to our case. The key idea of learning algorithms minimizing classification error is to replace the discrete misclassification cost function with some smooth approximation in order to be able to take a derivative of the cost function and perform gradient descent optimization.…”
Section: A Description of Used Algorithms (mentioning)
confidence: 99%
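The smoothing idea in the excerpt above can be sketched numerically. This is a minimal illustration under assumptions of my own (hypothetical toy data, a simple linear discriminant, and a sigmoid surrogate), not the method of the cited papers: the discrete 0-1 misclassification count is replaced by a steep sigmoid of the classification margin, which makes the cost differentiable so plain gradient descent applies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-class toy data: the label is the sign of the first
# feature plus a little noise.
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + 0.1 * rng.normal(size=200) > 0, 1.0, -1.0)

def smooth_error(w, X, y, beta=5.0):
    """Sigmoid-smoothed misclassification rate.

    The discrete loss mean(y * (X @ w) < 0) has zero gradient almost
    everywhere; replacing the step with a steep sigmoid of the margin
    makes it differentiable.
    """
    margins = y * (X @ w)
    return np.mean(1.0 / (1.0 + np.exp(beta * margins)))

def grad_smooth_error(w, X, y, beta=5.0):
    """Gradient of smooth_error with respect to w."""
    margins = y * (X @ w)
    s = 1.0 / (1.0 + np.exp(beta * margins))
    # d/dw of mean sigmoid(-beta * margin), with margin = y * (x . w)
    return (-beta * s * (1.0 - s) * y) @ X / len(y)

w = np.zeros(2)
for _ in range(500):
    w -= 0.5 * grad_smooth_error(w, X, y)

error_rate = np.mean(y * (X @ w) < 0)  # discrete error after training
```

The smoothed cost is a surrogate: descending it drives the discrete error rate down without ever differentiating the step function itself.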
“…In general, higher correlations assist the bit-mapper as it uses all the received bits to correct errors, unlike the grouping approach, which is forced to use only the bits within each group. From 4(b), it also follows that the overhead required to store the Bayesian network grows at higher N and the performance degrades further, making the Bayesian network approach impractical for very large networks. Also, for these datasets, observe that the performance of the greedy iterative-descent method is considerably poorer than that using DA.…”
Section: Complexity-Distortion Trade-off (mentioning)
confidence: 99%
“…Further, we impose the 'nearest prototype' structural constraint on the bit-mapper partitions by appropriately choosing a parametrization of the association probabilities. Similar methods have been used before in the context of tree-structured quantizer design [13], generalized VQ design [11] and optimal classifier design [7]. It can be shown using the principle of entropy maximization (refer to [13]) that, to impose a 'nearest prototype' structure, at each temperature the association probabilities must be governed by the Gibbs distribution:…”
Section: Deterministic Annealing Based Design (mentioning)
confidence: 99%
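The Gibbs-distribution association rule described above can be sketched in a few lines. The prototype locations and temperatures below are made-up illustrations: with p(j|x) ∝ exp(-‖x − m_j‖² / T), the associations are nearly uniform at high temperature and concentrate on the nearest prototype as T → 0, which is how the annealing schedule gradually imposes the 'nearest prototype' structure.

```python
import numpy as np

def gibbs_associations(x, prototypes, T):
    """Association probabilities p(j|x) proportional to exp(-||x - m_j||^2 / T).

    High T: nearly uniform over prototypes.  T -> 0: probability mass
    collapses onto the nearest prototype (hard assignment).
    """
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    logits = -(d2 - d2.min()) / T   # subtract the min for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Hypothetical prototypes and query point, for illustration only.
prototypes = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
x = np.array([1.0, 0.2])

hot = gibbs_associations(x, prototypes, T=100.0)   # nearly uniform
cold = gibbs_associations(x, prototypes, T=0.01)   # ~one-hot on nearest
```

Subtracting the minimum distance before exponentiating leaves the normalized probabilities unchanged but avoids underflow at low temperature.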
“…The deterministic annealing (DA) technique has demonstrated substantial performance improvements on clustering, classification and constrained optimization problems [1,2,3,4,5]. Since DA is strongly motivated by analogies to statistical physics [6], it regards the optimization problem in question as a thermal system.…”
Section: Introduction (mentioning)
confidence: 99%
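The thermal-system view can be made concrete with a small clustering sketch. The data, annealing schedule, and constants below are illustrative assumptions of mine, not taken from the cited works: soft assignments are Gibbs probabilities at temperature T, centroids are updated against them, and the system is gradually cooled so the near-coincident centroids split as T drops below the critical temperature.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated hypothetical clusters.
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(3.0, 0.3, (50, 2))])

def da_cluster(X, k=2, T0=10.0, Tmin=0.01, cooling=0.8, iters=20):
    """Deterministic-annealing clustering sketch.

    At each temperature, alternate Gibbs (soft) assignments and
    weighted-centroid updates; then cool.  Centroids start almost
    coincident at the data mean and separate as T decreases.
    """
    m = X.mean(0) + 1e-3 * rng.normal(size=(k, X.shape[1]))
    T = T0
    while T > Tmin:
        for _ in range(iters):
            d2 = ((X[:, None, :] - m[None]) ** 2).sum(-1)       # (n, k)
            p = np.exp(-(d2 - d2.min(1, keepdims=True)) / T)    # Gibbs weights
            p /= p.sum(1, keepdims=True)                        # soft assignments
            m = (p.T @ X) / p.sum(0)[:, None]                   # weighted centroids
        T *= cooling
    return m

centers = da_cluster(X)
```

At high T every point is shared almost equally among centroids, so the "energy" landscape is smooth; cooling sharpens the assignments toward hard nearest-centroid clustering, which is the sense in which DA treats optimization as a thermal system.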