2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) 2008
DOI: 10.1109/ijcnn.2008.4634086
A formula of equations of states in singular learning machines

Abstract: Almost all learning machines used in computational intelligence are not regular but singular statistical models, because they are nonidentifiable and their Fisher information matrices are singular. In singular learning machines, the Bayes a posteriori distribution does not converge to the normal distribution, nor does the maximum likelihood estimator satisfy asymptotic normality; as a result, it has been difficult to estimate generalization performance. In this paper, we establish a formula of equations of …

Cited by 6 publications (9 citation statements)
References 25 publications
“…To address that problem, Vehtari et al (2017) proposed Pareto smoothed importance sampling, a new algorithm for regularizing importance weights, and developed a numerical tool (Vehtari et al, 2018) to facilitate computation. Watanabe (2010) established a singular learning theory and proposed a new criterion named Watanabe-Akaike (Gelman et al, 2014), or widely applicable information criterion (WAIC; Watanabe 2008, 2009), where WAIC 1 was proposed for the plug-in discrepancy and WAIC 2 for the posterior averaged discrepancy. However, compared with BPIC and PAIC, we found that WAIC 2 tends to have larger bias and variation for regular Bayesian models, as shown in simulation studies in the next section.…”
Section: Posterior Averaging Information Criterion
confidence: 99%
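The distinction the excerpt draws between WAIC 1 (plug-in discrepancy) and WAIC 2 (posterior averaged discrepancy) can be illustrated with a minimal sketch following the two penalty variants described by Gelman et al (2014). The `loglik` array here is a simulated stand-in; in practice it would hold log p(x_i | w_s) for each data point at each posterior draw from a sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: log p(x_i | w_s) for n data points evaluated at
# S posterior draws w_s (simulated here purely for illustration).
n, S = 50, 2000
loglik = rng.normal(loc=-1.0, scale=0.3, size=(S, n))

# Pointwise log posterior-predictive density: log E_posterior[p(x_i | w)]
lppd_i = np.log(np.mean(np.exp(loglik), axis=0))

# Two penalty variants: p_waic1 uses the gap between the averaged
# predictive density and the posterior mean log-likelihood; p_waic2
# uses the posterior variance of the log-likelihood at each point.
p_waic1 = 2.0 * np.sum(lppd_i - np.mean(loglik, axis=0))
p_waic2 = np.sum(np.var(loglik, axis=0, ddof=1))

# WAIC on the deviance scale for each penalty variant
waic1 = -2.0 * (np.sum(lppd_i) - p_waic1)
waic2 = -2.0 * (np.sum(lppd_i) - p_waic2)
```

By Jensen's inequality the averaged predictive density dominates the mean log-likelihood, so both penalties are nonnegative; the excerpt's point is that the variance-based penalty (WAIC 2) can behave worse for regular models.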
“…The concept of the functional variance was first proposed in the papers [18,19,20,21]. In this paper, we show that the functional variance plays an important role in learning. In theoretical analysis, we assume some conditions on the true distribution and the learning machine.…”
Section: Bayes Learning
confidence: 99%
“…In the previous papers [18,19,20,21,22], we studied the case in which the true distribution is parametrizable and singular, and proved new formulas that enable us to estimate the generalization loss from the training loss and the functional variance. Since the new formulas hold for an arbitrary set of a true distribution, a learning machine, and an a priori distribution, they are called equations of states in statistical estimation.…”
Section: Introduction
confidence: 99%
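The relation described in the excerpt, estimating the generalization loss from the training loss plus the functional variance, can be sketched as follows. This is a minimal illustration, not the paper's derivation: the `loglik` array is simulated, and it assumes posterior log-likelihood draws are already available.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inputs: log p(x_i | w_s) for n data points at S posterior
# draws (simulated as a stand-in for real sampler output).
n, S = 100, 4000
loglik = rng.normal(loc=-1.2, scale=0.2, size=(S, n))

# Training loss T_n: minus the mean log posterior-predictive density,
# computable directly from the training data.
T_n = -np.mean(np.log(np.mean(np.exp(loglik), axis=0)))

# Functional variance V_n: sum over data points of the posterior
# variance of the log-likelihood.
V_n = np.sum(np.var(loglik, axis=0))

# Equation-of-states style estimate: the generalization loss is
# approximated by training quantities alone, G_n ≈ T_n + V_n / n.
G_n_estimate = T_n + V_n / n
```

The point of the formula is that every quantity on the right-hand side is computable from the training sample and the posterior, with no held-out data.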
“…Statistical models having singular regions are called singular models, and many useful models such as MLPs, RBFs, Gaussian mixtures, and HMMs are singular models (Watanabe, 2009). The learning theory of singular models has been studied intensively (Watanabe, 2008; Watanabe, 2009); however, empirical studies of singular models have scarcely been done. As partial knowledge, we know that an MLP search space has extensive flat areas and troughs (Hecht-Nielsen, 1990), and that most points along a search route have huge condition numbers (Nakano et al, 2011).…”
Section: Introduction
confidence: 99%