1990
DOI: 10.1162/neco.1990.2.3.374

Exhaustive Learning

Abstract: Exhaustive exploration of an ensemble of networks is used to model learning and generalization in layered neural networks. A simple Boolean learning problem involving networks with binary weights is numerically solved to obtain the entropy S_m and the average generalization ability G_m as a function of the size m of the training set. Learning curves G_m vs m are shown to depend solely on the distribution of generalization abilities over the ensemble of networks. Such distribution is determined prior to learning…
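The procedure the abstract describes — enumerate every binary-weight network, keep those consistent with the training set, and measure the entropy and average generalization of what survives — is simple enough to sketch. Below is a minimal Python illustration; it uses a single threshold unit rather than the paper's layered networks, and the ensemble size N, the ±1 coding, and the teacher-network target are illustrative assumptions, not details from the paper.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
N = 8  # number of binary weights; kept tiny so exhaustive enumeration stays feasible

def output(w, x):
    # Boolean output of the network: sign of the weighted sum.
    return 1 if np.dot(w, x) >= 0 else -1

# The full ensemble of 2^N binary-weight networks.
ensemble = [np.array(w) for w in itertools.product([-1, 1], repeat=N)]
target = ensemble[rng.integers(len(ensemble))]  # one network acts as the teacher

# All 2^N Boolean input patterns, so generalization can be measured exactly.
inputs = [np.array(x) for x in itertools.product([-1, 1], repeat=N)]

def generalization(w):
    # Fraction of all inputs on which network w agrees with the target.
    return np.mean([output(w, x) == output(target, x) for x in inputs])

for m in (0, 4, 8, 12, 16):
    idx = rng.choice(len(inputs), size=m, replace=False)
    train = [inputs[i] for i in idx]
    # Exhaustive learning: keep every network consistent with the m examples.
    consistent = [w for w in ensemble
                  if all(output(w, x) == output(target, x) for x in train)]
    S_m = np.log2(len(consistent))   # entropy of the surviving ensemble
    G_m = np.mean([generalization(w) for w in consistent])
    print(f"m={m:2d}  S_m={S_m:5.2f} bits  G_m={G_m:.3f}")
```

As in the paper's setup, G_m depends only on how generalization abilities are distributed over the initial ensemble, since consistent networks are simply filtered, never retrained.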

Cited by 61 publications (17 citation statements)
References 5 publications
“…Here too rational convergence appears to be a natural form, as suggested in general by [1], proved more rigorously by [12], and demonstrated in a specific case study by [19]. Given the much stronger assumptions of this model, it is not too surprising that many researchers have indicated a dichotomy between rational and exponential average-case learning curves [3,23,24]. The common suggestion is that this dichotomy is determined by the existence of "gaps" between target concepts.…”
Section: Significance and Related Work
confidence: 82%
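For context, the "rational vs. exponential" dichotomy refers to the two standard asymptotic shapes for the generalization error as a function of the training-set size m; the forms below are the generic textbook shapes, not formulas quoted from the cited works:

\[
\epsilon(m) \sim \frac{a}{m} \quad \text{(rational)}
\qquad \text{vs.} \qquad
\epsilon(m) \sim b\,e^{-m/m_0} \quad \text{(exponential)}.
\]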
“…In fact, it is not even obvious what property distinguishes rational from exponential convergence for simple chains in this case. It has often been suggested [3,23,24] that density in the metric d_P should distinguish rational from exponential worst-case learning curves. However, this suggestion can easily be shown to be false: Consider a concept chain (C, P) where C = {c_q = [0, q] : q ∈ Q[0, 1]} consists of rational initial segments [0, q] of the unit interval X = [0, 1], and P is any distribution on [0, 1] such that P(q) > 0 for all (and only) rational points q ∈ Q[0, 1].…”
Section: Discussion and Research Directions
confidence: 99%
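The metric d_P above is presumably the usual probability-of-symmetric-difference metric on concepts; as a reminder, its standard definition (assumed here, not quoted from the source) is

\[
d_P(c_1, c_2) \;=\; P(c_1 \,\triangle\, c_2) \;=\; P\bigl((c_1 \setminus c_2) \cup (c_2 \setminus c_1)\bigr),
\]

i.e. the probability that the two concepts disagree on a random example drawn from P.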
“…Our formalism can be used to give a classification of the large-α asymptotics of scaled learning curves, thus completing a classification program that has been suggested by several researchers (Amari et al., 1992; Schwartz et al., 1990; Seung et al., 1992). From Eq.…”
Section: Large-α Asymptotics of Scaled Learning Curves
confidence: 92%
“…In this paper, we show that ideas from statistical mechanics (namely, the annealed approximation (Amari et al., 1992; Levin et al., 1989; Schwartz et al., 1990; Sompolinsky et al., 1991) and the thermodynamic limit (Sompolinsky et al., 1991)) can be used as the basis of a mathematically precise and rigorous theory of learning curves. This theory will be distribution-specific, but will not attempt to force a power-law form on learning curves.…”
Section: Introduction
confidence: 99%
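In its standard statistical-mechanics form (a generic statement of the approximation, not a quotation from this source), the annealed approximation replaces the quenched average of the log partition function by the log of the averaged partition function,

\[
\langle \ln Z \rangle \;\approx\; \ln \langle Z \rangle,
\]

where, in the learning-curve setting, Z is the volume of networks consistent with the training data and the angle brackets denote an average over training sets.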
“…Figure 4 shows that this bound is indeed well above the actual learning curves if the number of weights is used as an approximate value of d_VC. The learning curves are also well described by ε ∝ 1/(M + M₀), predicted by statistical learning theories for problems with a continuous generalization spectrum (Schwartz et al. 1990), and the algorithms are equally robust to limited training sets.…”
Section: Monte Carlo Studies
confidence: 73%
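The rational form quoted above is straightforward to fit numerically. A minimal sketch, assuming SciPy is available and using synthetic data in place of the cited Monte Carlo results:

```python
import numpy as np
from scipy.optimize import curve_fit

def rational_curve(m, c, m0):
    # Rational learning-curve ansatz: generalization error ~ c / (m + m0).
    return c / (m + m0)

# Illustrative synthetic "learning curve": true parameters c=5, m0=12 plus noise.
m = np.arange(1, 101, dtype=float)
eps = 5.0 / (m + 12.0) + 0.002 * np.random.default_rng(1).standard_normal(m.size)

(c_fit, m0_fit), _ = curve_fit(rational_curve, m, eps, p0=(1.0, 1.0))
print(f"fitted c = {c_fit:.2f}, m0 = {m0_fit:.2f}")
```

The fitted M₀ plays the role of an effective offset: for small training sets the error plateaus near c/M₀, while for large ones it decays as c/M, the rational behavior predicted by Schwartz et al. (1990) for a continuous generalization spectrum.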