W e present the theory and some results of a new algorithm f o r Artificial Neural Nets which behaves well on complex data s e t s . T h e algorithm uses adaptive quadratic f o r m s as discriminant functions and is very fast compared with Back Propagation-improvements of four orders of magnitude have been obrained. 0 IntroductionConventional Neural Nets such as the multilayer feed forward Back-Propagation nets [l] are principally used in t h e r6le of pattern classifiers. We may describe this as the problem of assigning a catcgory to a new point in n-space, given a data set consisting of some sets of points of known category in t h e same space. T h e r e a r e statistical methods of tackling this problem which may be superior to neural net methods in particular cases, and there are hybrids such as thc probabilistic neural nets of Specht [2]. T h e chicf advantage of piecewise affine neural nets is that t h e y a r e relatively quick t o evaluate a new datum and assign it a category, and their disadvantage is that if the data set has any complexity of structure then training may take very long timcs (SCC Fahlman [3]). Probabilistic neural nets are somewhat implausible as models of real neurons but have many merits, being simple to train and reasonably fast in evaluation. In effect. they model the data so that t h e points of any one category are supposed to arisc from a numbcr of sphcrical gaussian distributions. There is no estimation of the centres of the gaussians howcver, and thc hypothesis of sphericity may mean that a large number of gaussians is required. I n t h i s paper wc consider quadratic neural nets. They may b e regarded a s being similar to Spccht's probabilistic neural nets in that they can implement a model of the data as some number of categories each of which is modelled as a mixture of gaussian (normal) distributions. Our net is adaptive in a way which is more similar to traditional neural nets than Specht's and whereas Spccht's nets havc only spherical distributions, we allow the full covariance matrix as well as the ccntre to be learnt. We use dynamical rather than statistical methods for adapting the state of the n c t since this is morc general, and have found that the results are not too diflcrent from, and frequently superior to, sequential gaussian mixture modelling algorithms. A comparison of the two will appear elsewhere. CH 3065-0/91xxxx1-1943 91.00 OIEEE
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.