Abstract: Five methods that generate multiple prototypes from labeled data are reviewed. We then introduce a new sixth approach, which is a modification of Chang's method. We compare the six methods with two standard classifier designs: the 1-nearest prototype (1-np) and 1-nearest neighbor (1-nn) rules. The standard of comparison is the resubstitution error rate; the data used are the Iris data. Our modified Chang's method produces the best consistent (zero errors) design. One of the competitive learning models produces the best minimal-prototypes design (five prototypes that yield three resubstitution errors).

Index Terms: Competitive learning, Iris data, modified Chang's method (MCA), multiple prototypes, nearest neighbor (1-nn) rule.
We present the theory and some results of a new algorithm for artificial neural nets which behaves well on complex data sets. The algorithm uses adaptive quadratic forms as discriminant functions and is very fast compared with Back Propagation; improvements of four orders of magnitude have been obtained.

0 Introduction

Conventional neural nets such as the multilayer feed-forward Back-Propagation nets [1] are principally used in the role of pattern classifiers. We may describe this as the problem of assigning a category to a new point in n-space, given a data set consisting of some sets of points of known category in the same space. There are statistical methods of tackling this problem which may be superior to neural net methods in particular cases, and there are hybrids such as the probabilistic neural nets of Specht [2]. The chief advantage of piecewise affine neural nets is that they are relatively quick to evaluate a new datum and assign it a category; their disadvantage is that if the data set has any complexity of structure, training may take very long times (see Fahlman [3]). Probabilistic neural nets are somewhat implausible as models of real neurons but have many merits, being simple to train and reasonably fast in evaluation. In effect, they model the data so that the points of any one category are supposed to arise from a number of spherical Gaussian distributions. There is no estimation of the centres of the Gaussians, however, and the hypothesis of sphericity may mean that a large number of Gaussians is required. In this paper we consider quadratic neural nets. They may be regarded as being similar to Specht's probabilistic neural nets in that they can implement a model of the data as some number of categories, each of which is modelled as a mixture of Gaussian (normal) distributions. Our net is adaptive in a way which is more similar to traditional neural nets than Specht's, and whereas Specht's nets have only spherical distributions, we allow the full covariance matrix as well as the centre to be learnt. We use dynamical rather than statistical methods for adapting the state of the net, since this is more general, and have found that the results are not too different from, and frequently superior to, sequential Gaussian mixture modelling algorithms. A comparison of the two will appear elsewhere.
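As a concrete illustration of a quadratic discriminant of the kind described above, the following minimal Python sketch fits one Gaussian (a centre plus a full covariance matrix) per category in closed form and classifies a new point by the largest discriminant value. This is a statistical stand-in only: the paper adapts its quadratic forms dynamically rather than estimating them in one pass, and all function names here are our own illustrative choices.

```python
import numpy as np

def fit_quadratic_discriminants(X, y):
    """Fit one Gaussian (centre + full covariance) per class label in y.

    A closed-form, statistical stand-in for the adaptive quadratic forms
    described above; the paper itself adapts these parameters dynamically.
    """
    models = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        # Small ridge keeps the covariance invertible on sparse classes.
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        models[c] = (mu, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])
    return models

def classify(models, z):
    """Assign z to the class whose quadratic discriminant is largest."""
    def score(m):
        mu, inv_cov, logdet = m
        d = z - mu
        # Gaussian log-likelihood up to an additive constant.
        return -d @ inv_cov @ d - logdet
    return max(models, key=lambda c: score(models[c]))
```

Because the full covariance is used, the decision boundaries are general quadrics rather than the spheres implied by Specht-style nets; that is the modelling advantage the passage above is pointing at.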
There are four types of class labels: crisp, fuzzy, probabilistic, and possibilistic. Let the integer c denote the number of classes, 1 < c < n, and define three sets of label vectors in $\Re^c$ as follows: […]

If $X_{tr}$ is large enough and its substructure is well delineated, we expect classifiers trained with it to yield small error rates. On the other hand, when the training data are large in dimension p and/or number n, classifiers such as the k-nearest neighbor ($D_{k\text{-}nn}$) rule [5, 6] can require too much storage and CPU time for efficient deployment. Here we discuss six ways to replace $X_{tr}$ with a set of prototypes V that can be used as a substitute for $X_{tr}$ (e.g., in the nearest neighbor rule) without appreciable degradation in $E_{D_{k\text{-}nn}}(X_{te} \mid X_{tr})$. In this case $D_{k\text{-}nn}$ becomes a nearest prototype design with error rate $E_D(X_{te} \mid V)$.

Nearest prototype classifiers

Once the prototypes V are found (and possibly relabeled if the data have physical labels), they can be used to define a crisp nearest prototype (1-np) classifier, say $D_{V,\delta}$.

The nearest prototype (1-np) classifier. Given any c prototypes $V = \{v_j \in \Re^p : 1 \le j \le c\}$, one $v_j$ per class, and any dissimilarity measure $\delta$ on $\Re^p$, for any $z \in \Re^p$:

$$D_{V,\delta}(z) = e_i \iff \delta(z, v_i) \le \delta(z, v_j)\ \ \forall j \ne i. \tag{3}$$

Ties in (3) are arbitrarily resolved. The crisp 1-np design can be implemented using prototypes from any algorithm that produces them. Equation (3) defines a crisp classifier, even when V comes from a fuzzy, probabilistic, or possibilistic algorithm. When one or more classes are represented by multiple prototypes, there are two ways to extend the 1-np design. We can simply use equation (3), recognizing that V contains more than one prototype for at least one of the c classes. Or we can extend the 1-np design to a k-np rule, wherein the k nearest prototypes conduct a vote about the label that should be assigned to input z. This amounts to operating the k-nn rule using prototypes (points built from the data) instead of neighbors (points in the data). We opt here for the simpler choice, which is formalized as follows.

The nearest multiple prototype (1-nmp) classifier. Given any $N_p$ prototypes $V = \{v_{ij} \in \Re^p : 1 \le i \le c;\ 1 \le j \le n_{p_i}\}$, where $n_{p_i}$ is the number of prototypes for class i and $N_p = \sum_i n_{p_i}$, and any dissimilarity measure $\delta$ on $\Re^p$, for any $z \in \Re^p$:

$$D_{V,\delta}(z) = e_i \iff \delta(z, v_{ij}) \le \delta(z, v_{kl})\ \ \forall k, l. \tag{3'}$$

As in (3), ties in (3') are resolved arbitrarily. We use the same notation for the 1-np and 1-nmp classifiers, relying on context to identify which one is being discussed. Now we are ready to turn to methods for finding multiple prototypes.

Three sequential learning models update the estimates $\{v_i\}$ at iterate (t−1) to new values at iterate t upon presentation of an $x_k$ from X (one iteration is one pass through X), using the general form, for i = 1, 2, ..., c:

$$v_{i,t} = v_{i,t-1} + \alpha_{ik,t}\,(x_k - v_{i,t-1}). \tag{4}$$

In (4), $\{\alpha_{ik,t}\}$ is the learning rate distribution over the c nodes for input $x_k$ at iterate t. The principal difference between the various competitive learning models lies in how this distribution is defined. The DR user must specify an initial distribution for the $\{\alpha_{ik,1}\}$ and four constants: a rate of...
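To make (3') and (4) concrete, here is a minimal Python sketch of the 1-nmp rule and of the general sequential update, assuming Euclidean distance for $\delta$; the function names and array layout are our own, not notation from the paper.

```python
import numpy as np

def one_nmp(z, prototypes, labels):
    """1-nmp rule (3'): assign z the class label of its nearest prototype.

    prototypes: (Np, p) array, possibly several rows per class.
    labels:     (Np,) class label of each prototype row.
    Ties are broken by argmin's first-match convention (arbitrary, as in the text).
    """
    d = np.linalg.norm(prototypes - z, axis=1)  # Euclidean delta; any dissimilarity works
    return labels[np.argmin(d)]

def sequential_update(v, x, alpha):
    """General form (4): v_{i,t} = v_{i,t-1} + alpha_{ik,t} * (x_k - v_{i,t-1}).

    v:     (c, p) current prototypes.
    x:     (p,)  the presented input x_k.
    alpha: (c,)  learning rate distribution over the c nodes for this input.
    """
    return v + alpha[:, None] * (x - v)
```

Setting alpha to a one-hot vector at the winning node, e.g. `alpha = rate * (d == d.min())` for the distance vector d, recovers plain winner-take-all competitive learning; the models reviewed here differ precisely in how they spread the learning rates over the nodes and shrink them over time.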
We describe an automatic target recognition system for detecting targets in temporal sequences of intensity LADAR images. The system first finds all objects in the images using a method that finds blobs and curves. Then features of the objects are extracted. Next, fuzzy c-means (FCM) is used to cluster the objects. Finally, the FCM prototypes for each class of objects are relabeled with the training data and used to classify unknown objects with a nearest prototype classifier.
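As a rough sketch of the clustering and relabeling steps only (not the system's actual code), the following Python implements standard FCM with fuzzifier m = 2 and then relabels each prototype with the label of its nearest training object; the abstract does not specify the relabeling rule, so that choice is an assumption.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means: returns (c, p) prototypes V and (c, n) memberships U."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                                   # columns of U sum to 1
    for _ in range(iters):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)     # prototype update
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2).T + 1e-12
        w = d ** (-2.0 / (m - 1.0))
        U_new = w / w.sum(axis=0)                        # membership update
        if np.abs(U_new - U).max() < tol:
            return V, U_new
        U = U_new
    return V, U

def relabel(V, X_train, y_train):
    """Assumed relabeling rule: give each prototype its nearest training object's label."""
    d = np.linalg.norm(X_train[:, None, :] - V[None, :, :], axis=2)  # (n, c)
    return y_train[d.argmin(axis=0)]
```

An unknown object with feature vector z would then be assigned `proto_labels[np.argmin(np.linalg.norm(V - z, axis=1))]`, i.e. the label of its nearest relabeled prototype.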