Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, and expert systems.
In this paper, we study the use of unlabeled samples to reduce the small-training-sample-size problem, which can severely degrade the recognition rate of classifiers when the dimensionality of the multispectral data is high. We show that by using additional unlabeled samples, which are available at no extra cost, performance may be improved and the Hughes phenomenon thereby mitigated. Furthermore, we show experimentally that additional unlabeled samples yield more representative parameter estimates. We also propose a semiparametric method for incorporating the training (i.e., labeled) and unlabeled samples simultaneously into the parameter estimation process.
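The idea of combining labeled and unlabeled samples in parameter estimation can be illustrated with a small sketch. The following is not the paper's exact algorithm, but a common realization of the same principle: an EM-style mixture estimator in which labeled samples keep fixed class memberships while unlabeled samples contribute soft memberships. The two-class one-dimensional Gaussian setting, the sample sizes, and the iteration count are all illustrative assumptions.

```python
# Sketch (not the authors' exact method): EM estimation of a two-class
# 1-D Gaussian mixture using both labeled and unlabeled samples.
import math
import random

random.seed(0)

# Synthetic data: class 0 ~ N(0, 1), class 1 ~ N(4, 1).
labeled = [(random.gauss(0, 1), 0) for _ in range(5)] + \
          [(random.gauss(4, 1), 1) for _ in range(5)]
unlabeled = [random.gauss(0, 1) for _ in range(100)] + \
            [random.gauss(4, 1) for _ in range(100)]

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Initialize parameters from the (few) labeled samples alone.
mu = [sum(x for x, y in labeled if y == k) / 5 for k in (0, 1)]
var = [1.0, 1.0]
pi = [0.5, 0.5]

for _ in range(30):
    # E-step: labeled points have hard (fixed) memberships,
    # unlabeled points get posterior (soft) memberships.
    resp = []
    for x, y in labeled:
        resp.append((x, [1.0 if k == y else 0.0 for k in (0, 1)]))
    for x in unlabeled:
        p = [pi[k] * normal_pdf(x, mu[k], var[k]) for k in (0, 1)]
        s = sum(p)
        resp.append((x, [pk / s for pk in p]))
    # M-step: re-estimate priors, means, variances from all samples.
    for k in (0, 1):
        nk = sum(r[k] for _, r in resp)
        pi[k] = nk / len(resp)
        mu[k] = sum(x * r[k] for x, r in resp) / nk
        var[k] = sum(r[k] * (x - mu[k]) ** 2 for x, r in resp) / nk
```

Because the unlabeled samples dominate the M-step sums, the final means and variances are driven by the full mixture rather than by the ten labeled points alone, which is the sense in which the estimates become "more representative."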
As the number of spectral bands of high spectral resolution data increases, the capability to detect more detailed classes should also increase, and the classification accuracy should increase as well. Often the number of labeled samples available for supervised classification techniques is limited, which in turn limits the precision with which class characteristics can be estimated. As the number of spectral bands becomes large, the limitation on performance imposed by the limited number of training samples can become severe. A number of techniques for case-specific feature extraction have been developed to reduce dimensionality without loss of class separability. Most of these techniques require the estimation of statistics at full dimensionality in order to extract relevant features for classification. If the number of training samples is not adequately large, the estimation of parameters in high-dimensional data will not be accurate enough. As a result, the estimated features may not be as effective as they could be. This suggests the need for reducing the dimensionality via a preprocessing method that takes into consideration high-dimensional feature space properties. Such a reduction should enable the feature extraction parameters to be estimated more accurately. Using a technique referred to as Projection Pursuit, such an algorithm has been developed. This technique bypasses many of the problems caused by small numbers of training samples by performing the computations in a lower-dimensional space and optimizing a function called the projection index. A current limitation of this method is that as the number of dimensions increases, it becomes highly probable that the optimization finds only a local maximum of the projection index.¹

¹ Work reported herein was funded in part by NASA Grant NAGW-3924.
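The core mechanism described above, choosing a low-dimensional projection by optimizing a projection index, can be sketched in a few lines. This is only an illustrative stand-in: the two-class 2-D data, the Fisher-ratio index, and the grid search over projection directions are assumptions, not the paper's actual algorithm or index.

```python
# Sketch of the projection-pursuit idea: search for the 1-D projection
# of 2-D data that maximizes a projection index (here, a Fisher ratio).
import math
import random

random.seed(1)

# Two synthetic classes separated along the first axis (illustrative).
class0 = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
class1 = [(random.gauss(3, 1), random.gauss(0, 1)) for _ in range(100)]

def fisher_index(theta):
    """Projection index: between-class over within-class scatter of the
    data projected onto the unit direction (cos(theta), sin(theta))."""
    a, b = math.cos(theta), math.sin(theta)
    p0 = [a * x + b * y for x, y in class0]
    p1 = [a * x + b * y for x, y in class1]
    m0, m1 = sum(p0) / len(p0), sum(p1) / len(p1)
    v0 = sum((p - m0) ** 2 for p in p0) / len(p0)
    v1 = sum((p - m1) ** 2 for p in p1) / len(p1)
    return (m0 - m1) ** 2 / (v0 + v1)

# Coarse grid search over directions; real projection-pursuit optimizers
# use iterative methods, which is where the local-maximum problem arises.
best_theta = max((i * math.pi / 180 for i in range(180)), key=fisher_index)
```

A one-degree grid in two dimensions finds the good direction easily; the text's caveat is that as dimensionality grows, exhaustive search becomes infeasible and iterative optimizers of the projection index tend to get trapped in local maxima.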