Cuban Schools for children with Affective-Behavioral Maladies (SABM) aim to accomplish a major change in children's behavior, so that the children can be inserted effectively into society. One of the key elements of this objective is giving adequate orientation to the children's families, because the family is one of the most important educational contexts in which children develop their personality. The family orientation process in SABM involves clustering and classification of mixed-type data with non-symmetric similarity functions. To improve this process, this paper introduces some novel characteristics in clustering and prototype selection. The proposed approach uses a hierarchical clustering based on compact sets, making it suitable for dealing with non-symmetric similarity functions, as well as with mixed and incomplete data. The proposal obtains very good results on the SABM data and on repository databases.
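As a rough illustration of the idea of compact sets, the following minimal sketch groups objects by linking each one to its most similar neighbor(s) and taking the connected components of the resulting graph. It is not the paper's algorithm: it shows a single grouping level rather than a full hierarchy, and the similarity function `sim`, the tolerance `tol`, and the sample `rows` are hypothetical choices made only to show a non-symmetric similarity over mixed, incomplete data (`None` marks a missing value).

```python
from typing import Callable, List, Optional, Sequence, Tuple

Row = Tuple[Optional[object], ...]


def sim(x: Row, y: Row, tol: float = 1.0) -> float:
    """Hypothetical non-symmetric similarity: the fraction of x's known
    attributes that y matches. Missing values in x shrink the denominator,
    so sim(x, y) and sim(y, x) can differ."""
    known = [(a, b) for a, b in zip(x, y) if a is not None]
    if not known:
        return 0.0
    hits = 0
    for a, b in known:
        if b is None:
            continue  # a value missing in y never counts as a match
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            hits += abs(a - b) <= tol  # numeric attributes: within tolerance
        else:
            hits += a == b  # categorical attributes: exact equality
    return hits / len(known)


def compact_sets(objects: Sequence[Row],
                 similarity: Callable[[Row, Row], float]) -> List[List[int]]:
    """Link every object to its most similar object(s), ignore edge
    direction, and return the connected components as lists of indices."""
    n = len(objects)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    for i in range(n):
        scores = [(similarity(objects[i], objects[j]), j)
                  for j in range(n) if j != i]
        if not scores:
            continue
        best = max(s for s, _ in scores)
        for s, j in scores:
            if s == best:  # ties all count as most similar neighbors
                union(i, j)

    groups: dict = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical mixed, incomplete records: (area, family size, risk level)
rows = [
    ("urban", 2, "high"),
    ("urban", 3, None),
    ("rural", 7, "low"),
    ("rural", 8, "low"),
    (None, 7, "low"),
]

print(compact_sets(rows, sim))
```

Because each object only needs to know which other object it finds most similar, the grouping does not require the similarity function to be symmetric, which is the property the abstract highlights.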
Purpose
The purpose of this paper is to improve the classification of families having children with affective-behavioral maladies, and thus to give the families a suitable orientation.
Design/methodology/approach
The proposed methodology includes three steps. Step 1 addresses initial data preprocessing, by noise filtering or data condensation. Step 2 selects multiple feature sets, using genetic algorithms and rough sets. Finally, Step 3 merges the candidate solutions and obtains the selected features and instances.
Findings
The new proposal shows very good results on the family data (100 percent correct classifications). It also obtained accurate results over a variety of repository data sets. The proposed approach is suitable for dealing with non-symmetric similarity functions, as well as with high-dimensionality mixed and incomplete data.
Originality/value
Previous work in the state of the art considers only instance selection to preprocess the data from schools for children with affective-behavioral maladies. This paper explores a new combined instance and feature selection technique to select relevant instances and features, leading to better classification and to a simplification of the data.
In this paper, we introduce a new experimentation module for the recently developed EPIC software. EPIC is a tool for applying computational intelligence algorithms. The main advantages of our proposal are the direct handling of mixed and incomplete data, the inclusion of several algorithms within the associative approach, and a very user-friendly graphical interface.