Abstract: This paper describes a general fuzzy min-max (GFMM) neural network, which is a generalization and extension of the fuzzy min-max clustering and classification algorithms developed by Simpson. The GFMM method combines supervised and unsupervised learning within a single training algorithm. The fusion of clustering and classification results in an algorithm that can be used for pure clustering, pure classification, or hybrid clustering-classification. This hybrid system exhibits the interesting property of finding decision boundaries between classes while clustering patterns that cannot be said to belong to any of the existing classes. As in the original algorithms, hyperbox fuzzy sets are used to represent clusters and classes. Learning is usually completed in a few passes through the data and consists of placing and adjusting the hyperboxes in the pattern space, which is referred to as an expansion-contraction process. The classification results can be crisp or fuzzy. New data can be included without the need for retraining. While retaining all the interesting features of the original algorithms, a number of modifications to their definition have been made in order to accommodate fuzzy input patterns in the form of lower and upper bounds, combine supervised and unsupervised learning, and improve the effectiveness of operations. A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
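The hyperbox representation and the expansion step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract gives no formulas, so the ramp-based membership function, the sensitivity parameter `gamma`, and the per-dimension size limit `theta` are assumptions based on the commonly cited GFMM formulation; the names `Hyperbox`, `ramp`, and `try_expand` are hypothetical, and the contraction (overlap resolution) step is omitted.

```python
from dataclasses import dataclass
from typing import List


def ramp(r: float, gamma: float) -> float:
    """Ramp threshold function: r * gamma clamped to [0, 1] (assumed form)."""
    return min(1.0, max(0.0, r * gamma))


@dataclass
class Hyperbox:
    v: List[float]  # min point of the hyperbox, one value per dimension
    w: List[float]  # max point of the hyperbox, one value per dimension

    def membership(self, xl: List[float], xu: List[float], gamma: float = 1.0) -> float:
        """Degree to which the interval input [xl, xu] lies inside the box.

        Returns 1.0 when the input is fully contained and decreases as the
        input's upper bound exceeds w or its lower bound falls below v.
        """
        return min(
            min(1.0 - ramp(xu_i - w_i, gamma), 1.0 - ramp(v_i - xl_i, gamma))
            for v_i, w_i, xl_i, xu_i in zip(self.v, self.w, xl, xu)
        )

    def try_expand(self, xl: List[float], xu: List[float], theta: float) -> bool:
        """Expand the box to include [xl, xu] if no side would exceed theta."""
        new_v = [min(v_i, xl_i) for v_i, xl_i in zip(self.v, xl)]
        new_w = [max(w_i, xu_i) for w_i, xu_i in zip(self.w, xu)]
        if all(nw - nv <= theta for nv, nw in zip(new_v, new_w)):
            self.v, self.w = new_v, new_w
            return True
        return False  # box left unchanged; a new hyperbox would be created instead
```

A crisp input point is passed with identical lower and upper bounds, for example `box.try_expand([0.5, 0.3], [0.5, 0.3], theta=0.35)`.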
The study is devoted to a granular analysis of data. We develop a new clustering algorithm that organizes findings about data in the form of a collection of information granules (hyperboxes). The clustering carried out here is an example of a granulation mechanism. We discuss a compatibility measure that guides the construction (growth) of the clusters and explain the rationale behind their development. The clustering promotes a data mining way of problem solving by emphasizing the transparency of the results (hyperboxes). We discuss a number of indexes describing hyperboxes and expressing relationships between such information granules. It is also shown how the resulting family of information granules is a concise descriptor of the structure of the data, that is, a granular signature of the data. We examine the properties of the features (variables) occurring in the problem as they manifest in the setting of the information granules. Numerical experiments are carried out on two-dimensional (2-D) synthetic data as well as the multivariable Boston data available on the WWW.
Clustering forms one of the most visible conceptual and algorithmic frameworks for developing information granules. Regardless of the algorithm used, the representation of information granules (clusters) is predominantly numeric (coming in the form of prototypes, partition matrices, dendrograms, etc.). In this paper, we consider a concept of granular prototypes that generalizes the numeric representation of the clusters and, in this way, helps capture more details about the data structure. By invoking the granulation-degranulation scheme, we design granular prototypes that reflect the structure of the data to a higher extent than the representation provided by their numeric counterparts (prototypes). The design is formulated as an optimization problem guided by the coverage criterion, meaning that we maximize the number of data for which their granular realization includes the original data. The granularity of the prototypes themselves is treated as an important design asset; hence, its allocation to the individual prototypes is optimized so that the coverage criterion is maximized. In this regard, several schemes of optimal allocation of information granularity are investigated, where interval-valued prototypes are formed around the already produced numeric representatives. Experimental studies are provided in which the design of granular prototypes of interval format is discussed and characterized.
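To make the coverage criterion described above concrete, the following is a rough sketch under stated assumptions, not the authors' procedure. It assumes FCM-style membership degrees with fuzzifier `m`, the usual membership-weighted reconstruction, and symmetric interval prototypes `[v_i - eps_i, v_i + eps_i]`; the function names and the use of a fraction rather than a count are illustrative choices.

```python
from typing import List, Sequence, Tuple


def fcm_memberships(x: Sequence[float], prototypes: List[Sequence[float]],
                    m: float = 2.0) -> List[float]:
    """FCM-style membership degrees of x to each numeric prototype."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in prototypes]
    # If x coincides with a prototype, give that prototype full membership.
    for i, d in enumerate(d2):
        if d == 0.0:
            return [1.0 if k == i else 0.0 for k in range(len(prototypes))]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [w / total for w in inv]


def granular_reconstruction(x: Sequence[float], prototypes: List[Sequence[float]],
                            eps: Sequence[float], m: float = 2.0
                            ) -> Tuple[List[float], List[float]]:
    """Interval reconstruction of x from interval prototypes v_i +/- eps_i."""
    weights = [u ** m for u in fcm_memberships(x, prototypes, m)]
    total = sum(weights)
    dims = range(len(x))
    lower = [sum(w * (v[j] - e) for w, v, e in zip(weights, prototypes, eps)) / total
             for j in dims]
    upper = [sum(w * (v[j] + e) for w, v, e in zip(weights, prototypes, eps)) / total
             for j in dims]
    return lower, upper


def coverage(data: List[Sequence[float]], prototypes: List[Sequence[float]],
             eps: Sequence[float], m: float = 2.0) -> float:
    """Fraction of data points that fall inside their interval reconstruction."""
    covered = 0
    for x in data:
        lo, hi = granular_reconstruction(x, prototypes, eps, m)
        if all(l <= xj <= h for l, xj, h in zip(lo, x, hi)):
            covered += 1
    return covered / len(data)
```

Allocating granularity then amounts to searching over `eps` (for instance under a fixed total budget) for the assignment that yields the highest `coverage`; the search strategy itself is left open here.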
This paper contributes to the conceptual and algorithmic framework of information granulation. We revisit the role of information granules that are relevant to several main classes of technical pursuits involving temporal and spatial granulation. A detailed algorithm of information granulation, regarded as an optimization problem reconciling two conflicting design criteria, namely, a specificity of information granules and their experimental relevance (coverage of numeric data), is provided in the paper. The resulting information granules are formalized in the language of set theory (interval analysis). The uniform treatment of data points and data intervals (sets) allows for a recursive application of the algorithm. We assess the quality of information granules through application of the fuzzy c-means (FCM) clustering algorithm. Numerical studies deal with two-dimensional (2D) synthetic data and experimental traffic data.
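The trade-off between specificity and coverage described above can be sketched for one-dimensional data as follows. The abstract does not state how the two criteria are combined, so the product `coverage * specificity`, the linear specificity measure, and the exhaustive search over data-point endpoints are all assumptions made for illustration; `granulate_interval` is a hypothetical name.

```python
from typing import List, Tuple


def granulate_interval(data: List[float]) -> Tuple[float, float]:
    """Pick the interval [a, b] that best balances coverage and specificity.

    coverage    = share of data points inside [a, b]
    specificity = 1 - (b - a) / (max(data) - min(data))   (assumed measure)
    The score is their product (an assumed way of reconciling the criteria).
    """
    if not data:
        raise ValueError("data must be non-empty")
    xs = sorted(data)
    n = len(xs)
    span = xs[-1] - xs[0]
    if span == 0.0:
        # All points coincide, so a degenerate interval covers everything.
        return xs[0], xs[0]

    best_score = -1.0
    best = (xs[0], xs[0])
    # Candidate endpoints are restricted to the data points themselves.
    for i in range(n):
        for j in range(i, n):
            cov = (j - i + 1) / n                    # points xs[i..j] are inside
            spec = 1.0 - (xs[j] - xs[i]) / span      # narrower means more specific
            score = cov * spec
            if score > best_score:
                best_score, best = score, (xs[i], xs[j])
    return best
```

Because the result is itself an interval, the same idea could in principle be reapplied to collections of intervals, which is the recursive use the abstract alludes to; that extension is not shown.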