Automated Construction of Classifications: Conceptual Clustering Versus Numerical Taxonomy

Michalski, Ryszard S.; Stepp, Robert E.

doi:10.1109/tpami.1983.4767409

Cited by 232 publications

(73 citation statements)

References 3 publications

“…A variety of distance measures are in use in the various communities [3 4 5]. A simple distance measure like the Euclidean distance can often be used to reflect dissimilarity between two patterns, whereas other similarity measures can be used to characterize the conceptual similarity between patterns [6].…”

Section: B Components Of Clustering Task (mentioning)

confidence: 99%

State of Art of Different Clustering Approaches

Rawlani, Natekar, Pathak et al. 2015

International Journal of Advanced Research in Computer and Comm

This paper presents an overview of clustering and the clustering methods used in data mining. First, the measures used to determine whether two clusters are similar or dissimilar are defined. Then the clustering methods are presented, divided into hierarchical, partitional, and evolutionary algorithms. Finally, clustering on large data sets is addressed and the associated challenges are discussed.
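The citation statement above names the Euclidean distance as a simple measure of dissimilarity between two patterns. A minimal sketch of that measure follows, assuming each pattern is a plain list of numeric feature values. The function name `euclidean_distance` and the sample patterns are illustrative assumptions and are not taken from the cited paper.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric patterns.

    Hypothetical helper for illustration; a larger value means the
    two patterns are more dissimilar.
    """
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of features")
    # Sum the squared per-feature differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two made-up patterns with three numeric features each.
p = [1.0, 2.0, 3.0]
q = [4.0, 6.0, 3.0]
print(euclidean_distance(p, q))  # 5.0
```

A measure like this compares patterns only through their numeric features. It does not capture the conceptual similarity that the same statement contrasts it with.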

“…Another possibility would be to use conceptual clustering techniques [13], which inherently focus on descriptions of the clusters. However, conceptual clustering techniques are rather slow, while recent advances have made metric clustering techniques applicable to very large data sets [27,7].…”

Section: Related Work (mentioning)

confidence: 99%

An Automated Report Generation Tool for the Data Understanding Phase

Vesanto, Hollmén 2004

Innovations in Intelligent Systems

To successfully prepare and model data, the data miner needs to be aware of the properties of the data manifold. In this chapter, the outline of a tool for automatically generating data survey reports for this purpose is described. Such a report is used as a starting point for data understanding, acts as documentation of the data, and can easily be redone if necessary. The main focus is on describing the cluster structure and the contents of the clusters. The described system combines linguistic descriptions (rules) and statistical measures with visualizations. Whereas rules and mathematical measures give quantitative information, the visualizations give qualitative information on the data sets, and help the user to form a mental model of the data based on the suggested rules and other characterizations.

“…Our work on UNIMEM and generalization-based memory is closely related to Michalski and Stepp's (1983) research on conceptual clustering, which they developed independently at about the same time.26 This approach also accepts feature-based instances as input and generates (from the top down) a hierarchical set of concept descriptions that summarizes them. However, the underlying mechanism is quite different from the one used by UNIMEM.…”

Section: Relation To Other Work (mentioning)

confidence: 99%

“…We constantly receive new examples and the world is not perfectly regular. The task of UNIMEM is basically that of conceptual clustering as presented by Michalski and Stepp (1983) and Fisher and Langley (1985), but our work also draws upon research in learning from examples (e.g., Winston, 1975; Mitchell, 1982; Dietterich & Michalski, 1986). However, in a learning by observation setting, one must consider not just how to compare examples, but also decide which examples to compare.…”

Section: Introduction (mentioning)

confidence: 99%

Untitled

Lebowitz

1987

Machine Learning

104

Abstract. Learning by observation involves automatic creation of categories that summarize experience. In this paper we present UNIMEM, an artificial intelligence system that learns by observation. UNIMEM is a robust program that can be run on many domains with real-world problem characteristics such as uncertainty, incompleteness, and large numbers of examples. We give an overview of the program that illustrates several key elements, including the automatic creation of non-disjoint concept hierarchies that are evaluated over time. We then describe several experiments that we have carried out with UNIMEM, including tests on different domains (universities, Congressional voting records, and terrorist events) and an examination of the effect of varying UNIMEM's parameters on the resulting concept hierarchies. Finally we discuss future directions for our work with the program.

Introduction

Learning from observation is a task that is important in domains where examples are not pre-classified, but where one still wishes to detect general rules and intelligently organize examples. In this paper we discuss UNIMEM, a system that learns from observation by noticing regularities among examples and organizing them into a generalization hierarchy. We view UNIMEM both as implementing an algorithm for concept formation and as a prototype intelligent information system that can incorporate large amounts of data into memory and retrieve appropriate information in response to user queries. UNIMEM is not intended to be a psychological model per se, since it deals with a task more data-intensive than people are likely to perform. However, in developing the program we have made use of techniques derived by observing human behavior.

The task of UNIMEM is to take a series of examples (or instances) that are expressed as collections of features and build up a generalization hierarchy of concepts. For example, UNIMEM might use information about a collection of universities to inductively determine the concepts of Ivy League universities, European technical universities, and so forth, and determine which examples are described by which concepts. The point of creating such concept descriptions is that they allow a performance element using the output of the program to make inferences about new examples based on partial information.

Successful learning from real-world input must deal with several constraints. The key features that characterize the operation of UNIMEM are:

• It learns by observation; it is not explicitly told how examples should be grouped into categories;
• It is incremental; output must be available after processing each example; it cannot wait for all the input;
• It must handle examples in large numbers (currently hundreds, eventually more);
• Its generalizations are pragmatic; they need not perfectly describe all the instances they cover.

Although certain learning systems have dealt with tasks having some of these characteristics, little work has been concerned with all of them. However, all seem to characte...
