We propose a novel framework for generating classification rules from relational data. It is a specialized version of a general framework for mining relational data and is defined in granular computing theory. Within the proposed framework we define a method for deriving information granules from relational data; these granules are the basis for generating relational classification rules. Our approach follows the granular computing idea of switching between different levels of granularity of the universe. As a result, one granule-based representation of relational data can easily be replaced by another and thereby adjusted to a given data mining task, e.g., classification. A generalized relational data representation, as defined in the framework, can be treated as the search space for generating rules, so the size of the search space can be limited significantly. Furthermore, unlike other frameworks, ours unifies not only the way the data and the rules to be derived are expressed and specified, but also, in part, the process of generating rules from the data: the rules can be obtained directly from the information granules or constructed based on them.
One popular way to develop an algorithm for mining data stored in a relational structure is to upgrade an existing attribute-value algorithm to the relational case. Current approaches to this problem have several shortcomings: (1) the upgrading process depends on the algorithm to be extended, (2) crucial notions (e.g., pattern generality, pattern refinement) require complicated redefinitions, and (3) the limitation of the search space for pattern discovery is tolerant. In this paper, we propose and evaluate a general methodology for upgrading a data mining framework to the relational case. This methodology is defined in a granular computing environment. Thanks to our relational extension of a granular computing based data mining framework, the three problems above can be overcome. © 2014 Wiley Periodicals, Inc.

In the following sections, we review other research works related to the problem addressed in this paper (Section 2), introduce a granular computing based data mining framework (Section 3), develop (Section 4) and evaluate (Section 5) a general methodology for mining relational data based on that framework, and finally provide concluding remarks (Section 6).

RELATED WORKS

Many of the first algorithms for mining relational data were developed from algorithms devoted to propositional data. The task of upgrading a standard data mining algorithm to the relational case is not trivial and requires much attention. An upgraded algorithm should preserve as many features of the original algorithm as possible. In other words, only crucial notions, for example, the representation of data and patterns, are upgraded. Furthermore, the original algorithm should be a special case of its relational counterpart, that is, both should produce the same results for identical propositional data.
Comparing the information systems described above with the ones employed in this paper, one can state that the restricted compound information system introduced in this paper is most similar to the constrained sum of information systems. On the one hand, our system can be treated as a special case of the other system because the constraint, which is defined generally in the latter, can be formed by using formulas that join tables. On the other hand, our system is constructed in a different way because the constraint is not imposed directly, that is, left outer joins on the formulas are used.

One can conclude that the existing approaches propose useful information systems for dealing with data stored in a relational structure; however, they do not study extensively, or at all, the problem of constructing information granules for relational data using an (extended) attribute-value language.

GRANULAR COMPUTING BASED DATA MINING FRAMEWORK

In this section, we introduce a granular computing based data mining framework (Ref. 7), which is constructed based on definitions from Refs. 18 and 20.

$SEM_{IS_{i \wedge j}}(\alpha_1 \wedge \alpha_2) = SEM_{IS_{i \wedge j}}(\alpha_1) \cap SEM_{IS_{i \wedge j}}(\alpha_2)$;
4. $\alpha_1, \alpha_2 \in L_{IS_{i \wedge j}} \Rightarrow \alpha_1 \vee \alpha_2 \in L_{IS_{i \wedge j}}$, $SEM_{IS_{i \wedge j}}(\alpha_1 \vee \alpha_2) = SEM_{IS_{i \wedge j}}(\alpha_1) \cup SEM_{IS_{i \wedge j}}$ ...
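The semantics clauses quoted above map a conjunction of formulas to the intersection of their meanings and a disjunction to the union. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the paper's construction: the toy table `TABLE`, the tuple encoding of formulas, the function name `sem`, and the assumption that an atomic formula is an (attribute, value) descriptor are all hypothetical and do not come from the source.

```python
from typing import Any, Dict, Set, Tuple

# Hypothetical toy information system: object id -> {attribute: value}.
# The attributes and values are invented for illustration only.
TABLE: Dict[int, Dict[str, str]] = {
    1: {"age": "young", "income": "high"},
    2: {"age": "young", "income": "low"},
    3: {"age": "old", "income": "high"},
}

# Assumed encoding of formulas as nested tuples:
#   ("atom", attribute, value), ("and", f1, f2), ("or", f1, f2)
Formula = Tuple[Any, ...]


def sem(formula: Formula, table: Dict[int, Dict[str, str]]) -> Set[int]:
    """Return the set of objects of `table` that satisfy `formula`."""
    op = formula[0]
    if op == "atom":
        # Assumption: an atomic formula is satisfied by every object whose
        # attribute takes the given value.
        _, attr, value = formula
        return {obj for obj, row in table.items() if row.get(attr) == value}
    if op == "and":
        # Conjunction: intersection of the meanings of both subformulas.
        return sem(formula[1], table) & sem(formula[2], table)
    if op == "or":
        # Disjunction: union of the meanings of both subformulas.
        return sem(formula[1], table) | sem(formula[2], table)
    raise ValueError(f"unknown formula constructor: {op!r}")


young = ("atom", "age", "young")
high = ("atom", "income", "high")

young_and_high = sem(("and", young, high), TABLE)
young_or_high = sem(("or", young, high), TABLE)
```

In this sketch the meaning of a formula plays the role of a granule: a set of objects that can be narrowed with a conjunction or widened with a disjunction. The language and semantics defined in the paper may differ in detail from this simplification.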