One of the popular methods of developing an algorithm for mining data stored in a relational structure is to upgrade an existing attribute-value algorithm to a relational case. Current approaches to this problem have some shortcomings, such as (1) a dependence on the upgrading process of the algorithm to be extended, (2) complicated redefinitions of crucial notions (e.g., pattern generality, pattern refinement), and (3) a tolerant limitation of the search space for pattern discovery. In this paper, we propose and evaluate a general methodology for upgrading a data mining framework to a relational case. This methodology is defined in a granular computing environment. Thanks to our relational extension of a granular computing based data mining framework, the three problems listed above can be overcome.

© 2014 Wiley Periodicals, Inc.

In the following sections, we review other research works related to the problem addressed in this paper (Section 2), introduce a granular computing based data mining framework (Section 3) and, based on it, develop (Section 4) and evaluate (Section 5) a general methodology for mining relational data. Finally, we provide concluding remarks (Section 6).
RELATED WORKS

Many of the first algorithms for mining relational data were developed from algorithms devoted to propositional data. The task of upgrading a standard data mining algorithm to a relational case is not trivial and requires much attention. An upgraded algorithm should preserve as many features of the original algorithm as possible. In other words, only crucial notions, for example, data and pattern representation, are upgraded. Furthermore, the original algorithm should be a special case of its relational counterpart, that is, both should produce the same results for identical propositional data.

Comparing the above-described information systems with the ones employed in this paper, one can state that the restricted compound information system introduced in this paper is most similar to the constrained sum of information systems. On the one hand, our system can be treated as a special case of the other system because the constraint, which is generally defined in the latter, can be formed by using formulas that join tables. On the other hand, our system is constructed in a different way because the constraint is not imposed directly, that is, left outer joins on the formulas are used.

One can conclude that the existing approaches propose useful information systems for dealing with data stored in a relational structure; however, they study the problem of constructing information granules for relational data using an (extended) attribute-value language either not extensively or not at all.
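
To illustrate how a formula that joins tables, applied through a left outer join, differs from a directly imposed constraint, the following Python sketch may be helpful. It is not the construction defined in this paper: the tables `customers` and `purchases`, their attribute names, and the helper `left_outer_join` are hypothetical and serve only as an example. The point it shows is that objects of the left table with no counterpart in the right table are kept (padded with missing values) rather than filtered out.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def left_outer_join(
    left: List[Row],
    right: List[Row],
    condition: Callable[[Row, Row], bool],
    right_attrs: List[str],
) -> List[Row]:
    """Join two tables on an arbitrary joining condition (hypothetical helper).

    Every left row is preserved: if no right row satisfies the condition,
    the right-hand attributes are filled with None.
    """
    joined: List[Row] = []
    for l_row in left:
        matches = [r_row for r_row in right if condition(l_row, r_row)]
        if matches:
            for r_row in matches:
                joined.append({**l_row, **r_row})
        else:
            # No matching object: keep the left row, pad the right attributes.
            joined.append({**l_row, **{a: None for a in right_attrs}})
    return joined


# Hypothetical tables, used only for illustration.
customers = [
    {"c_id": 1, "age": 30},
    {"c_id": 2, "age": 45},
]
purchases = [
    {"p_id": 10, "p_cust": 1, "amount": 20},
    {"p_id": 11, "p_cust": 1, "amount": 5},
]

# The joining formula: a customer is related to a purchase made by that customer.
rows = left_outer_join(
    customers,
    purchases,
    lambda c, p: c["c_id"] == p["p_cust"],
    ["p_id", "p_cust", "amount"],
)

for row in rows:
    print(row)
```

In this sketch, the customer with `c_id` equal to 2 has no purchases but still appears in the result, whereas a directly imposed constraint (an inner join on the same condition) would discard that object.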
GRANULAR COMPUTING BASED DATA MINING FRAMEWORK

In this section, we introduce a granular computing based data mining framework (Ref. 7), which is constructed based on definitions from Refs. 18, 20.

$SEM_{IS_{i \wedge j}}(\alpha_1 \wedge \alpha_2) = SEM_{IS_{i \wedge j}}(\alpha_1) \cap SEM_{IS_{i \wedge j}}(\alpha_2)$;
4. $\alpha_1, \alpha_2 \in L_{IS_{i \wedge j}} \Rightarrow \alpha_1 \vee \alpha_2 \in L_{IS_{i \wedge j}}$, $SEM_{IS_{i \wedge j}}(\alpha_1 \vee \alpha_2) = SEM_{IS_{i \wedge j}}(\alpha_1) \cup SEM_{IS_{i \wedge j}}$ ...