Rough set theory, introduced by Zdzislaw Pawlak in the early 1980s [11, 12], is a new mathematical tool to deal with vagueness and uncertainty. This approach seems to be of fundamental importance to artificial intelligence (AI) and cognitive sciences, especially in the areas of machine learning, knowledge acquisition, decision analysis, knowledge discovery from databases, expert systems, decision support systems, inductive reasoning, and pattern recognition. The rough set concept overlaps-to some extent-with many other mathematical tools developed to deal with vagueness and uncertainty, in particular with the Dempster-Shafer theory of evidence [15]. The main difference is that the Dempster-Shafer theory uses belief functions as a main tool, while rough set theory makes use of sets-lower and upper approximations. Another relationship exists between fuzzy set theory and rough set theory [13]. Rough set theory does not compete with fuzzy set theory, with which it is frequently contrasted, but rather complements it [1]. In any case, rough set theory and fuzzy set theory are independent approaches to imperfect knowledge. Furthermore, some relationship exists between rough set theory and discriminant analysis [7], Boolean reasoning methods [16], and decision analysis [14]. One of the main advantages of rough set theory is that it does not need any preliminary or additional information about data, such as probability distribution in statistics, basic probability assignment in the Dempster-Shafer theory, or grade of membership or the value of possibility in fuzzy set theory [2]. AI emerging technologies
Real-life data usually are presented in databases by real numbers. On the other hand, most inductive learning methods require small number of attribute values. Thus it is necessary to convert input data sets with continuous attributes into input data sets with discrete attributes. Methods of discretization restricted to single continuous attributes will be called local, while methods that simultaneously convert all continuous attributes will be called global. In this paper, a method of transforming any local discretization method into a global one is presented. A global discretization method, based on cluster analysis, is presented and compared experimentally with three known local methods, transformed into global. Experiments include tenfold cross validation and leaving-one-out methods for ten real-life data sets.
Abstract:In the paper nine different approaches to missing attribute values are presented and compared. Ten input data files were used to investigate the performance of the nine methods to deal with missing attribute values. For testing both naive classification and new classification techniques of LERS (Learning from Examples based on Rough Sets) were used. The quality criterion was the average error rate achieved by ten-fold cross-validation. Using the Wilcoxon matched-pairs signed rank test, we conclude that the C4.5 approach and the method of ignoring examples with missing attribute values are the best methods among all nine approaches; the most common attribute-value method is the worst method among all nine approaches; while some methods do not differ from other methods significantly. The method of assigning to the missing attribute value all possible values of the attribute and the method of assigning to the missing attribute value all possible values of the attribute restricted to the same concept are excellent approaches based on our limited experimental results. However we do not have enough evidence to support the claim that these approaches are superior.
Abstract. The paper presents the system LERS for rule induction. The system handles inconsistencies in the input data due to its usage of rough set theory principle. Rough set theory is especialIy well suited to deal with inconsistencies. In this approach, inconsistencies are not corrected. Instead, system LERS computes lower and upper approximations of each concept. Then it induces certain rules and possible rules. The user has the choice to use the machine learning approach or the knowledge acquisition approach. In the first case, the system induces a single minimal discriminant description for each concept. In the second case, the system induces alI rules, each in the minimal form, that can be induced from the input data. In both cases, the user has a choice between the local or global approach.
scite is a Brooklyn-based startup that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.