This paper introduces KEEL, a software tool for assessing evolutionary algorithms on Data Mining problems of various kinds, including regression, classification and unsupervised learning. It includes evolutionary learning algorithms based on different approaches: Pittsburgh, Michigan and IRL, as well as the integration of evolutionary learning techniques with different pre-processing techniques, allowing a complete analysis of any learning model in comparison with existing software tools. Moreover, KEEL has been designed with a double goal: research and education.
Recently, Adaboost has been compared to greedy backfitting of extended additive models in logistic regression problems, also known as "Logitboost". The Adaboost algorithm has been applied to learn fuzzy rules in classification problems, and other backfitting algorithms to learn fuzzy rules in modeling problems, but, to the best of our knowledge, there are no previous works that extend the Logitboost algorithm to learn fuzzy rules in classification problems. In this work, Logitboost is applied to learn fuzzy rules in classification problems, and its results are compared with those of Adaboost and other fuzzy rule learning algorithms. Contrary to expectations, it is shown that the basic extension of the backfitting algorithm to learn classification rules may produce worse results than Adaboost does. We suggest that this is caused by the stricter requirements that Logitboost places on its weak learners, which are not fulfilled by fuzzy rules. Finally, a prefitting-based modification of the Logitboost algorithm that avoids this problem is proposed.
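For concreteness, the Logitboost step referred to above (the working-response formulation of Friedman, Hastie and Tibshirani) can be sketched as follows. This is a minimal toy illustration, not the paper's method: it uses one-dimensional threshold stumps in place of fuzzy rules as weak learners, and all function names and data are illustrative.

```python
import math

def logitboost(X, y, n_rounds=10):
    """Minimal LogitBoost sketch for binary classification (y in {0, 1}),
    using 1-D threshold 'stumps' as a toy stand-in for fuzzy rules."""
    n = len(X)
    F = [0.0] * n                        # additive model, F(x) = sum of stump outputs
    stumps = []
    for _ in range(n_rounds):
        # LogitBoost step: compute weights w and working response z
        p = [1.0 / (1.0 + math.exp(-2.0 * f)) for f in F]
        w = [max(pi * (1.0 - pi), 1e-6) for pi in p]
        z = [(yi - pi) / wi for yi, pi, wi in zip(y, p, w)]
        # Fit a stump to z by weighted least squares: threshold t, constants cl/cr
        best = None
        for t in sorted(set(X)):
            left = [i for i in range(n) if X[i] <= t]
            right = [i for i in range(n) if X[i] > t]
            def wmean(idx):
                sw = sum(w[i] for i in idx)
                return sum(w[i] * z[i] for i in idx) / sw if sw else 0.0
            cl, cr = wmean(left), wmean(right)
            err = sum(w[i] * (z[i] - (cl if X[i] <= t else cr)) ** 2
                      for i in range(n))
            if best is None or err < best[0]:
                best = (err, t, cl, cr)
        _, t, cl, cr = best
        stumps.append((t, cl, cr))
        # Update the additive model with half the fitted stump output
        F = [F[i] + 0.5 * (cl if X[i] <= t else cr) for i in range(n)]
    return stumps

def predict(stumps, x):
    f = sum(0.5 * (cl if x <= t else cr) for t, cl, cr in stumps)
    return 1 if f > 0 else 0

# Toy separable data: class 1 when x > 3
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 0, 1, 1, 1]
model = logitboost(X, y)
print([predict(model, x) for x in X])   # -> [0, 0, 0, 1, 1, 1]
```

The strictness the abstract alludes to is visible in the division by `w`: when a weak learner cannot fit the weighted working response well, the update degrades, which is less of an issue in Adaboost's reweighting scheme.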
The action of OXA-24/40 and OXA-58 β-lactamase-like enzymes represents the main mechanism underlying resistance to carbapenems in Spain over the last decade. AbkA/AbkB proteins in the toxin/antitoxin system may be involved in the successful dissemination of plasmids carrying the bla(OXA-24/40)-like gene, and probably also the bla(OXA-58)-like gene, thus contributing to plasmid stability.
In previous studies, we have shown that an Adaboost-based fitness can be successfully combined with a Genetic Algorithm to iteratively learn fuzzy rules from examples in classification problems. Unfortunately, some restrictive constraints on the implementation of the logical connectives and the inference method were assumed. The knowledge bases that Adaboost produces are only compatible with an inference based on the maximum sum of votes scheme, and they can only use the product t-norm to model the "and" operator. This design is not optimal in terms of linguistic interpretability. Using the sum to aggregate votes allows many rules to be combined when the class of an example is being decided. Because it can be difficult to isolate the contribution of individual rules to the knowledge base, fuzzy rules produced by Adaboost may be difficult to understand linguistically. From this point of view, single-winner inference would be a better choice, but it implies dropping some nontrivial hypotheses. In this work we introduce our first results in the search for a boosting-based genetic method able to learn weighted fuzzy rules that are compatible with this last inference method.
Backfitting of fuzzy rules is an Iterative Rule Learning technique for obtaining the knowledge base of a fuzzy rule-based system in regression problems. It consists of fitting one fuzzy rule to the data and replacing the whole training set by the residual of the approximation. The rule obtained is added to the knowledge base, and the process is repeated until the residual is zero, or near zero. Such a design has been extended to imprecise data for which the observation error is small. Nevertheless, when this error is moderate or high, the learning can stop early. In this kind of algorithm, the specificity of the residual might decrease when a new rule is added. It may happen that the residual grows so wide that it covers the value zero at all points (thus the algorithm stops) before all the information available in the dataset has been extracted. Focusing on this problem, this paper deals with datasets with medium to high discrepancies between the observed and the actual values of the variables, such as those containing missing values and coarsely discretized data. We will show that the quality of the iterative learning degrades on this kind of problem, because it does not make full use of all the available information. As an alternative to obtaining the rules sequentially, we propose a new multiobjective Genetic Cooperative-Competitive Learning (GCCL) algorithm. In our approach, each individual in the population codifies one rule, which competes in the population in terms of maximum coverage and fitting, while the individuals in the population cooperate to form the knowledge base.
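The residual-driven loop described at the start of the abstract can be sketched as follows. This is a minimal crisp stand-in, not the paper's algorithm: it fits threshold rules with constant consequents instead of fuzzy rules, subtracts each fitted rule from the targets, and stops when the residual is (near) zero; all names and data are illustrative.

```python
def backfit(X, y, n_rules=5):
    """Iterative backfitting sketch: fit one simple 'rule' (a threshold
    condition with a constant consequent), replace the targets by the
    residual, and repeat until the residual vanishes."""
    residual = list(y)
    rules = []
    for _ in range(n_rules):
        best = None
        for t in sorted(set(X)):
            left = [i for i in range(len(X)) if X[i] <= t]
            right = [i for i in range(len(X)) if X[i] > t]
            for side, cover in (("<=", left), (">", right)):
                if not cover:
                    continue
                c = sum(residual[i] for i in cover) / len(cover)
                err = sum((residual[i] - (c if i in cover else 0.0)) ** 2
                          for i in range(len(X)))
                if best is None or err < best[0]:
                    best = (err, t, side, c, cover)
        err, t, side, c, cover = best
        rules.append((t, side, c))
        for i in cover:                      # replace targets by the residual
            residual[i] -= c
        if max(abs(r) for r in residual) < 1e-9:
            break                            # residual is (near) zero: stop
    return rules

def predict(rules, x):
    out = 0.0
    for t, side, c in rules:
        if (side == "<=" and x <= t) or (side == ">" and x > t):
            out += c
    return out

# Toy step-shaped data: two rules suffice to drive the residual to zero
X, y = [1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 3.0, 3.0]
rules = backfit(X, y)
print([round(predict(rules, x), 2) for x in X])   # -> [1.0, 1.0, 3.0, 3.0]
```

With crisp data the stopping test is sharp; the failure mode the paper studies arises when the residual is an interval that widens with each subtracted rule, so that "covers zero everywhere" is reached before the data are exhausted.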