Abstract. In the last few years, the data mining community has proposed a number of objective rule interestingness measures for selecting the most interesting rules out of a large set of discovered rules. However, objective measures are only an estimate of the true degree of interestingness of a rule to the user, the so-called real human interest, which is inherently subjective. Hence, it is not clear how effective objective measures are in practice. More precisely, the central question investigated in this paper is: "How effective are objective rule interestingness measures, in the sense of being a good estimate of the true, subjective degree of interestingness of a rule to the user?" This question is investigated through extensive experiments with 11 objective rule interestingness measures across eight real-world data sets.
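As an illustration of what an objective rule interestingness measure computes, the following minimal sketch derives three widely used measures (support, confidence, and lift) from the coverage counts of a single rule "IF antecedent THEN consequent". The abstract does not name the 11 measures studied, so these three are an assumption chosen for illustration, and the function name `rule_measures` and the counts are hypothetical.

```python
def rule_measures(n_total, n_antecedent, n_consequent, n_both):
    """Compute three common objective measures for a rule A -> C.

    n_total      : number of examples in the data set
    n_antecedent : examples satisfying the antecedent A
    n_consequent : examples belonging to the predicted class C
    n_both       : examples satisfying both A and C
    (Illustrative only; not necessarily among the measures in the paper.)
    """
    if n_total <= 0 or n_antecedent <= 0 or n_consequent <= 0:
        raise ValueError("counts must be positive")
    support = n_both / n_total                     # P(A and C)
    confidence = n_both / n_antecedent             # P(C | A)
    lift = confidence / (n_consequent / n_total)   # P(C | A) / P(C)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical rule: covers 20 of 100 examples, 16 of them in the predicted
# class, which contains 40 examples overall.
measures = rule_measures(n_total=100, n_antecedent=20, n_consequent=40, n_both=16)
print(measures)
```

Each measure is computed from the data alone. Whether a high value matches what a user actually finds interesting is the question the abstract raises.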
This paper addresses the well-known classification task of data mining, where the objective is to predict the class to which an example belongs. Discovered knowledge is expressed in the form of high-level, easy-to-interpret classification rules. To discover classification rules, we propose a hybrid decision tree/genetic algorithm method. The central idea of this hybrid method involves the concept of small disjuncts in data mining, as follows. In essence, a set of classification rules can be regarded as a logical disjunction of rules, so that each rule can be regarded as a disjunct. A small disjunct is a rule covering a small number of examples. Due to their nature, small disjuncts are error-prone. However, although each small disjunct covers just a few examples, the set of all small disjuncts can cover a large number of examples, so it is important to develop new approaches to cope with the problem of small disjuncts. In our hybrid approach, we have developed two genetic algorithms (GAs) specifically designed for discovering rules covering examples belonging to small disjuncts, whereas a conventional decision tree algorithm is used to produce rules covering examples belonging to large disjuncts. We present results evaluating the performance of the hybrid method on 22 real-world data sets.
This paper addresses the well-known classification task of data mining, where the goal is to discover rules predicting the class of examples (records of a given data set). In the context of data mining, small disjuncts are rules covering a small number of examples. Hence, these rules are usually error-prone, which contributes to a decrease in predictive accuracy. At first glance, this is not a serious problem, since the impact on predictive accuracy should be small. However, although each small disjunct covers few examples, the set of all small disjuncts can cover a large number of examples. This paper presents evidence that this is the case in several data sets. This paper also addresses the problem of small disjuncts by using a hybrid decision-tree/genetic algorithm approach. In essence, examples belonging to large disjuncts are classified by rules produced by a decision-tree algorithm (C4.5), while examples belonging to small disjuncts are classified by a genetic algorithm specifically designed for discovering small-disjunct rules. We present results comparing the predictive accuracy of this hybrid system with that of three versions of C4.5 alone on eight public domain data sets. Overall, the results show that our hybrid system achieves better predictive accuracy than all three versions of C4.5 alone.
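The point that individually small disjuncts can jointly cover many examples, and the routing idea behind the hybrid system, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the leaf labels, the coverage threshold of 3, and the helper names `small_disjunct_share` and `route` are all hypothetical, and the abstract does not state how the threshold is chosen.

```python
from collections import Counter


def small_disjunct_share(leaf_ids, threshold):
    """Identify small disjuncts and the fraction of examples they cover.

    leaf_ids  : for each training example, the id of the decision-tree leaf
                (i.e. rule/disjunct) that covers it
    threshold : a leaf covering at most this many examples is treated as a
                small disjunct (hypothetical cut-off)
    """
    if not leaf_ids:
        raise ValueError("leaf_ids must not be empty")
    coverage = Counter(leaf_ids)  # number of examples per leaf
    small = {leaf for leaf, n in coverage.items() if n <= threshold}
    covered = sum(coverage[leaf] for leaf in small)
    return small, covered / len(leaf_ids)


def route(leaf_id, small_leaves):
    """Decide which component of the hybrid should classify an example."""
    if leaf_id in small_leaves:
        return "small-disjunct classifier"  # stand-in for the GA-discovered rules
    return "decision tree"                  # stand-in for the C4.5 rule


# Toy data: the leaf reached by each of 20 training examples.
leaf_ids = ["A"] * 12 + ["B"] * 2 + ["C"] * 3 + ["D"] * 1 + ["E"] * 2
small, share = small_disjunct_share(leaf_ids, threshold=3)
print(sorted(small), share)
print(route("A", small), "|", route("D", small))
```

In this toy data each small disjunct covers at most three examples, yet together the four small leaves cover 8 of the 20 examples, which is why routing them to a dedicated classifier can matter for overall accuracy.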
The objective of this study was to evaluate the usability of the electronic patient record and the difficulties encountered by 99 nursing professionals in handling it. It is an exploratory quantitative study based on data collected from July to November 2013. The results show that 71% of nursing assistants/technicians and 70% of nurses did not receive specific training, and that 56% of the staff who reported not having received training have difficulty using the system. Among the usability characteristics evaluated for the electronic patient record, suitability for the task stood out positively and suitability for learning stood out negatively. Therefore, the evaluated system, despite the advances it has brought, remains complex for users who have not received training, although it has a consistent and interactive interface.
Although the treatment of venous ulcers requires specific knowledge, non-specialist nurses are often unaware of the appropriate therapy, which is a concern in the topical treatment of these skin lesions. This paper presents an expert system to support nursing decision making in the topical therapy of venous ulcers. It is a development study carried out in five stages: system modeling, knowledge acquisition, knowledge representation using production rules, system implementation, and evaluation. The production rules are presented, together with some cases that simulate the expert system's behavior, demonstrating the viability of its use in nursing practice. The system may support decision making about the topical therapy of venous ulcers. However, the ulcer must be evaluated correctly for the system to provide appropriate suggestions, which in turn allows better organization and planning of care.
OBJECTIVE: To identify, with the aid of computational techniques, rules concerning the conditions of the physical environment for the classification of risk micro-areas. METHODS: Exploratory study carried out in the city of Curitiba, PR, in 2007, divided into three stages: identification of attributes for classifying a micro-area; construction of a database; and application of the knowledge discovery in databases process by means of data mining. The set of attributes covered infrastructure conditions, hydrography, soil, leisure areas, community characteristics, and the presence of vectors. The database was built from data obtained in interviews with community health agents, using a questionnaire with closed questions based on the essential attributes selected by specialists. RESULTS: Forty-nine attributes were identified, of which 41 were essential and eight irrelevant. Data mining yielded 68 rules, which were analyzed in terms of performance and quality and divided into two sets: inconsistent rules and rules that confirm the specialists' knowledge. The comparison between the sets showed that the rules confirming the knowledge, despite having lower computational performance, were considered more interesting. CONCLUSIONS: Data mining provided a set of useful and comprehensible rules capable of characterizing micro-areas and classifying them by degree of risk, based on characteristics of the physical environment. Using the proposed rules allows a micro-area to be classified more quickly and less subjectively, maintaining a standard across health teams and overcoming the influence of each team member's individual perception.