NICGAR: A Niching Genetic Algorithm to mine a diverse set of interesting quantitative association rules

Martín, Daniel; Alcalá‐Fdez, Jesús; Rosete, Alejandro; Herrera, Francisco

doi:10.1016/j.ins.2016.03.039

Cited by 75 publications

(32 citation statements)

References 29 publications

“…First, we evaluated algorithms that were originally compared with Apriori. The results illustrate that NICGAR (Martín et al, 2016), MONPNAR (Martín et al, 2014), G3PARM (Luna et al, 2012), MDS-H (Hong & Bian, 2008), Ant-ARM (He & Hui, 2009), and SRmining (Hong-yun et al, 2008) are the fastest heuristic ARM algorithms compared to Apriori. At that point, we compared approaches that compared themselves with other heuristic approaches.…”

Section: Applications Of Heuristic Algorithms (mentioning)

confidence: 96%

“…Since FP‐growth is much faster than Apriori, these algorithms could be among the fastest ones. In conclusion, according to Figure , NICGAR (Martín et al, ), MONPNAR (Martín et al, ), G3PARM (Luna et al, ), MDS‐H (Hong & Bian, ), SRmining (Hong‐yun et al, ), and Ant‐ARM (He & Hui, ) are the fastest heuristic ARM approaches, which were compared with Apriori and FP‐growth.…”

Section: Comparison (mentioning)

confidence: 96%

“…After all, according to these comparisons, NICGAR (Martín et al, 2016), MONPNAR (Martín et al, 2014), G3PARM (Luna et al, 2012), MDS-H (Hong & Bian, 2008), SRmining (Hong-yun et al, 2008), PMES (Djenouri et al, 2014), Ant-ARM (He & Hui, 2009), and Kuoa et al (2011) have the potential to be the fastest heuristic ARM algorithms. However, there is a big gap in this area and all the heuristic approaches should compare themselves with each other and instead of comparing themselves with Apriori, they should try to establish comparisons with more up-to-date and faster ARM algorithms.…”

Section: Execution Time (mentioning)

confidence: 99%

“…As it is clear in Figure 6, G3PARM (Luna et al, 2012) has the first, MONPNAR (Martín et al, 2014) has the second, and ASC (Kuo & Shih, 2007) has the third most rules reduction among other approaches. However, as NICGAR (Martín et al, 2016) generates around four times less number of rules compared to MONPNAR (Martín et al, 2014), it would be the second approach from the point of view of number of generated items or rules. ASC generates approximately 23 times less rules compared to Apriori.…”

Section: Number Of Generated Itemsets or Rules (mentioning)

confidence: 99%

A survey on association rules mining using heuristics

Ghafari

Tjortjis

2019

WIREs Data Min & Knowl

Association rule mining (ARM) is a commonly encountered data mining method. There are many approaches to mining frequent rules and patterns from a database and one among them is heuristics. Many heuristic approaches have been proposed but, to the best of our knowledge, there is no comprehensive literature review on such approaches, yet with only a limited attempt. This gap needs to be filled. This paper reviews heuristic approaches to ARM and points out their most significant strengths and weaknesses. We propose eight performance metrics, such as execution time, memory consumption, completeness, and interestingness; we compare approaches against these performance metrics and discuss our findings. For instance, comparison results indicate that SRmining, PMES, Ant‐ARM, and MDS‐H are the fastest heuristic ARM algorithms. HSBO‐TS is the most complete one, while SRmining and ACS require only one database scan. In addition, we propose a parameter, named GT‐Rank, for ranking heuristic ARM approaches, and based on that, ARMGA, ASC, and Kua emerge as the best approaches. We also consider ARM algorithms and their characteristics as transactions and items in a transactional database, respectively, and generate association rules that indicate research trends in this area.

This article is categorized under: Algorithmic Development > Association Rules; Technologies > Association Rules; Fundamental Concepts of Data and Knowledge > Motivation and Emergence of Data Mining

“…The aim of this paper is therefore to review the most widely used quality measures, describing and analyzing their properties, and providing the reader with a general knowledge of their behaviour to ease the process of selecting one or more measures when tackling an association rule mining problem. The strong point of this paper is the empirical analysis carried out, including twenty metrics, thirty datasets, and a diverse set of evolutionary algorithms that optimize a single measure 12,13,15,20,25 or multiple metrics at time 6,27 . An exhaustive search approach 9 is also considered to validate the degree of optimization achieved by the evolutionary algorithms.…”

Section: Introduction (mentioning)

confidence: 99%

Optimization of quality measures in association rule mining: an empirical study

Luna

Ondra

Fardoun

et al. 2018

IJCIS

In the association rule mining field many different quality measures have been proposed over time with the aim of quantifying the interestingness of each discovered rule. In evolutionary computation, many of these metrics have been used as functions to be optimized, but the selection of a set of suitable quality measures for each specific problem is not a trivial task. The aim of this paper is to review the most widely used quality measures, analyze their properties from an empirical standpoint and, as a result, ease the process of selecting a subset of them for tackling the task of mining association rules through evolutionary computation. The experimental analysis includes twenty metrics, thirty datasets and a diverse set of algorithms to describe which quality measures are related (or unrelated) so they should (or should not) be used at time. A series of recommendations are therefore provided according to which quality measures are easily optimized, what set of measures should be used to optimize the whole set of metrics, or which measures are hardly optimized by any other.
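As a rough illustration of what a rule quality measure computes, the sketch below evaluates three standard measures (support, confidence, and lift) for a single rule. The abstract does not name its twenty metrics, so choosing these three is an assumption, and the function names and toy transactions are hypothetical.

```python
from typing import Iterable, List, Set


def support(itemset: Iterable[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[Set[str]]) -> float:
    """Estimate of P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    s_a = support(antecedent, transactions)
    if s_a == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / s_a


def lift(antecedent: Iterable[str], consequent: Iterable[str],
         transactions: List[Set[str]]) -> float:
    """Confidence divided by the consequent's support; 1.0 means independence."""
    s_c = support(consequent, transactions)
    if s_c == 0:
        return 0.0
    return confidence(antecedent, consequent, transactions) / s_c


# Hypothetical toy database, for illustration only.
toy_transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support({"bread", "milk"}, toy_transactions)
rule_confidence = confidence({"bread"}, {"milk"}, toy_transactions)
rule_lift = lift({"bread"}, {"milk"}, toy_transactions)
```

For the rule bread → milk on this toy data, lift falls below 1, so a rule with reasonable support and confidence can still describe a negative association. That is one reason a single measure may be a poor optimization target.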

An improved genetic algorithm with Lagrange and density method for clustering

Zhou

et al. 2020

Concurrency and Computation

To overcome the shortcomings of K-means clustering, including clustering numbers, sensitivity to clustering center (seeds) and local optimization, this article proposes an improved genetic algorithm (GA) with a novel Lagrange-based fitness function and an initial population technique (called the NicheClust algorithm). NicheClust determines the best chromosomes and then feeds these into K-means as initial seeds to achieve higher-quality clustering results by allowing the initial seeds to readjust in terms of clustering demands. The GA approach is proposed to search for a globally optimal solution. The initial population method is presented to automatically capture the appropriate number of clusters and find the initial seeds. The Lagrange-based approach is used to prevent the fitness function from prematurely converging and capture global optimization for K-means clustering results. Experimental results based on six taxi Global Positioning System (GPS) datasets verify the higher performance of NicheClust compared to other clustering methods and validate the effectiveness with statistical analysis method.

KEYWORDS: improved genetic algorithm, initialization population technology, K-means clustering, Lagrange-based fitness function

1 INTRODUCTION

Data clustering is considered to be a difficult and challenging problem in unsupervised machine learning. [1-6] There are many clustering algorithms, [7,8] of which the K-means algorithm is undoubtedly the most widely used and important due to its effectiveness and simplicity. However, K-means has a number of well-known drawbacks including sensitivity to initial cluster center, [8-10] convergence to a local optimum and difficulty of determining the number of clusters. In order to overcome these shortcomings, a variety of clustering algorithms have been proposed.

Several existing techniques have been proposed for finding higher-quality initial seeds than the random initial seeds K-means chooses. [1,5,11-13] For example, the work in Reference 5 presented an efficient K-means clustering filtering algorithm using density-based initial seeds. Fast density clustering strategies based on K-means were presented in Reference 1. In addition, K-means++ [14] is typically used to address the sensitivity of the choice of the initial seeds for K-means. However, K-means++ cannot perceive distribution states of data points, resulting in seeds that have an uneven distribution and repeated calculation. Meanwhile, K-means clustering finds it difficult to obtain a globally optimal solution due to the quality of the initial seeds. [10,15,16]

Therefore, in order to improve the performance and enhance the efficiency of K-means clustering, several genetic algorithm (GA) based K-means methods [17-20] have been developed in recent years. These clustering techniques produce better clustering results than simple K-means or basic GA-based clustering. The use of GA with K-means also helps to avoid the minima issues of K-means. [10,17,18,20] Typically, a GA-based clustering technique does not require user input regarding the ...
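The idea of handing precomputed seeds to K-means, instead of letting it pick random ones, can be sketched as follows. This is a minimal plain-Python Lloyd iteration under stated assumptions: the seeds are hand-picked stand-ins for whatever a GA would return, NicheClust's Lagrange-based fitness and population initialization are not reproduced, and `kmeans_from_seeds` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_from_seeds(points: List[Point], seeds: List[Point],
                      max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Run Lloyd's K-means starting from caller-supplied seeds."""
    centers = [tuple(float(v) for v in s) for s in seeds]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest current center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k, old_center in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(coords) / len(members) for coords in zip(*members))
                )
            else:
                new_centers.append(old_center)  # keep an empty cluster's center
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers, labels


# Hypothetical data and seeds; in a GA-based method the seeds would be
# decoded from the best chromosome rather than written by hand.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
toy_seeds = [(1.0, 1.0), (9.0, 9.0)]
final_centers, final_labels = kmeans_from_seeds(toy_points, toy_seeds)
```

Because the number of seeds fixes the number of clusters, a seed-selection stage that also decides how many seeds to emit addresses both the initialization and the cluster-count problems named in the text.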
