A genetic algorithm using hyper-quadtrees for low-dimensional k-means clustering

Laszlo, Michael; Mukherjee, Sumitra

doi:10.1109/tpami.2006.66

Cited by 98 publications

(28 citation statements)

References 14 publications


“…The algorithms appear to be scalable for larger values of k due to increasing sparsity of discs in the auxiliary problems. It is worthwhile to mention that some of the state-of-art heuristics proposed in [11,24,36,37,47,59] did not report the optimal solutions found here for the Reinelt's drilling data set with n = 1060 entities and k = 120, 150. To the best of our knowledge, this is the first time that such solutions are reported in the literature.…”

Section: Results In the Plane (mentioning)

confidence: 91%

An improved column generation algorithm for minimum sum-of-squares clustering

2010


Given a set of entities associated with points in Euclidean space, minimum sum-of-squares clustering (MSSC) consists in partitioning this set into clusters such that the sum of squared distances from each point to the centroid of its cluster is minimized. A column generation algorithm for MSSC was given in du Merle et al. [15]. The bottleneck of that algorithm is the resolution of the auxiliary problem of finding a column with negative reduced cost. We propose a new way to solve this auxiliary problem based on geometric arguments. This greatly improves the efficiency of the whole algorithm and leads to the exact solution of instances with over 2300 entities, i.e., more than 10 times as many as previously done.
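The MSSC objective described in this abstract can be illustrated with a minimal sketch. It only evaluates the sum of squared distances for a given partition and is not the column generation algorithm of the paper. The function name `mssc_objective` and the four sample points are assumptions made for illustration.

```python
from collections import defaultdict


def mssc_objective(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid.

    `points` is a list of equal-length coordinate tuples and `labels` assigns
    each point to a cluster. Both names are hypothetical, chosen for this sketch.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # The centroid is the coordinate-wise mean of the cluster's points.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


# Illustrative data only: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
good = mssc_objective(points, [0, 0, 1, 1])  # groups nearby points together
bad = mssc_objective(points, [0, 1, 0, 1])   # pairs up distant points
print(good, bad)
```

Under this objective the partition that groups nearby points scores lower, which is what an MSSC solver searches for over all partitions into k clusters.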


“…It analyses the relationship between the dependent or response variable and independent or predictor variables. The relationship is expressed in the form of an equation that predicts the response variable as a linear function of predictor variable [7][8][9][10]. Linear Regression: Y=a+bX+u.…”

Section: Regression (mentioning)

confidence: 99%

Data Mining Techniques in Software Defect Prediction

Periasamy

Mishbahulhuda

2017

IJARCSSE


Abstract: Software defect prediction work focuses on the number of defects remaining in a software system. The software defect prediction model helps in early detection of defects and contributes to their efficient removal and producing a quality software system based on several metrics. A prediction of the number of remaining defects in an inspected artifact can be used for decision making. An accurate prediction of the number of defects in a software product during system testing contributes not only to the management of the system testing process but also to the estimation of the product's required maintenance. Defective software modules cause software failures, increase development and maintenance costs, and decrease customer satisfaction. It strives to improve software quality and testing efficiency by constructing predictive models from code attributes to enable a timely identification of fault-prone modules. The main objective of this paper is to help developers identify defects based on existing software metrics using data mining techniques and thereby improve the software quality. In this paper, we will discuss data mining techniques that are association mining, classification and clustering for software defect prediction. This helps the developers to detect software defects and correct them.


“…This includes simulated annealing [21], evolutionary algorithms [22], [24], [18], tabu search [11], and ant colony optimization [6]. Also, hybrid approaches that combine multiple algorithms have been proposed in literature [24], [22].…”

Section: Related Work (mentioning)

confidence: 99%

“…The data are taken from German Town Data, which is a two-dimensional data set with 59 observations, obtained from [35]. The SSE value for KMeans clustering for five clusters is the reported minimum value in literature [24].…”

Section: Definition Of Strategy (mentioning)

confidence: 99%

A Game Theoretic Approach for Simultaneous Compaction and Equipartitioning of Spatial Data Sets

Gupta

Ranganathan

2010

IEEE Trans. Knowl. Data Eng.


Abstract: Data and object clustering techniques are used in a wide variety of scientific applications such as biology, pattern recognition, information systems, etc. Traditionally, clustering methods have focused on optimizing a single metric; however, several multidisciplinary applications such as robot team deployment, ad hoc networks, facility location, etc., require the simultaneous examination of multiple metrics during clustering. In this paper, we propose a novel approach for spatial data clustering based on the concepts of microeconomic theory, which can simultaneously optimize both the compaction and the equipartitioning objectives. The algorithm models a multistep, normal form game consisting of randomly initialized clusters as players that compete for the allocation of data objects from resource locations. A Nash-equilibrium-based methodology is used to derive solutions that are socially fair for all the players. After each step, the clusters are updated using the KMeans algorithm, and the process is repeated until the stopping criteria are satisfied. Extensive simulations were performed on several real data sets as well as artificially synthesized data sets to evaluate the efficacy of the algorithm. Experimental results indicate that the proposed algorithm yields significantly better results as compared to the traditional algorithms. Further, the proposed algorithm yields a high value of fairness, a metric that indicates the quality of the solution in terms of simultaneous optimization of the objectives. Also, the sensitivity of the various design parameters on the performance of our algorithm is analyzed and reported.
