Erik Test scite author profile

k-means is a widely used clustering algorithm, but for k clusters and a dataset of size N, each iteration of Lloyd's algorithm costs O(kN) time. This is problematic because applications of k-means increasingly involve both large N and large k, and there are no accelerated variants that handle this situation. To this end, we propose a dual-tree algorithm that gives exactly the same results as standard k-means; when using cover trees, we bound the single-iteration runtime of the algorithm as O(N + k log k), under some assumptions. To our knowledge, these are the first sub-O(kN) bounds for exact Lloyd iterations. The algorithm performs competitively in practice, especially for large N and k in low dimensions. Further, the algorithm is tree-independent, so any type of tree may be used.
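For context on the O(kN) cost mentioned in the abstract, here is a minimal sketch of one standard (brute-force) Lloyd iteration in plain Python. It is not the dual-tree algorithm the abstract proposes; the function name `lloyd_iteration` and the `distance_evals` counter are illustrative assumptions added to make the k times N distance computations visible.

```python
def lloyd_iteration(points, centroids):
    """One brute-force Lloyd iteration (illustrative, not the dual-tree method).

    Assigns each point to its nearest centroid, then recomputes each
    centroid as the mean of its assigned points.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    assignments = []
    distance_evals = 0  # hypothetical counter: ends up equal to k * N

    for p in points:
        best_j, best_d = 0, float("inf")
        for j, c in enumerate(centroids):
            # Squared Euclidean distance is enough to find the nearest centroid.
            d = sum((a - b) ** 2 for a, b in zip(p, c))
            distance_evals += 1
            if d < best_d:
                best_j, best_d = j, d
        assignments.append(best_j)
        counts[best_j] += 1
        for i, a in enumerate(p):
            sums[best_j][i] += a

    # Empty clusters keep their previous centroid (one common convention).
    new_centroids = [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]
    return new_centroids, assignments, distance_evals


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centroids = [(1, 1), (9, 1)]
new_centroids, assignments, distance_evals = lloyd_iteration(points, centroids)
print(new_centroids, assignments, distance_evals)
```

The inner loop over all k centroids for each of the N points is what an accelerated variant would try to avoid.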

GPUSVM: a comprehensive CUDA based support vector machine package

Salman, Test, et al. 2011

GPUSVM (Graphic Processing Unit Support Vector Machine) is a Compute Unified Device Architecture (CUDA) based Support Vector Machine (SVM) package. It is designed to offer end users a fully functional and user-friendly SVM tool that utilizes the power of GPUs. The core package includes an efficient cross-validation tool, a fast training tool, and a prediction tool. In this article, we first introduce the background theory of how we build our parallel SVM solver using the CUDA programming model. Then we compare our GPUSVM package with the popular state-of-the-art Libsvm package on several well-known datasets. The preliminary results show a speed improvement of one to two orders of magnitude in both the training and prediction phases compared to Libsvm, using our Tesla server.

Feature ranking using Gini index, scatter ratios, and nonlinear SVM RFE

Test, Zigic, Kecman 2013

Feature ranking for pattern recognition: A comparison of filter methods

Test, Kecman, Strack, et al. 2012

This paper presents an approach for comparing various feature ranking (FR) methods. First, six classification benchmarks are created using Exhaustive Search (ES) to select the best feature subsets. The subset selections are performed within double (nested) cross-validation procedures, which guarantee realistic accuracy predictions for unseen examples. Next, seven filter FR approaches are compared and ranked with respect to the top five feature subsets for each data set. This paper also introduces a method for quantifying and comparing FR results. The results suggest that using the Gini index or scatter ratios leads to rankings closest to ES on average.
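As a rough illustration of what a Gini-index filter ranking can look like, the sketch below scores each feature by the largest decrease in Gini impurity achievable with a single threshold split, then sorts features by that score. The abstract does not specify the paper's exact formulation, so this scoring rule and the names `gini`, `best_split_gain`, and `rank_features` are assumptions made for the example.

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_gain(values, labels):
    """Largest impurity decrease over all single-threshold splits of one feature."""
    n = len(values)
    parent = gini(labels)
    pairs = sorted(zip(values, labels), key=lambda t: t[0])
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_features(X, y):
    """Return feature indices sorted from most to least informative, plus scores."""
    n_features = len(X[0])
    scores = [best_split_gain([row[f] for row in X], y) for f in range(n_features)]
    order = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return order, scores


X = [[1, 5], [2, 3], [8, 4], [9, 6]]
y = [0, 0, 1, 1]
order, scores = rank_features(X, y)
print(order, scores)
```

In this toy data the first feature separates the two classes perfectly, so it ranks ahead of the second. The quadratic-time split search is kept for readability rather than speed.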


Erik Test

Parallel multitask cross validation for Support Vector Machine using GPU

Fast K-Means Algorithm Clustering
