Calculation of the probability of correct classification in probabilistic bagged k-Nearest Neighbours

Villa, Joe Luis; Boqué, Ricard; Ferré, Joan

doi:10.1016/j.chemolab.2008.06.007

Cited by 4 publications

(7 citation statements)

References 25 publications

“…The new Bagged k-nearest neighbours is compared here to PBkNN [10], which combines kNN and bootstrap, without taking into account the uncertainty in the x. In PBkNN, for a given unknown object x_t, its k-nearest neighbours in each X_b are obtained.…”

Section: Probabilistic Bagged-kNN (PBkNN) (mentioning)

confidence: 99%
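
The statement above describes finding the k nearest neighbours of x_t in each bootstrap set X_b. A minimal Python sketch of that idea follows. The function names, the toy data layout (a list of `(features, label)` pairs) and the aggregation step (counting which class wins in each bootstrap set) are assumptions made for illustration; the excerpt does not say how PBkNN turns the per-bootstrap neighbours into a probability, so this should not be read as the authors' calculation.

```python
import math
import random
from collections import Counter

def knn_labels(train, x_t, k):
    """Labels of the k objects in `train` closest to x_t (Euclidean distance)."""
    ranked = sorted(train, key=lambda obj: math.dist(obj[0], x_t))
    return [label for _, label in ranked[:k]]

def bagged_knn_vote_fractions(train, x_t, k=3, n_boot=200, seed=0):
    """Fraction of bootstrap sets X_b in which each class wins the kNN vote.

    Hypothetical aggregation rule, used here only as an illustration.
    """
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_boot):
        # Bootstrap set X_b: draw len(train) objects with replacement.
        x_b = [rng.choice(train) for _ in train]
        # Majority class among the k nearest neighbours of x_t in X_b
        # (ties go to the class met first in nearest-first order).
        winner = Counter(knn_labels(x_b, x_t, k)).most_common(1)[0][0]
        wins[winner] += 1
    return {c: n / n_boot for c, n in wins.items()}

# Toy data: two well-separated classes in two dimensions.
train = [((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
         ((1.0, 1.0), "A"), ((0.5, 0.5), "A"),
         ((5.0, 5.0), "B"), ((5.0, 6.0), "B"), ((6.0, 5.0), "B"),
         ((6.0, 6.0), "B"), ((5.5, 5.5), "B")]
fractions = bagged_knn_vote_fractions(train, (0.2, 0.3))
```

With these toy data the unknown object lies inside class A, so class A should win in nearly every bootstrap set; an object between the two clusters would give more evenly split fractions.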

“…Bagging (Bootstrap AGGregatING) is a type of ensemble method which uses bootstrap to improve the performance of the classifier [7,9–12]. The improvement is obtained because bootstrap combined with a classification method leads to a reduction of the misclassification error [13].…”

Section: Bagging (mentioning)

confidence: 99%

“…The kNN classifier uses a training data matrix X, where each object is known to belong to a class c out of C possible classes [9–12]. This classifier assigns an unknown object x_t to the class to which most of the k-nearest neighbours belong.…”

Section: K-nearest Neighbours (mentioning)

confidence: 99%

“…These neighbours are found according to a suitable metric, usually the Euclidean distance. There are several variations of the kNN method, depending on the type of distance used [9,11] or the decision rule that is used for classification [9–12]. For kNN, the posterior probability that a given unknown object belongs to class c is given by [9]:…”

Section: K-nearest Neighbours (mentioning)

confidence: 99%
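
The two kNN statements above (majority vote among the k nearest neighbours under the Euclidean distance) can be sketched in Python as follows. The quoted posterior-probability expression is truncated in the excerpt, so the class fraction k_c/k computed here is only a commonly used estimate and is an assumption, not necessarily the formula given in [9]. The function names and toy data are hypothetical.

```python
import math
from collections import Counter

def knn_class_fractions(X, y, x_t, k):
    """Fraction of the k nearest neighbours of x_t that belong to each class."""
    # Rank training objects by Euclidean distance to the unknown object.
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], x_t))
    counts = Counter(y[i] for i in order[:k])
    return {c: n / k for c, n in counts.items()}

def knn_classify(X, y, x_t, k):
    """Assign x_t to the class holding most of its k nearest neighbours."""
    fractions = knn_class_fractions(X, y, x_t, k)
    return max(fractions, key=fractions.get)

# Toy training matrix X with known classes y.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
y = ["A", "A", "A", "B", "B"]

fractions = knn_class_fractions(X, y, (4.0, 4.0), k=3)
predicted = knn_classify(X, y, (4.0, 4.0), k=3)
```

For the object at (4, 4) the three nearest neighbours are the two class B objects and one class A object, so the vote goes to class B with a neighbour fraction of 2/3.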

“…Finally, the new methodology was applied to the benchmark Wine dataset, in order to classify the different wines in three regions of origin. The classification results and reliabilities were compared to the classical kNN method [8,9] and to Probabilistic Bagged k-nearest neighbours [10].…”

Section: Introduction (mentioning)

confidence: 99%

Bagged k-nearest neighbours classification with uncertainty in the variables

Medina; Boqué; Ferré

2009, Analytica Chimica Acta

Self Cite

Bootstrap Approach To Compare the Slopes of Two Calibrations When Few Standards Are Available

Estévez-Pérez; Wilcox

2016, Anal. Chem.

Comparing the slopes of aqueous-based and standard addition calibration procedures is almost a daily task in analytical laboratories. Because usual protocols involve very few standards, sound statistical inference and conclusions are hard to obtain with current classical tests (e.g., the t-test), which may greatly affect decision-making. Thus, there is a need for robust statistics that are not distorted by the small samples of experimental values obtained from analytical studies. Several promising alternatives based on bootstrapping are studied in this paper under the typical constraints common in laboratory work. The impact of the number of standards, homoscedasticity or heteroscedasticity, three variance patterns, and three error distributions on least-squares fits was considered (in total, 144 simulation scenarios). The Student's t-test is the most valuable procedure when the normality assumption is true and homoscedasticity is present, although it can be highly affected by outliers. A wild bootstrap method leads to average rejection percentages that are closer to the nominal level in almost every situation, and it is recommended for laboratories working with a small number of standards. The Theil-Sen percentile bootstrap statistic was found to be very robust, but its rejection percentages depart from the nominal ones (<5%), so its use is not recommended when the number of standards is very small. Finally, a tutorial and free software are given to encourage analytical laboratories to apply bootstrap principles to compare the slopes of two calibration lines.
