2020
DOI: 10.1214/19-aos1868
Local nearest neighbour classification with applications to semi-supervised learning

Abstract: We derive a new asymptotic expansion for the global excess risk of a local-k-nearest neighbour classifier, where the choice of k may depend upon the test point. This expansion elucidates conditions under which the dominant contribution to the excess risk comes from the decision boundary of the optimal Bayes classifier, but we also show that if these conditions are not satisfied, then the dominant contribution may arise from the tails of the marginal distribution of the features. Moreover, we prove that, provid…
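The abstract's central object is a nearest neighbour classifier whose neighbourhood size k(x) is allowed to vary with the test point x. As a rough illustration only (the function `local_knn_predict` and the density-driven rule `k_rule` below are invented for exposition and are not the paper's construction), such a local-k-NN classifier can be sketched in a few lines:

```python
import numpy as np

def local_knn_predict(X_train, y_train, X_test, k_of_x):
    """Classify each test point with its own neighbourhood size k(x).

    X_train : (n, d) array of training features
    y_train : (n,) array of binary labels in {0, 1}
    X_test  : (m, d) array of test points
    k_of_x  : callable returning the number of neighbours to use at a test point
    """
    predictions = np.empty(len(X_test), dtype=int)
    for i, x in enumerate(X_test):
        k = int(k_of_x(x))
        # Squared Euclidean distances from x to every training point.
        dists = np.sum((X_train - x) ** 2, axis=1)
        nearest = np.argsort(dists)[:k]
        # Majority vote among the k nearest labels (ties broken towards class 1).
        predictions[i] = int(y_train[nearest].mean() >= 0.5)
    return predictions

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 2))
    y_train = (X_train[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(int)
    X_test = rng.normal(size=(10, 2))

    # Illustrative rule only: use fewer neighbours far from the origin
    # (where the marginal density is thin) and more near the bulk of the data.
    k_rule = lambda x: max(3, int(25 * np.exp(-0.5 * np.sum(x ** 2))))
    print(local_knn_predict(X_train, y_train, X_test, k_rule))
```

The point of the sketch is simply that k is a function of the query point rather than a single global tuning parameter; how that function should depend on the marginal density of the features is what the paper analyses.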

Cited by 17 publications (12 citation statements)
References 27 publications
“…Our work here allows us to strengthen this result to the following theorem on strong universal consistency: The rates of convergence of the classification rule g_n, over classes of data-generating mechanisms satisfying Hölder continuity and a strong density assumption, were established in [3], and were moreover shown to match a minimax lower bound. We remark that, even in the non-private case, the absence of the strong density assumption leads to slower rates of convergence [22,1,7].…”
Section: Consequences in Classification (mentioning)
confidence: 94%
“…We now show that, under further regularity conditions on the data distribution P and the noise mechanism, it is possible to give a more precise description of the asymptotic error properties of the corrupted knn classifier. Since our conditions on P , which are slight simplifications of those used in Cannings et al (2018) to analyse the uncorrupted knn classifier, are a little technical, we give an informal summary of them here, deferring formal statements of our assumptions A1-A4 to just before the proof of Theorem 3 in Section A.2.…”
Section: Asymptotic Properties, 4.1 The k-Nearest Neighbour Classifier (mentioning)
confidence: 99%
“…The major advantage of this is that we are able to avoid the restrictive assumption of an upper bound on the ε-covering numbers (which would rule out non-compact domains of interest). An alternative approach to non-compact domains has been pursued by Cannings et al (2017). Whilst we follow Gadat et al (2016) in bounding the measure of the regions of the feature space where the density falls below a given value (see Assumption E), Cannings et al (2017) instead employ a moment assumption.…”
Section: Non-parametric Classification in Unbounded Domains (mentioning)
confidence: 99%
“…An alternative approach to noncompact domains has been pursued by Cannings et al (2017). Whilst we follow Gadat et al (2016) in bounding the measure of the regions of the feature space where the density falls below a given value (see Assumption E), Cannings et al (2017) instead employ a moment assumption. Note that whereas Cannings et al (2017) make use of an additional set of unlabelled data to locally tune the optimal value of k, our method is optimally adaptive without any additional data.…”
Section: Non-parametric Classification in Unbounded Domains (mentioning)
confidence: 99%
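The last excerpt notes that Cannings et al. use an additional set of unlabelled data to tune k locally. A minimal sketch of that idea, assuming one simply plugs a kernel density estimate fitted on the unlabelled sample into a power-law rule for k(x): the helper name `density_driven_k`, the constant `B` and the exponent are illustrative placeholders, not the paper's actual tuning.

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_driven_k(X_unlabelled, n_labelled, d, exponent=None, B=1.0):
    """Return a rule x -> k(x) built from a KDE fitted on unlabelled data.

    The power-law form k(x) ~ B * (n * f_hat(x)) ** (4 / (d + 4)) is used here
    purely as an illustration of a density-dependent local choice of k; the
    constant B and the default exponent are placeholders, not the paper's tuning.
    """
    if exponent is None:
        exponent = 4.0 / (d + 4.0)
    kde = gaussian_kde(X_unlabelled.T)  # scipy expects data of shape (d, n)

    def k_of_x(x):
        # Estimated marginal density at the test point, from unlabelled data only.
        f_hat = float(kde(np.asarray(x, dtype=float).reshape(d, 1)))
        return max(1, int(np.ceil(B * (n_labelled * f_hat) ** exponent)))

    return k_of_x
```

The returned rule could be passed as the `k_of_x` argument of the `local_knn_predict` sketch above, so that the labelled sample drives the votes while the unlabelled sample drives the neighbourhood size.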