Increasingly large datasets are rapidly driving up the computational costs of machine learning. Prototype generation methods aim to create a small set of synthetic observations that accurately represents a training dataset while greatly reducing the computational cost of learning from it. Assigning soft labels to prototypes can allow even very small sets of prototypes to accurately represent the original training dataset. Although foundational work on 'less than one'-shot learning has proven the theoretical plausibility of learning with fewer than one observation per class, developing practical algorithms for generating such prototypes remains largely unexplored territory. We propose a novel, modular method for generating soft-label prototypical lines that maintains representational accuracy even when there are fewer prototypes than classes in the data. In addition, we propose the Hierarchical Soft-Label Prototype k-Nearest Neighbor classification algorithm based on these prototypical lines. We show that our method maintains high classification accuracy while greatly reducing the number of prototypes required to represent a dataset, even on severely imbalanced and difficult data. Our code is available at https://github.com/ilia10000/SLkNN.
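The core idea of classifying with soft-label prototypes can be illustrated with a minimal sketch (an assumed simplification, not the hierarchical prototypical-line algorithm itself): each prototype carries a probability distribution over classes, and a query point is classified by summing the soft-label distributions of its k nearest prototypes. With suitably chosen soft labels, two prototypes can separate three classes, i.e., fewer prototypes than classes.

```python
import numpy as np

def soft_label_knn(x, prototypes, soft_labels, k=3):
    """Classify x by summing the soft-label distributions of its k
    nearest prototypes. A simplified soft-label kNN sketch, not the
    authors' hierarchical prototypical-line method."""
    # Euclidean distance from x to every prototype.
    dists = np.linalg.norm(prototypes - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # Sum the class distributions of the k nearest prototypes.
    scores = soft_labels[nearest].sum(axis=0)
    return int(np.argmax(scores))

# Two prototypes on a line, soft labels over three classes
# (illustrative values): endpoints decide classes 0 and 2,
# while a query between them is assigned class 1 with k=2.
prototypes = np.array([[0.0], [1.0]])
soft_labels = np.array([[0.6, 0.4, 0.0],
                        [0.0, 0.4, 0.6]])
print(soft_label_knn(np.array([0.5]), prototypes, soft_labels, k=2))
```

The midpoint query resolves to the "hidden" middle class even though no prototype has it as its dominant label, which is the mechanism that lets soft-label prototypes represent more classes than there are prototypes.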
Finite mixture models can be interpreted as representing heterogeneous subpopulations within a whole population. However, care is needed when associating a mixture component with a cluster, because a mixture model may fit more components than there are clusters. Mode merging via the mean shift algorithm can help identify such multi-component clusters. So far, most related work has focused on Gaussian finite mixtures. As non-Gaussian finite mixture models gain attention, the need to address the component-cluster correspondence issue in these models grows. We therefore introduce a mode merging method via mean shift for the finite mixture of t-distributions and its parsimonious variants. The iteration can be framed as an expectation-maximization algorithm and enjoys theoretical properties similar to those of mean shift for Gaussian finite mixtures. We demonstrate the method on simulated and real data experiments, where it performs competitively against existing methods.
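The mean-shift merging step can be sketched for the Gaussian case that most prior work addresses (the paper's contribution is the extension to t-mixtures; the code below is an illustrative assumption, not the authors' implementation). The fixed-point iteration is started from each component mean, and components whose iterates converge to the same mode are assigned to the same cluster.

```python
import numpy as np

def gmm_mean_shift_modes(weights, means, covs, tol=1e-8, max_iter=500):
    """Run the Gaussian-mixture mean-shift fixed-point iteration from
    each component mean and merge components that converge to the same
    mode. A sketch of the Gaussian case only; the paper extends this
    idea to finite mixtures of t-distributions."""
    K, d = means.shape
    precs = np.array([np.linalg.inv(c) for c in covs])

    def responsibilities(x):
        # Posterior p(k | x); the (2*pi)^(-d/2) factor cancels on
        # normalization, but the determinant term does not.
        r = np.empty(K)
        for k in range(K):
            diff = x - means[k]
            r[k] = weights[k] * np.exp(-0.5 * diff @ precs[k] @ diff) \
                   / np.sqrt(np.linalg.det(covs[k]))
        return r / r.sum()

    modes = []
    for k in range(K):
        x = means[k].copy()
        for _ in range(max_iter):
            r = responsibilities(x)
            # Fixed-point update: x <- (sum_k r_k S_k^-1)^-1 sum_k r_k S_k^-1 mu_k
            A = np.einsum('k,kij->ij', r, precs)
            b = np.einsum('k,kij,kj->i', r, precs, means)
            x_new = np.linalg.solve(A, b)
            if np.linalg.norm(x_new - x) < tol:
                x = x_new
                break
            x = x_new
        modes.append(x)

    # Components whose modes coincide belong to one cluster.
    labels = -np.ones(K, dtype=int)
    next_label = 0
    for k in range(K):
        for j in range(k):
            if np.linalg.norm(modes[k] - modes[j]) < 1e-4:
                labels[k] = labels[j]
                break
        if labels[k] < 0:
            labels[k] = next_label
            next_label += 1
    return np.array(modes), labels
```

For example, two overlapping components (means 0.5 apart with unit covariances) converge to a single shared mode and are merged, while a distant third component keeps its own mode, so the three-component fit yields two clusters.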