Xuetong Wu scite author profile

Information-theoretic analysis for transfer learning

Transfer learning, or domain adaptation, is concerned with machine learning problems in which training and testing data come from possibly different distributions (denoted as µ and µ', respectively). In this work, we give an information-theoretic analysis of the generalization error and the excess risk of transfer learning algorithms, following a line of work initiated by Russo and Zhou. Our results suggest, perhaps as expected, that the Kullback-Leibler (KL) divergence D(µ||µ') plays an important role in characterizing the generalization error in domain adaptation settings. Specifically, we provide generalization error upper bounds for general transfer learning algorithms, and extend the results to a specific empirical risk minimization (ERM) algorithm where data from both distributions are available in the training phase. We further apply the method to iterative, noisy gradient descent algorithms, and obtain upper bounds that can be easily calculated using only parameters from the learning algorithms. A few illustrative examples are provided to demonstrate the usefulness of the results. In particular, our bound is tighter in specific classification problems than the bound derived using Rademacher complexity.
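For reference, the KL divergence named in the abstract above has the standard definition sketched below. The form shown assumes both distributions admit densities, which is an assumption made here for illustration and is not stated in the abstract; µ is the training distribution and the testing distribution is written µ'.

```latex
D(\mu \,\|\, \mu') = \int \mu(z) \, \log \frac{\mu(z)}{\mu'(z)} \, \mathrm{d}z
```

The quantity is non-negative and equals zero when the two distributions coincide, which is consistent with its role as a measure of the gap between training and testing data.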

Imputation techniques on missing values in breast cancer treatment and fertility data

Khorshidi, Aickelin, et al. 2019, Health Inf Sci Syst

In the last few years, clinical decision support using data mining techniques has offered more intelligent ways to reduce decision errors. However, clinical datasets often suffer from high missingness, which adversely impacts the quality of modelling if handled improperly. Imputing missing values provides an opportunity to resolve the issue. Conventional approaches rely on simple statistical methods, such as mean imputation or discarding missing cases, which have many limitations and thus degrade the performance of learning. This study examines a series of machine-learning-based imputation methods and suggests an efficient approach for preparing a good-quality breast cancer dataset, to find the relationship between breast cancer treatment and chemotherapy-related amenorrhoea, where the performance is evaluated by the accuracy of the prediction. To this end, the reliability and robustness of six well-known imputation methods are evaluated. Our results show that imputation leads to a significant boost in classification performance compared to model prediction based on list-wise deletion. Furthermore, the results reveal that most methods retain strong robustness and discriminant power even when the dataset experiences high missing rates (> 50%).
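The two conventional baselines named in the abstract above, list-wise deletion and mean imputation, can be illustrated with a minimal sketch. This is not the study's code or any of its six evaluated methods; the function names and the toy records are hypothetical and chosen only for illustration.

```python
from statistics import mean


def listwise_delete(rows):
    """Drop every row that contains at least one missing (None) value."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(rows):
    """Replace each missing value with the mean of the observed values
    in the same column. Assumes every column has at least one observed
    value; statistics.mean raises StatisticsError otherwise."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy records (not from the study): [age, dose]
records = [[30, 1.0], [40, None], [None, 3.0], [50, 2.0]]

deleted = listwise_delete(records)  # keeps only the fully observed rows
imputed = mean_impute(records)      # keeps all rows, fills gaps with column means
```

In this toy case list-wise deletion discards half of the records, while mean imputation keeps all four, which hints at why deletion can hurt a downstream classifier when missingness is high.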

Fast Rate Generalization Error Bounds: Variations on a Theme

Manton, Aickelin, et al. 2022

A Bayesian approach to (online) transfer learning: Theory and algorithms

Manton, Aickelin, et al. 2023, Artificial Intelligence
