Negative future thinking pervades emotional disorders. This hybrid efficacy-effectiveness trial tested a four-session, scalable online cognitive bias modification program for training more positive episodic prediction. 958 adults (73.3% female, 86.5% White, 83.4% from United States) were randomized to positive conditions with ambiguous future scenarios that ended positively, 50/50 conditions that ended positively or negatively, or a control condition with neutral scenarios. As hypothesized (preregistration: https://osf.io/jrst6), positive training participants improved in negative and positive expectancy bias, self-efficacy, and optimism more than control participants, ds and 97.5% CIs = -0.57 [-0.87, -0.27], 0.79 [0.42, 1.15], 0.28 [0.02, 0.53], 0.28 [0.04, 0.51], and, for expectancy bias, more than 50/50 participants, with gains maintained at 1-month follow-up. Unexpectedly, participants across conditions improved comparably in anxiety and depression symptoms and growth mindset. Targeting a transdiagnostic process with a scalable program may improve bias and outlook; however, further validation of outcome measures is required.
This paper studies how to schedule hyperparameters to improve the generalization of both centralized single-machine stochastic gradient descent (SGD) and distributed asynchronous SGD (ASGD). SGD augmented with momentum variants (e.g., heavy-ball momentum (SHB) and Nesterov's accelerated gradient (NAG)) has been the default optimizer for many tasks, in both centralized and distributed environments. However, many advanced momentum variants, despite their empirical advantage over classical SHB/NAG, introduce extra hyperparameters to tune, and this error-prone tuning is a main barrier to AutoML.
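For reference, the classical heavy-ball and Nesterov updates can be written in the following common form, with learning rate α and momentum factor β (standard textbook notation, not equations quoted from this paper):

```latex
% Heavy-ball momentum (SHB): one extra hyperparameter \beta
v_{t+1} = \beta v_t + \nabla f(\theta_t), \qquad \theta_{t+1} = \theta_t - \alpha\, v_{t+1}
% Nesterov's accelerated gradient (NAG): gradient taken at the look-ahead point
v_{t+1} = \beta v_t + \nabla f(\theta_t - \alpha \beta\, v_t), \qquad \theta_{t+1} = \theta_t - \alpha\, v_{t+1}
```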
Centralized SGD: We first focus on centralized single-machine SGD and show how to efficiently schedule the hyperparameters of a large class of momentum variants to improve generalization. We propose a unified framework called multistage quasi-hyperbolic momentum (Multistage QHM), which covers a large family of momentum variants as special cases (e.g., vanilla SGD, SHB, and NAG). Existing works mainly focus on scheduling only the decay of the learning rate α, whereas multistage QHM additionally allows other hyperparameters (e.g., the momentum factor) to vary across stages and demonstrates better generalization than tuning α alone. We also prove the convergence of multistage QHM for general nonconvex objectives.
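As a rough illustration of what a stage-wise schedule of the QHM hyperparameters looks like, the sketch below uses the standard QHM recursion with piecewise-constant hyperparameters; the function names and stage values are placeholders chosen for illustration, not the schedules derived in the paper.

```python
import numpy as np

def multistage_qhm(grad_fn, theta0, stages, seed=0):
    """Minimal multistage QHM sketch (illustrative only).

    stages: list of (num_steps, alpha, beta, nu) tuples; hyperparameters are
    held constant within a stage and switched between stages.
    QHM recursion:
        g_{t+1}     = (1 - beta) * grad + beta * g_t
        theta_{t+1} = theta_t - alpha * ((1 - nu) * grad + nu * g_{t+1})
    Special cases: nu = 0 gives plain SGD; nu = 1 gives (damped) heavy-ball
    momentum; nu = beta recovers Nesterov's accelerated gradient.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    g = np.zeros_like(theta)                      # momentum buffer
    for num_steps, alpha, beta, nu in stages:
        for _ in range(num_steps):
            grad = grad_fn(theta, rng)            # stochastic gradient
            g = (1.0 - beta) * grad + beta * g    # momentum update
            theta = theta - alpha * ((1.0 - nu) * grad + nu * g)
    return theta

# Toy usage: noisy quadratic, three stages with decaying alpha and
# increasing beta (placeholder schedule for illustration only).
noisy_quad_grad = lambda th, rng: th + 0.1 * rng.standard_normal(th.shape)
stages = [(200, 0.1, 0.9, 0.7), (200, 0.03, 0.95, 0.7), (200, 0.01, 0.99, 0.7)]
print(multistage_qhm(noisy_quad_grad, np.ones(5), stages))
```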
Distributed SGD: We then extend our theory to distributed asynchronous SGD (ASGD), in which a parameter server distributes data batches to several worker machines and updates the parameters by aggregating batch gradients from the workers. We quantify the asynchrony between workers (i.e., gradient staleness), model the dynamics of asynchronous iterations via a stochastic differential equation (SDE), and derive a PAC-Bayesian generalization bound for ASGD. As a byproduct, we show how a moderately large learning rate helps ASGD generalize better.
Our tuning strategies have rigorous justifications rather than relying on blind trial and error: we theoretically prove why they decrease the derived generalization errors in both settings. Empirically, our strategies simplify the tuning process and beat competitive optimizers in test accuracy. Our code is publicly available at https://github.com/jsycsjh/centralized-asynchronous-tuning.