An Ensemble of Bayesian Neural Networks for Exoplanetary Atmospheric Retrieval

Machine learning (ML) is now used in many areas of astrophysics, from detecting exoplanets in Kepler transit signals to removing telescope systematics. Recent work demonstrated the potential of using ML algorithms for atmospheric retrieval by implementing a random forest (RF) to perform retrievals in seconds that are consistent with the traditional, computationally expensive nested-sampling retrieval method. We expand upon their approach by presenting a new ML model, plan-net, based on an ensemble of Bayesian neural networks (BNNs) that yields more accurate inferences than the RF for the same data set of synthetic transmission spectra. We demonstrate that an ensemble provides greater accuracy and more robust uncertainties than a single model. In addition to being the first to use BNNs for atmospheric retrieval, we also introduce a new loss function for BNNs that learns correlations between the model outputs. Importantly, we show that designing ML models to explicitly incorporate domain-specific knowledge both improves performance and provides additional insight by inferring the covariance of the retrieved atmospheric parameters. We apply plan-net to the Hubble Space Telescope Wide Field Camera 3 transmission spectrum for WASP-12b and retrieve an isothermal temperature and water abundance consistent with the literature. We highlight that our method is flexible and can be expanded to higher-resolution spectra and a larger number of atmospheric parameters.
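
The abstract above mentions a loss that learns correlations between model outputs and an ensemble that combines several networks. A minimal sketch of both ideas follows, assuming each network outputs a mean vector and the entries of a lower-triangular Cholesky factor of the covariance. This parameterisation, and the names `gaussian_nll_full_cov` and `combine_ensemble`, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def gaussian_nll_full_cov(y, mu, tril_params):
    """Negative log-likelihood of y under N(mu, L L^T).

    tril_params holds the d*(d+1)/2 lower-triangular entries of L in
    row-major order; the diagonal is exponentiated to keep it positive.
    (Assumed parameterisation, for illustration only.)
    """
    d = y.shape[0]
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = tril_params
    diag = np.diag_indices(d)
    L[diag] = np.exp(L[diag])
    z = np.linalg.solve(L, y - mu)             # z = L^{-1} (y - mu)
    log_det = 2.0 * np.sum(np.log(L[diag]))    # log |L L^T|
    return 0.5 * (z @ z + log_det + d * np.log(2.0 * np.pi))

def combine_ensemble(mus, covs):
    """Moment-match an equally weighted mixture of Gaussians.

    mus: (M, d) member means, covs: (M, d, d) member covariances.
    Returns the mixture mean and covariance.
    """
    mus = np.asarray(mus, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = mus.mean(axis=0)
    second_moment = covs + np.einsum("mi,mj->mij", mus, mus)
    cov = second_moment.mean(axis=0) - np.outer(mean, mean)
    return mean, cov
```

The off-diagonal entries of `L` are what allow the loss to capture covariance between retrieved parameters; setting them to zero recovers the usual independent-Gaussian loss.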

Accurate Machine-learning Atmospheric Retrieval via a Neural-network Surrogate Model for Radiative Transfer

Himes, Harrington, Cobb et al. 2022. Planet. Sci. J.

Atmospheric retrieval determines the properties of an atmosphere based on its measured spectrum. The low signal-to-noise ratios of exoplanet observations require a Bayesian approach to determine posterior probability distributions of each model parameter, given observed spectra. This inference is computationally expensive, as it requires many executions of a costly radiative transfer (RT) simulation for each set of sampled model parameters. Machine learning (ML) has recently been shown to provide a significant reduction in runtime for retrievals, mainly by training inverse ML models that predict parameter distributions, given observed spectra, albeit with reduced posterior accuracy. Here we present a novel approach to retrieval by training a forward ML surrogate model that predicts spectra given model parameters, providing a fast approximate RT simulation that can be used in a conventional Bayesian retrieval framework without significant loss of accuracy. We demonstrate our method on the emission spectrum of HD 189733 b and find good agreement with a traditional retrieval from the Bayesian Atmospheric Radiative Transfer (BART) code (Bhattacharyya coefficients of 0.9843–0.9972, with a mean of 0.9925, between 1D marginalized posteriors). This accuracy comes while still offering significant speed enhancements over traditional RT, albeit not as much as ML methods with lower posterior accuracy. Our method is ∼9× faster per parallel chain than BART when run on an AMD EPYC 7402P central processing unit (CPU). Neural-network computation using an NVIDIA Titan Xp graphics processing unit is 90×–180× faster per chain than BART on that CPU.
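
The agreement figure quoted above is a Bhattacharyya coefficient between 1D marginalized posteriors. A minimal sketch of how such a coefficient could be estimated from two sets of posterior samples is given here, assuming a shared histogram binning. The function name and the choice of 50 bins are illustrative assumptions, not details from the paper.

```python
import numpy as np

def bhattacharyya_coefficient(samples_a, samples_b, bins=50):
    """Estimate the overlap of two 1D sample sets on a common grid.

    Returns a value in [0, 1]: 1 for identical histograms, 0 for
    histograms with no shared occupied bin.
    """
    samples_a = np.asarray(samples_a, dtype=float)
    samples_b = np.asarray(samples_b, dtype=float)
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    edges = np.linspace(lo, hi, bins + 1)      # shared bin edges
    p, _ = np.histogram(samples_a, bins=edges)
    q, _ = np.histogram(samples_b, bins=edges)
    p = p / p.sum()                            # normalise counts to probabilities
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))
```

In a comparison like the one described, `samples_a` would be one parameter's samples from the traditional retrieval and `samples_b` the same parameter's samples from the surrogate-based retrieval.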

Humbug Zooniverse: A Crowd-Sourced Acoustic Mosquito Dataset

Kiskin, Cobb, Wang et al. 2020

Mosquitoes are the only known vector of malaria, which leads to hundreds of thousands of deaths each year. Understanding the number and location of potential mosquito vectors is of paramount importance to aid the reduction of malaria transmission cases. In recent years, deep learning has become widely used for bioacoustic classification tasks. In order to enable further research applications in this field, we release a new dataset of mosquito audio recordings. With over a thousand contributors, we obtained 195,434 labels of two-second duration, of which approximately 10 percent signify mosquito events. We present an example use of the dataset, in which we train a convolutional neural network on log-Mel features, showcasing the information content of the labels. We hope this will become a vital resource for those researching all aspects of malaria, and add to the existing audio datasets for bioacoustic detection and signal processing.

Improving Differential Evolution through Bayesian Hyperparameter Optimization

Biswas, Saha et al. 2021

Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks with Symmetric Splitting

Cobb, Jalaian 2020. Preprint

Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) approach that exhibits favourable exploration properties in high-dimensional models such as neural networks. Unfortunately, HMC has limited use in large-data regimes and little work has explored suitable approaches that aim to preserve the entire Hamiltonian. In our work, we introduce a new symmetric integration scheme for split HMC that does not rely on stochastic gradients. We show that our new formulation is more efficient than previous approaches and is easy to implement with a single GPU. As a result, we are able to perform full HMC over common deep learning architectures using entire data sets. In addition, when we compare with stochastic gradient MCMC, we show that our method achieves better performance in both accuracy and uncertainty quantification. Our approach demonstrates HMC as a feasible option when considering inference schemes for large-scale machine learning problems.
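
To make the idea of a symmetric split integrator concrete, here is a minimal sketch on a toy problem (inferring a scalar Gaussian mean). It assumes the Hamiltonian is split into one term per data subset, each integrated with a leapfrog sub-step, and that the subsets are visited forwards and then in reverse so the overall step is time-reversible. This is one possible palindromic composition chosen for illustration; it is not claimed to be the exact scheme of the paper, and all function names are hypothetical.

```python
import numpy as np

def make_subset_terms(x, n_subsets, prior_std=10.0):
    """Per-subset potentials and gradients for a scalar Gaussian-mean model."""
    chunks = np.array_split(x, n_subsets)
    w = 1.0 / (prior_std**2 * n_subsets)  # each subset carries an equal share of the prior
    potentials = [lambda th, c=c: 0.5 * np.sum((th - c) ** 2) + 0.5 * w * th**2
                  for c in chunks]
    grads = [lambda th, c=c: np.sum(th - c) + w * th for c in chunks]
    return potentials, grads

def split_step(theta, p, grads, eps):
    """One symmetric step: subsets 0..M-1, then M-1..0, each with step eps/2."""
    M = len(grads)
    h = eps / 2.0
    order = list(range(M)) + list(range(M - 1, -1, -1))  # palindromic ordering
    for i in order:
        p = p - 0.5 * h * grads[i](theta)   # half kick from subset i
        theta = theta + h * p / M           # drift with 1/M of the kinetic energy
        p = p - 0.5 * h * grads[i](theta)   # half kick from subset i
    return theta, p

def split_hmc(theta0, potentials, grads, eps, n_steps, n_samples, rng):
    """HMC with the split integrator and a full-data Metropolis correction."""
    def hamiltonian(theta, p):
        return sum(U(theta) for U in potentials) + 0.5 * p**2

    theta, samples = theta0, []
    for _ in range(n_samples):
        p0 = rng.standard_normal()
        th, p = theta, p0
        for _ in range(n_steps):
            th, p = split_step(th, p, grads, eps)
        # Accept with probability min(1, exp(H_old - H_new))
        if np.log(rng.uniform()) < hamiltonian(theta, p0) - hamiltonian(th, p):
            theta = th
        samples.append(theta)
    return np.array(samples)
```

Only one subset's gradient is needed at a time, which is what would let a full-data trajectory fit in limited device memory; the acceptance test still uses the entire Hamiltonian rather than a stochastic estimate.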

URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks

Vadera, Cobb, Jalaian et al. 2020. Preprint

While deep learning methods continue to improve in predictive accuracy on a wide range of application domains, significant issues remain with other aspects of their performance including their ability to quantify uncertainty and their robustness. Recent advances in approximate Bayesian inference hold significant promise for addressing these concerns, but the computational scalability of these methods can be problematic when applied to large-scale models. In this paper, we describe initial work on the development of URSABench (the Uncertainty, Robustness, Scalability, and Accuracy Benchmark), an open-source suite of benchmarking tools for comprehensive assessment of approximate Bayesian inference methods with a focus on deep learning-based classification tasks.

Identifying Sources and Sinks in the Presence of Multiple Agents with Gaussian Process Vector Calculus

Cobb, Everett, Markham et al. 2018

In systems of multiple agents, identifying the cause of observed agent dynamics is challenging. Often, these agents operate in diverse, non-stationary environments, where models rely on handcrafted environment-specific features to infer influential regions in the system's surroundings. To overcome the limitations of these inflexible models, we present GP-LAPLACE, a technique for locating sources and sinks from trajectories in time-varying fields. Using Gaussian processes, we jointly infer a spatio-temporal vector field, as well as canonical vector calculus operations on that field. Notably, we do this from only agent trajectories without requiring knowledge of the environment, and also obtain a metric for denoting the significance of inferred causal features in the environment by exploiting our probabilistic method. To evaluate our approach, we apply it to both synthetic and real-world GPS data, demonstrating the applicability of our technique in the presence of multiple agents, as well as its superiority over existing methods.
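
The key property used above is that derivatives of a Gaussian process are available in closed form, so vector calculus operations can be applied to the inferred field directly. A minimal sketch follows, assuming a static 2D field, an RBF kernel with a fixed length scale, and independent GPs per velocity component. These simplifications and the name `gp_divergence` are assumptions for illustration; the paper's model is spatio-temporal and is not reproduced here.

```python
import numpy as np

def rbf(A, B, ell):
    """Squared-exponential kernel matrix between point sets A (m, 2) and B (n, 2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_divergence(X, V, X_star, ell=1.0, noise=1e-4):
    """Divergence of the GP posterior-mean vector field at the points X_star.

    X: (n, 2) observed locations, V: (n, 2) observed velocities.
    Negative values suggest a sink, positive values a source.
    """
    K = rbf(X, X, ell) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, V)                  # (n, 2), one column per component
    Ks = rbf(X_star, X, ell)                       # (m, n)
    diff = X_star[:, None, :] - X[None, :, :]      # (m, n, 2)
    # d k(x*, x_i) / d x*_d for the RBF kernel
    dK = -diff / ell**2 * Ks[:, :, None]           # (m, n, 2)
    # div = sum_d d(mean_d)/d x_d = sum_d sum_i dK[:, i, d] * alpha[i, d]
    return np.einsum("mnd,nd->m", dK, alpha)
```

For example, velocities that all point towards the origin (`V = -X` on a grid around it) should give a divergence close to -2 at the origin, marking it as a sink.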

Automatic Acoustic Mosquito Tagging with Bayesian Neural Networks

Kiskin, Cobb, Sinka et al. 2021

Deep learning models are now widely used in decision-making applications. These models must be robust to noise and carefully map to the underlying uncertainty in the data. Standard deterministic neural networks are well known to be poor at providing reliable estimates of uncertainty and often lack the robustness that is required for real-world deployment. In this paper, we work with an application that requires accurate uncertainty estimates in addition to good predictive performance. In particular, we consider the task of detecting a mosquito from its acoustic signature. We use Bayesian neural networks (BNNs) to infer predictive distributions over outputs and incorporate this uncertainty as part of an automatic labelling process. We demonstrate the utility of BNNs by performing the first fully automated data collection procedure to identify acoustic mosquito data on over 1,500 hours of unlabelled field data collected with low-cost smartphones in Tanzania. We use uncertainty metrics such as predictive entropy and mutual information to help with the labelling process. We show how to bridge the gap between theory and practice by describing our pipeline from data preprocessing to model output visualisation. Additionally, we supply all of our data and code. The successful autonomous detection of mosquitoes allows us to perform analysis which is critical to the project goals of tackling mosquito-borne diseases such as malaria and dengue fever.
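
The two uncertainty metrics named above, predictive entropy and mutual information, can be computed from repeated stochastic forward passes of a BNN. A minimal sketch is given here, assuming the class probabilities are already available as an array of shape (samples, inputs, classes). The function name and array layout are illustrative assumptions rather than the paper's pipeline.

```python
import numpy as np

def uncertainty_metrics(probs, eps=1e-12):
    """Predictive entropy and mutual information from sampled class probabilities.

    probs: (n_samples, n_inputs, n_classes), each row summing to 1.
    Returns two arrays of shape (n_inputs,), measured in nats.
    """
    probs = np.asarray(probs, dtype=float)
    mean_p = probs.mean(axis=0)
    # Entropy of the averaged prediction: total uncertainty
    predictive_entropy = -np.sum(mean_p * np.log(mean_p + eps), axis=-1)
    # Average entropy of each sampled prediction: data (aleatoric) part
    expected_entropy = -np.mean(
        np.sum(probs * np.log(probs + eps), axis=-1), axis=0
    )
    # The difference is the part attributable to disagreement between samples
    mutual_information = predictive_entropy - expected_entropy
    return predictive_entropy, mutual_information
```

High mutual information flags inputs on which the sampled networks disagree; such inputs are natural candidates for manual review in a labelling workflow.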

Adam D. Cobb
