Nathan O. Hodas scite author profile

Deep learning for computational chemistry

The rise and fall of artificial neural networks is well documented in the scientific literature of both computer science and computational chemistry. Yet almost two decades later, we are now seeing a resurgence of interest in deep learning, a machine learning algorithm based on multilayer neural networks. Within the last few years, we have seen the transformative impact of deep learning in many domains, particularly in speech recognition and computer vision, to the extent that the majority of expert practitioners in those fields are now regularly eschewing prior established models in favor of deep learning models. In this review, we provide an introductory overview of the theory of deep neural networks and the unique properties that distinguish them from traditional machine learning algorithms used in cheminformatics. By providing an overview of the variety of emerging applications of deep neural networks, we highlight their ubiquity and broad applicability to a wide range of challenges in the field, including QSAR, virtual screening, protein structure prediction, quantum chemistry, materials design and property prediction. In reviewing the performance of deep neural networks, we observed that they consistently outperformed state-of-the-art non-neural-network models across disparate research topics, and deep neural network based models often exceeded the "glass ceiling" expectations of their respective tasks. Coupled with the maturity of GPU-accelerated computing for training deep neural networks and the exponential growth of chemical data on which to train these networks, we anticipate that deep learning algorithms will be a valuable tool for computational chemistry.

Introduction

Deep learning is the key algorithm used in the development of AlphaGo, a...

Deep learning is a machine learning algorithm, not unlike those already in use in various applications in computational chemistry, from computer-aided drug design to materials property prediction [5-8]. Among its more high-profile achievements is the Merck activity prediction challenge in 2012, where a deep neural network not only won the competition and outperformed Merck's internal baseline model, but did so without a single chemist or biologist on the team. In a repeated success by a different research team, deep learning models achieved top positions in the Tox21 toxicity prediction challenge issued by NIH in 2014 [9]. The unusually stellar performance of deep learning models in predicting both activity and toxicity in these recent examples originates from the unique characteristics that distinguish deep learning from traditional machine learning algorithms.

For those unfamiliar with the intricacies of machine learning algorithms, we will highlight some of the key differences between traditional (shallow) machine learning and deep learning. The simplest example of a machine learning algorithm would be the ubiquitous least-squares linear regression. In linear regression, the underlying nature of the model is known (linear in th...
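
To make the least-squares linear regression mentioned in the excerpt above concrete, here is a minimal sketch for the one-variable case. It is an illustration only and is not taken from the review: the function name `fit_line` and the sample data are hypothetical, and the closed-form slope and intercept are the standard textbook expressions.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) that minimize the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, noise-free data lying on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

The form of the model (a straight line) is fixed in advance and only two parameters are estimated from data, which is the sense in which such a model is "shallow".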


Surface tension prevails over solute effect in organic-influenced cloud droplet activation

Ovadnevaitė

Zuend

Laaksonen

et al. 2017

Nature

262

364


Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter

et al. 2017


Pew Research polls report that 62 percent of U.S. adults get news on social media (Gottfried and Shearer, 2016). In a December poll, 64 percent of U.S. adults said that "made-up news" has caused a "great deal of confusion" about the facts of current events (Barthel et al., 2016). Fabricated stories in social media, ranging from deliberate propaganda to hoaxes and satire, contribute to this confusion in addition to having serious effects on global stability.

In this work we build predictive models to classify 130 thousand news posts as suspicious or verified, and predict four subtypes of suspicious news: satire, hoaxes, clickbait, and propaganda. We show that neural network models trained on tweet content and social network interactions outperform lexical models. Unlike previous work on deception detection, we find that adding syntax and grammar features to our models does not improve performance. Incorporating linguistic features improves classification results; however, social interaction features are most informative for finer-grained separation between the four types of suspicious news posts.


Learning Deep Neural Network Representations for Koopman Operators of Nonlinear Dynamical Systems

2019


The Koopman operator has recently garnered much attention for its value in dynamical systems analysis and data-driven model discovery. However, its application has been hindered by the computational complexity of extended dynamic mode decomposition; this requires a combinatorially large basis set to adequately describe many nonlinear systems of interest, e.g., cyber-physical infrastructure systems, biological networks, social systems, and fluid dynamics. Often the dictionaries generated for these problems are manually curated, requiring domain-specific knowledge and painstaking tuning. In this paper we introduce a deep learning framework for learning Koopman operators of nonlinear dynamical systems. We show that this novel method automatically selects efficient deep dictionaries, outperforming state-of-the-art methods. We benchmark this method on partially observed nonlinear systems, including the glycolytic oscillator, and show that it is able to make quantitative predictions 100 steps into the future using only a single timepoint, and to predict qualitative oscillatory behavior 400 steps into the future.

arXiv:1708.06850v2 [cs.LG] 17 Nov 2017

In 1931, B. O. Koopman published a paper showing that the evolution of any set of observables on a dynamical system can be expressed through the action of an infinite dimensional linear operator, the Koopman operator [1]. Because the Koopman operator is a canonical representation of any autonomous dynamical system, in principle, its use can bring to bear linear analysis methods on nonlinear systems. The Koopman operator is especially powerful for inferring properties of dynamical systems that are either partially or completely unknown or that are too complex to express using standard methods in analysis.
Examples of such systems include biological networks, extremely large physical systems (which are intractable to analyze as white-box models), social networks, cyber-physical communication networks, and distributed computing systems that are subject to varying degrees of uncertainty. For this reason, the Koopman operator has gained attention as an effective tool for data-driven model discovery. The Koopman operator provides a data-driven model for comparing the asymptotic behavior of dynamical systems [3], specifically as a function of its spectra [3, 4]. Various algorithms have extended these methods using dynamic and extended dynamic mode decomposition, both for autonomous and controlled systems [5, 6, 7].

The prevailing method for learning the Koopman operator from data is dynamic mode decomposition [4]. Dynamic mode decomposition is the process of identifying a linear operator from temporally or spatially-linked data, ultimately with the objective of characterizing the spectrum of the operator. There are many variants of dynamic mode decomposition, but the most recent advances in Koopman operator learning have emerged from extended dynamic mode decomposition [5]. In extended dynamic mode decomposition, the idea is to lift the set of system observables from its native vector space into a higher di...
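
The lifting idea behind extended dynamic mode decomposition can be sketched on a toy system. The example below is an assumption-laden illustration, not the deep dictionary method of the paper: the two-state map, the hand-picked dictionary `[x1, x2, x1^2]`, and all function names are hypothetical choices made so that the lifted dynamics are exactly linear. The Koopman matrix is then estimated from snapshot pairs by ordinary least squares, using only the standard library.

```python
def step(x1, x2, lam=0.9, mu=0.5):
    """One step of a toy nonlinear map (hypothetical example system)."""
    return lam * x1, mu * x2 + (lam ** 2 - mu) * x1 * x1


def lift(x1, x2):
    """Hand-picked dictionary of observables: the state plus x1 squared."""
    return [x1, x2, x1 * x1]


def solve(G, A):
    """Solve G X = A by Gauss-Jordan elimination with partial pivoting."""
    n = len(G)
    M = [list(G[i]) + list(A[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def edmd(pairs):
    """Least-squares fit of K such that lift(y) is approximately K lift(x)."""
    px = [lift(*x) for x, _ in pairs]
    py = [lift(*y) for _, y in pairs]
    m = len(px[0])
    # Normal equations: (Px^T Px) K^T = Px^T Py.
    G = [[sum(r[i] * r[j] for r in px) for j in range(m)] for i in range(m)]
    A = [[sum(rx[i] * ry[j] for rx, ry in zip(px, py)) for j in range(m)]
         for i in range(m)]
    Kt = solve(G, A)
    return [[Kt[j][i] for j in range(m)] for i in range(m)]


def make_pairs(starts, n_steps=5):
    """Collect (state, next state) snapshot pairs along short trajectories."""
    pairs = []
    for x in starts:
        for _ in range(n_steps):
            y = step(*x)
            pairs.append((x, y))
            x = y
    return pairs


K = edmd(make_pairs([(1.0, 0.5), (-0.7, 1.2), (0.3, -0.8)]))
```

For this particular system the three observables form a closed set, so a 3 by 3 matrix describes their evolution exactly. For most nonlinear systems no small hand-picked dictionary closes in this way, which is the difficulty with manually curated dictionaries that the abstract describes.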


The Simple Rules of Social Contagion

Hodas

Lerman

2014

Sci Rep

204

157


It is commonly believed that information spreads between individuals like a pathogen, with each exposure by an informed friend potentially resulting in a naive individual becoming infected. However, empirical studies of social media suggest that individual response to repeated exposure to information is far more complex. As a proxy for intervention experiments, we compare user responses to multiple exposures on two different social media sites, Twitter and Digg. We show that the position of exposing messages on the user-interface strongly affects social contagion. Accounting for this visibility significantly simplifies the dynamics of social contagion. The likelihood an individual will spread information increases monotonically with exposure, while explicit feedback about how many friends have previously spread it increases the likelihood of a response. We provide a framework for unifying information visibility, divided attention, and explicit social feedback to predict the temporal dynamics of user behavior.


How Visibility and Divided Attention Constrain Social Contagion

Hodas

Lerman

2012

128

126


How far and how fast does information spread in social media? Researchers have recently examined a number of factors that affect information diffusion in online social networks, including the novelty of information, users' activity levels, who they pay attention to, and how they respond to friends' recommendations. Using URLs as markers of information, we carry out a detailed study of retweeting, the primary mechanism by which information spreads on the Twitter follower graph. Our empirical study examines how users respond to an incoming stimulus, i.e., a tweet (message) from a friend, and reveals that dynamically decaying visibility, which is the increasing cognitive effort required for discovering and acting upon a tweet, combined with limited attention, plays a dominant role in retweeting behavior. Specifically, we observe that users retweet information when it is most visible, such as when it is near the top of their Twitter feed. Moreover, our measurements quantify how a user's limited attention is divided among incoming tweets, providing novel evidence that highly connected individuals are less likely to propagate an arbitrary tweet. Our study indicates that the finite ability to process incoming information constrains social contagion, and we conclude that rapid decay of visibility is the primary barrier to information propagation online.


Aerosol Liquid Water Driven by Anthropogenic Nitrate: Implications for Lifetimes of Water-Soluble Organic Gases and Potential for Secondary Organic Aerosol Formation

Hodas

Sullivan

Skog

et al. 2014

Environ. Sci. Technol.

101

112


Aerosol liquid water (ALW) influences aerosol radiative properties and the partitioning of gas-phase water-soluble organic compounds (WSOCg) to the condensed phase. A recent modeling study drew attention to the anthropogenic nature of ALW in the southeastern United States, where predicted ALW is driven by regional sulfate. Herein, we demonstrate that ALW in the Po Valley, Italy, is also anthropogenic but is driven by locally formed nitrate, illustrating regional differences in the aerosol components responsible for ALW. We present field evidence for the influence of controllable ALW on the lifetimes and atmospheric budgets of reactive organic gases and note the role of ALW in the formation of secondary organic aerosol (SOA). Nitrate is expected to increase in importance due to increased emissions of nitrate precursors, as well as policies aimed at reducing sulfur emissions. We argue that the impacts of increased particulate nitrate in future climate and air quality scenarios may be underpredicted because they do not account for the increased potential for SOA formation in nitrate-derived ALW, nor do they account for the impacts of this ALW on reactive gas budgets and gas-phase photochemistry.


Influence of particle-phase state on the hygroscopic behavior of mixed organic–inorganic aerosols

et al. 2015


Recent work has demonstrated that organic and mixed organic-inorganic particles can exhibit multiple phase states depending on their chemical composition and on ambient conditions such as relative humidity (RH). To explore the extent to which water uptake varies with particle-phase behavior, hygroscopic growth factors (HGFs) of nine laboratory-generated, organic and organic-inorganic aerosol systems with physical states ranging from well-mixed liquids to phase-separated particles to viscous liquids or semi-solids were measured with the Differential Aerosol Sizing and Hygroscopicity Spectrometer Probe at RH values ranging from 40 to 90 %. Water-uptake measurements were accompanied by HGF and RH-dependent thermodynamic equilibrium calculations using the Aerosol Inorganic-Organic Mixtures Functional groups Activity Coefficients (AIOMFAC) model. In addition, AIOMFAC-predicted growth curves are compared to several simplified HGF modeling approaches: (1) representing particles as ideal, well-mixed liquids; (2) forcing a single phase but accounting for non-ideal interactions through activity coefficient calculations; and (3) a Zdanovskii-Stokes-Robinson-like calculation in which complete separation of the inorganic and organic components is assumed at all RH values, with water uptake treated separately in each of the individual phases. We observed variability in the characteristics of measured hygroscopic growth curves across aerosol systems with differing phase behaviors, with growth curves approaching smoother, more continuous water uptake with decreasing prevalence of liquid-liquid phase separation and increasing oxygen:carbon ratios of the organic aerosol components. We also observed indirect evidence for the dehydration-induced formation of highly viscous semi-solid phases and for kinetic limitations to the crystallization of ammonium sulfate at low RH for sucrose-containing particles.
AIOMFAC-predicted growth curves are generally in good agreement with the HGF measurements. The performances of the simplified modeling approaches, however, differ for particles with differing phase states. This suggests that no single simplified modeling approach can be used to capture the water-uptake behavior for the diversity of particle-phase behavior expected in the atmosphere. Errors in HGFs calculated with the simplified models are of sufficient magnitude to produce substantial errors in estimates of particle optical and radiative properties, particularly for the assumption that water uptake is driven by absorptive equilibrium partitioning with ideal particle-phase mixing.
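
As a rough illustration of the third simplified approach in the abstract above, the sketch below combines per-component growth factors with the volume-additivity mixing rule commonly used for Zdanovskii-Stokes-Robinson-style estimates, in which the cube of the mixed-particle growth factor is the dry-volume-weighted sum of the cubes of the component growth factors. This is an assumed generic form, not the calculation performed in the paper; the function name and the numerical inputs are hypothetical.

```python
def zsr_growth_factor(components):
    """Mix growth factors assuming each component takes up water independently.

    components: list of (dry_volume_fraction, growth_factor) pairs, all
    evaluated at the same relative humidity. Fractions must sum to 1.
    """
    total = sum(frac for frac, _ in components)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("dry volume fractions must sum to 1")
    # Wet volumes add, so the cubes of the diameter growth factors add.
    wet_volume_ratio = sum(frac * gf ** 3 for frac, gf in components)
    return wet_volume_ratio ** (1.0 / 3.0)


# Hypothetical inputs: equal dry volumes of a strongly hygroscopic salt
# (growth factor 1.7) and a weakly hygroscopic organic (growth factor 1.1).
hgf_mixed = zsr_growth_factor([(0.5, 1.7), (0.5, 1.1)])
```

Because each component is treated in isolation, a rule of this kind ignores interactions between the organic and inorganic fractions, which is consistent with the abstract's finding that no single simplified approach works across all phase states.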



