In vitro treatments with ceftriaxone promote elimination of mutant glial fibrillary acidic protein and transcription down-regulation

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape. Local extrema with low generalization error have a large proportion of almost-zero eigenvalues in the Hessian with very few positive or negative eigenvalues. We leverage this observation to construct a local-entropy-based objective function that favors well-generalizable solutions lying in large flat regions of the energy landscape, while avoiding poorly-generalizable solutions located in the sharp valleys. Conceptually, our algorithm resembles two nested loops of SGD where we use Langevin dynamics in the inner loop to compute the gradient of the local entropy before each update of the weights. We show that the new objective has a smoother energy landscape and show improved generalization over SGD using uniform stability, under certain assumptions. Our experiments on convolutional and recurrent networks demonstrate that Entropy-SGD compares favorably to state-of-the-art techniques in terms of generalization error and training time.
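The nested-loop structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional parameter, a full (non-stochastic) gradient, and the commonly cited form of the update in which the inner Langevin loop estimates a local average `mu` and the outer step moves `x` toward it. The function name `entropy_sgd` and all hyperparameter names and defaults are hypothetical.

```python
import math
import random


def entropy_sgd(grad_f, x0, outer_steps=100, inner_steps=20,
                eta=0.5, eta_inner=0.1, gamma=1.0, noise=0.01,
                alpha=0.25, seed=0):
    """Toy 1-D sketch of two nested loops (all names are assumptions).

    grad_f : gradient of the loss f at a point
    gamma  : coupling strength ("scope") tying the inner iterate to x
    noise  : scale of the Gaussian noise in the Langevin step
    alpha  : weight of the exponential moving average that estimates mu
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(outer_steps):
        x_prime = x   # inner iterate starts at the current weights
        mu = x        # running estimate of the local average
        for _ in range(inner_steps):
            # Langevin step on f(x') + (gamma / 2) * (x - x')**2
            drift = grad_f(x_prime) - gamma * (x - x_prime)
            x_prime = (x_prime - eta_inner * drift
                       + math.sqrt(eta_inner) * noise * rng.gauss(0.0, 1.0))
            mu = (1.0 - alpha) * mu + alpha * x_prime
        # Outer step: gamma * (x - mu) approximates the gradient of the
        # negative local entropy at x.
        x = x - eta * gamma * (x - mu)
    return x


# Usage on the toy loss f(x) = x**2 / 2, whose gradient is x.
x_final = entropy_sgd(lambda x: x, x0=3.0)
```

On this toy quadratic the iterate contracts toward the minimum at zero; the sketch says nothing about the generalization behavior claimed for real networks.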

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

Chaudhari¹,

Choromanska²,

Soatto³

et al. 2016

Preprint

132

VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving

Choromanska²,

et al. 2018

This paper proposes a new method, which we call VisualBackProp, for visualizing which sets of pixels of the input image contribute most to the predictions made by the convolutional neural network (CNN). The method heavily hinges on exploring the intuition that the feature maps contain less and less information irrelevant to the prediction decision when moving deeper into the network. The technique we propose was developed as a debugging tool for CNN-based systems for steering self-driving cars and is therefore required to run in real time, i.e., it was designed to require fewer computations than a forward propagation. This makes the presented visualization method a valuable debugging tool which can be easily used during both training and inference. We furthermore justify our approach with theoretical arguments and theoretically confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction. Our theoretical findings stand in agreement with the experimental results. The empirical evaluation shows the plausibility of the proposed approach on the road video data as well as in other applications and reveals that it compares favorably to the layer-wise relevance propagation approach, i.e., it obtains similar visualization results and simultaneously achieves order-of-magnitude speed-ups.

Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car

Bojarski¹,

Yeres²,

Choromanska³

et al. 2017

Preprint

100

The Loss Surfaces of Multilayer Networks

Choromanska¹,

Henaff²,

Mathieu³

et al. 2014

Preprint

We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band diminishes exponentially with the size of the network. We empirically verify that the mathematical model behaves similarly to the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band of low critical points, and that all critical points found there are local minima of high quality measured by the test error. This emphasizes a major difference between large- and small-size networks where for the latter poor-quality local minima have nonzero probability of being recovered. Finally, we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant as the global minimum often leads to overfitting.

Towards Automated Melanoma Detection With Deep Learning: Data Purification and Augmentation

Bisla

Choromanska

Berman

et al. 2019

Melanoma is one of the ten most common cancers in the US. Early detection is crucial for survival, but often the cancer is diagnosed in the fatal stage. Deep learning has the potential to improve cancer detection rates, but its applicability to melanoma detection is compromised by the limitations of the available skin lesion databases, which are small, heavily imbalanced, and contain images with occlusions. We build deep-learning-based tools for data purification and augmentation to counteract these limitations. The developed tools can be utilized in a deep learning system for lesion classification and we show how to build such a system. The system heavily relies on the processing unit for removing image occlusions and the data generation unit, based on generative adversarial networks, for populating scarce lesion classes, or equivalently creating virtual patients with pre-defined types of lesions. We empirically verify our approach and show that incorporating these two units into a melanoma detection system results in superior performance over common baselines.

Sensor modality fusion with CNNs for UGV autonomous driving in indoor environments

Patel¹,

Choromanska²,

Krishnamurthy³

et al. 2017

Automatic Reconstruction of Neural Morphologies with Multi-Scale Tracking

Choromanska¹,

Chang²,

Yuste³

2012

Front. Neural Circuits

Neurons have complex axonal and dendritic morphologies that are the structural building blocks of neural circuits. The traditional method to capture these morphological structures using manual reconstructions is time-consuming and partly subjective, so it appears important to develop automatic or semi-automatic methods to reconstruct neurons. Here we introduce a fast algorithm for tracking neural morphologies in 3D with simultaneous detection of branching processes. The method is based on existing tracking procedures, adding the machine vision technique of multi-scaling. Starting from a seed point, our algorithm tracks axonal or dendritic arbors within a sphere of a variable radius, then moves the sphere center to the point on its surface with the shortest Dijkstra path, detects branching points on the surface of the sphere, scales it until branches are well separated and then continues tracking each branch. We evaluate the performance of our algorithm on preprocessed data stacks obtained by manual reconstructions of neural cells, corrupted with different levels of artificial noise, and unprocessed data sets, achieving 90% precision and 81% recall in branch detection. We also discuss limitations of our method, such as reconstructing highly overlapping neural processes, and suggest possible improvements. Multi-scaling techniques, well suited to detect branching structures, appear to be a promising strategy for automatic neuronal reconstructions.
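The step of moving the sphere center to the surface point with the shortest Dijkstra path can be sketched as follows. This is an illustrative simplification, not the published method: it assumes a 2D grid instead of a 3D stack, 4-connected moves, a per-pixel entry cost (low on bright neurite pixels), and a boundary defined as the outermost one-pixel ring of the disk. The name `next_sphere_center` and the cost grid are hypothetical.

```python
import heapq
import math


def next_sphere_center(cost, seed, radius):
    """Return the boundary cell of the disk around `seed` that has the
    cheapest Dijkstra path from `seed` (2D stand-in for the 3D sphere)."""
    rows, cols = len(cost), len(cost[0])

    def offset(r, c):
        return math.hypot(r - seed[0], c - seed[1])

    dist = {seed: 0.0}
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and offset(nr, nc) <= radius:
                nd = d + cost[nr][nc]  # pay the cost of entering the cell
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))

    # Boundary: reachable cells in the outermost one-pixel ring.
    boundary = [p for p in dist if offset(*p) > radius - 1]
    return min(boundary, key=lambda p: dist[p])


# Usage: a 7x7 grid with one cheap "neurite" running right from the seed.
grid = [[10.0] * 7 for _ in range(7)]
for col in (4, 5, 6):
    grid[3][col] = 1.0
new_center = next_sphere_center(grid, seed=(3, 3), radius=3)
```

In this toy grid the cheapest boundary point lies at the end of the low-cost run, so the center would advance along it; branch detection and rescaling of the sphere are not covered here.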

Anna Choromanska
