Stanisław Jastrzębski scite author profile

We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). Our network achieves an AUC of 0.895 in predicting whether there is a cancer in the breast, when tested on the screening population. We attribute the high accuracy of our model to a two-stage training procedure, which allows us to use a very high-capacity patch-level network to learn from pixel-level labels alongside a network learning from macroscopic breast-level labels. To validate our model, we conducted a reader study with 14 readers, each reading 720 screening mammogram exams, and find our model to be as accurate as experienced radiologists when presented with the same data. Finally, we show that a hybrid model, averaging probability of malignancy predicted by a radiologist with a prediction of our neural network, is more accurate than either of the two separately. To better understand our results, we conduct a thorough analysis of our network's performance on different subpopulations of the screening population, model design, training procedure, errors, and properties of its internal representations.

deep learning | deep convolutional neural networks | breast cancer screening | mammography

Breast cancer is the second leading cancer-related cause of death among women in the US. In 2014, over 39 million screening and diagnostic mammography exams were performed in the US. It is estimated that in 2015 232,000 women were diagnosed with breast cancer and approximately 40,000 died from it (1). Although mammography is the only imaging test that has reduced breast cancer mortality (2-4), there has been discussion regarding the potential harms of screening, including false positive recalls and associated false positive biopsies. The vast majority of the 10-15% of women asked to return following an inconclusive screening mammogram undergo another mammogram and/or ultrasound for clarification. After the additional imaging exams, many of these findings are determined as benign and only 10-20% are recommended to undergo a needle biopsy for further work-up. Among these, only 20-40% yield a diagnosis of cancer (5). Evidently, there is an unmet need to shift the balance of routine breast cancer screening towards more benefit and less harm.

Traditional computer-aided detection (CAD) in mammography is routinely used by radiologists to assist with image interpretation, despite multicenter studies showing these CAD programs do not improve their diagnostic performance (6). These CAD programs typically use handcrafted features to mark sites on a mammogram that appear distinct from normal tissue structures. The radiologist decides whether to recall these findings, determining clinical significance and actionability. Recent developments in deep learning (7), in particular deep convolutional neural networks (CNNs) (8-12), open possibilities for creating a new generation of CAD-like tools.

This paper makes several contributions. Primarily, we train and evaluate a set of stro...
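The hybrid model mentioned in the abstract above averages a radiologist's predicted probability of malignancy with the network's prediction, and accuracy is reported as AUC. The following is a minimal sketch of that idea, not the authors' code: the function names `auc` and `hybrid`, the equal weighting, and all numbers are assumptions made up for illustration. AUC is computed here as the fraction of positive/negative pairs ranked correctly.

```python
def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def hybrid(reader_probs, model_probs, weight=0.5):
    """Weighted average of reader and model probabilities (hypothetical helper)."""
    return [weight * r + (1.0 - weight) * m
            for r, m in zip(reader_probs, model_probs)]


# Toy values invented for illustration only; they are not data from the paper.
labels = [1, 1, 0, 0, 0]
reader = [0.9, 0.3, 0.4, 0.1, 0.2]
model = [0.4, 0.8, 0.2, 0.5, 0.1]

print("reader AUC:", auc(labels, reader))
print("model AUC: ", auc(labels, model))
print("hybrid AUC:", auc(labels, hybrid(reader, model)))
```

In this toy case the reader and the model each misrank a different exam, so averaging repairs both errors. Whether averaging helps on real data depends on how correlated the two sets of errors are.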

Molecule Edit Graph Attention Network: Modeling Chemical Reactions as Sequences of Graph Edits

Sacha¹,

Błaż²,

Byrski³

et al. 2021

J. Chem. Inf. Model.

The central challenge in automated synthesis planning is to be able to generate and predict outcomes of a diverse set of chemical reactions. In particular, in many cases, the most likely synthesis pathway cannot be applied due to additional constraints, which requires proposing alternative chemical reactions. With this in mind, we present Molecule Edit Graph Attention Network (MEGAN), an end-to-end encoder-decoder neural model. MEGAN is inspired by models that express a chemical reaction as a sequence of graph edits, akin to the arrow pushing formalism. We extend this model to retrosynthesis prediction (predicting substrates given the product of a chemical reaction) and scale it up to large data sets. We argue that representing the reaction as a sequence of edits enables MEGAN to efficiently explore the space of plausible chemical reactions, maintaining the flexibility of modeling the reaction in an end-to-end fashion and achieving state-of-the-art accuracy in standard benchmarks. Code and trained models are made available online at https://github.com/molecule-one/megan.

An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department

Shamout

Shen

et al. 2021

npj Digit. Med.

During the coronavirus disease 2019 (COVID-19) pandemic, rapid and accurate triage of patients at the emergency department is critical to inform decision-making. We propose a data-driven approach for automatic prediction of deterioration risk using a deep neural network that learns from chest X-ray images and a gradient boosting model that learns from routine clinical variables. Our AI prognosis system, trained using data from 3661 patients, achieves an area under the receiver operating characteristic curve (AUC) of 0.786 (95% CI: 0.745–0.830) when predicting deterioration within 96 hours. The deep neural network extracts informative areas of chest X-ray images to assist clinicians in interpreting the predictions and performs comparably to two radiologists in a reader study. In order to verify performance in a real clinical setting, we silently deployed a preliminary version of the deep neural network at New York University Langone Health during the first wave of the pandemic, which produced accurate predictions in real-time. In summary, our findings demonstrate the potential of the proposed system for assisting front-line physicians in the triage of COVID-19 patients.

Osprey: Hyperparameter Optimization for Machine Learning

McGibbon¹,

Hernández²,

Harrigan³

et al. 2016

JOSS

Osprey is a tool for hyperparameter optimization of machine learning algorithms in Python. Hyperparameter optimization can often be an onerous process for researchers, due to time-consuming experimental replicates, non-convex objective functions, and constant tension between exploration of global parameter space and local optimization (Jones, Schonlau, and Welch 1998). We've designed Osprey to provide scientists with a practical, easy-to-use way of finding optimal model parameters. The software works seamlessly with scikit-learn estimators (Pedregosa et al. 2011) and supports many different search strategies for choosing the next set of parameters with which to evaluate a given model, including Gaussian processes (GPy 2012), tree-structured Parzen estimators (Yamins, Tax, and Bergstra 2013), as well as random and grid search. As hyperparameter optimization is an embarrassingly parallel problem, Osprey can easily scale to hundreds of concurrent processes by executing a simple command-line program multiple times. This makes it easy to exploit large resources available in high-performance computing environments. Osprey is actively maintained by researchers at Stanford University and other institutions around the world. While originally developed to analyze computational protein dynamics (McGibbon, Harrigan, et al. 2016), it is applicable to any scikit-learn-compatible pipeline. The source code for Osprey is hosted on GitHub and has been archived to Zenodo (McGibbon, Hernández, et al. 2016). Full documentation can be found at http://msmbuilder.org/osprey.
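Random search, one of the strategies named in the abstract above, can be sketched in a few lines of plain Python. This is an illustration of the general technique only and does not use or reproduce Osprey's API: `random_search`, `toy_objective`, and the search space are hypothetical names and values, and the objective stands in for a cross-validated model score.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample parameters uniformly from `space` and keep the best-scoring set.

    `space` maps each parameter name to a (low, high) range. Every trial is
    independent of the others, which is why this kind of search parallelizes
    so easily.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Stand-in for a model score; its maximum of 0 is at x=2, y=-1."""
    return -((params["x"] - 2.0) ** 2) - (params["y"] + 1.0) ** 2


space = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}
best_params, best_score = random_search(toy_objective, space, n_trials=200, seed=0)
print(best_params, best_score)
```

Because trials do not share state, the loop body could be run by many independent workers, which is the property the abstract refers to as embarrassingly parallel.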

Residual Connections Encourage Iterative Inference

Jastrzębski¹,

Arpit²,

Ballas³

et al. 2017

Preprint

Residual networks (Resnets) have become a prominent architecture in deep learning. However, a comprehensive understanding of Resnets is still a topic of ongoing research. A recent view argues that Resnets perform iterative refinement of features. We attempt to further expose properties of this aspect. To this end, we study Resnets both analytically and empirically. We formalize the notion of iterative refinement in Resnets by showing that residual connections naturally encourage features of residual blocks to move along the negative gradient of loss as we go from one block to the next. In addition, our empirical analysis suggests that Resnets are able to perform both representation learning and iterative refinement. In general, a Resnet block tends to concentrate representation learning behavior in the first few layers while higher layers perform iterative refinement of features. Finally, we observe that sharing residual layers naively leads to representation explosion and, counterintuitively, overfitting, and we show that simple existing strategies can help alleviate this problem.
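The claim that residual connections push features along the negative gradient of the loss can be motivated with a first-order Taylor expansion. The sketch below is an informal reading of the abstract, not the paper's own derivation; the symbols $h_i$, $F_i$, and $\mathcal{L}$ are assumed notation that the abstract does not define.

```latex
% Assumed notation: h_i is the input to residual block i, F_i is the block's
% residual function, and \mathcal{L} is the loss viewed as a function of the
% features. The align environment requires the amsmath package.
\begin{align}
h_{i+1} &= h_i + F_i(h_i), \\
\mathcal{L}(h_{i+1}) &= \mathcal{L}\bigl(h_i + F_i(h_i)\bigr)
  \approx \mathcal{L}(h_i)
  + F_i(h_i)^{\top} \frac{\partial \mathcal{L}(h_i)}{\partial h_i}.
\end{align}
```

Under this approximation, which is reasonable only when $F_i(h_i)$ is small relative to $h_i$, the loss decreases from one block to the next when $F_i(h_i)$ has a negative inner product with the gradient $\partial \mathcal{L}(h_i) / \partial h_i$. Each block then acts like a small descent step on the features.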

Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

Phang

Park

et al. 2019

Preprint

Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio

Jastrzębski

Kenton

Arpit

et al. 2018

Emulating Docking Results Using a Deep Neural Network: A New Perspective for Virtual Screening

Jastrzębski

Szymczak

Pocha

et al. 2020

J. Chem. Inf. Model.

Docking is one of the most important steps in virtual screening pipelines, and it is an established method for examining potential interactions between ligands and receptors. However, this method is computationally expensive, and it is often among the last steps of the compound library evaluation process. In this work, we investigate the feasibility of training a deep neural network to predict the docking output directly from a two-dimensional compound structure. The developed protocol is orders of magnitude faster than typical docking software, and it returns ligand-receptor complexes encoded in the form of the interaction fingerprint. Its speed and efficiency unlock applications such as screening compound libraries of vast size on the basis of contact patterns or docking score (derived on the basis of predicted interaction schemes). We tested our approach on several G protein-coupled receptor targets and 4 CYP enzymes in retrospective virtual screening experiments, and a variant of graph convolutional network appeared to be most effective in emulating docking results. The method can be easily used by the community based on the code available in the Supporting Information.
