Exascale Deep Learning to Accelerate Cancer Research

Patton, Robert M.; Abousamra, Shahira; Samaras, Dimitris; Saltz, Joel; Johnston, Travis; Young, Steven R.; Schuman, Catherine D.; Potok, Thomas E.; Rose, Derek; Lim, Seung-Hwan; Chae, Junghoon; Hou, Le

doi:10.1109/bigdata47090.2019.9006467

Cited by 17 publications

(2 citation statements)

References 29 publications


“…MENNDL's built-in training termination leverages truncated training in conjunction with a dynamic early termination criterion. It ends training at 20 epochs, or earlier if loss is stable over the past 10 epochs, making this NAS implementation one of the most effective on HPC systems [1], [4]. We put PENGUIN to the test by comparing the actual walltime of training our 6,000 NNs using MENNDL with two termination scenarios: (i) with MENNDL's built-in training termination; and (ii) with PENGUIN augmenting the termination decision.…”

Section: Walltime Speedup (mentioning)

confidence: 99%
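
The termination rule quoted above (stop at 20 epochs, or earlier if the loss has been stable over the past 10 epochs) can be sketched as follows. This is a minimal illustration, not MENNDL's code: the function name `should_stop` and the tolerance `LOSS_TOLERANCE` are assumptions, because the passage does not say how "stable" is measured. Here the loss is treated as stable when the spread of the last 10 values is within the tolerance.

```python
from typing import Sequence

MAX_EPOCHS = 20        # hard cap stated in the quoted passage
STABILITY_WINDOW = 10  # "the past 10 epochs" in the quoted passage
LOSS_TOLERANCE = 1e-3  # assumption: the passage does not define "stable"


def should_stop(losses: Sequence[float],
                max_epochs: int = MAX_EPOCHS,
                window: int = STABILITY_WINDOW,
                tol: float = LOSS_TOLERANCE) -> bool:
    """Decide whether to end training, given one loss value per completed epoch."""
    # Truncated training: never run past the epoch cap.
    if len(losses) >= max_epochs:
        return True
    # Not enough history yet to judge stability.
    if len(losses) < window:
        return False
    # Assumed stability test: max-min spread of the recent losses is within tol.
    recent = losses[-window:]
    return max(recent) - min(recent) <= tol
```

A training loop would append the loss after each epoch and break as soon as `should_stop(losses)` returns `True`.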

“…Neural networks (NN) are powerful models that are increasingly used in traditional high-performance computing (HPC) scientific simulations and new research areas, such as high-performance artificial intelligence and high-throughput data analytics, to solve problems in physics [1], materials science [2], neuroscience [3], and medical imaging [4] among other domains. Finding suitable NNs is a time-consuming process involving several rounds of hyperparameter selection, training, validation, and manual inspection.…”

Section: Introduction (mentioning)

confidence: 99%


Building High-throughput Neural Architecture Search Workflows via a Decoupled Fitness Prediction Engine

Rorabaugh, Caíno‐Lores, Johnston, et al. 2022

IEEE Trans. Parallel Distrib. Syst.


Neural networks (NN) are used in high-performance computing and high-throughput analysis to extract knowledge from datasets. Neural architecture search (NAS) automates NN design by generating, training, and analyzing thousands of NNs. However, NAS requires massive computational power for NN training. To address challenges of efficiency and scalability, we propose PENGUIN, a decoupled fitness prediction engine that informs the search without interfering in it. PENGUIN uses parametric modeling to predict fitness of NNs. Existing NAS methods and parametric modeling functions can be plugged into PENGUIN to build flexible NAS workflows. Through this decoupling and flexible parametric modeling, PENGUIN reduces training costs: it predicts the fitness of NNs, enabling NAS to terminate training NNs early. Early termination increases the number of NNs that fixed compute resources can evaluate, thus giving NAS additional opportunity to find better NNs. We assess the effectiveness of our engine on 6,000 NNs across three diverse benchmark datasets and three state-of-the-art NAS implementations using the Summit supercomputer. Augmenting these NAS implementations with PENGUIN can increase throughput by a factor of 1.6 to 7.1. Furthermore, walltime tests indicate that PENGUIN can reduce training time by a factor of 2.5 to 5.3.
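
The idea of predicting fitness from a parametric model of a partial training curve can be illustrated with a small sketch. The abstract does not name the parametric functions PENGUIN uses, so the form below, fitness ≈ a − b / epoch, is purely an assumption chosen because it can be fitted with ordinary least squares; `predict_fitness` is likewise a hypothetical name, not part of PENGUIN.

```python
from typing import Sequence


def predict_fitness(epochs: Sequence[float],
                    fitness: Sequence[float],
                    target_epoch: float) -> float:
    """Fit fitness ~ a - b / epoch to observed points and extrapolate.

    Assumed functional form (not taken from the paper). Requires at least
    two distinct, positive epoch values.
    """
    # Substituting x = 1 / epoch makes the model linear: fitness = a + slope * x.
    xs = [1.0 / e for e in epochs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(fitness) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, fitness))
    slope = sxy / sxx                      # least-squares slope, equal to -b
    intercept = mean_y - slope * mean_x    # least-squares intercept, equal to a
    return intercept + slope / target_epoch
```

Under these assumptions, a search could train a network for a few epochs, call `predict_fitness` with the target epoch set to the full training budget, and stop training early when the extrapolated value is judged reliable.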

