2021
DOI: 10.1615/jmachlearnmodelcomput.2021037052
Fully Convolutional Spatio-Temporal Models for Representation Learning in Plasma Science

Abstract: We have trained a fully convolutional spatio-temporal model for fast and accurate representation learning in the challenging exemplar application area of fusion energy plasma science. The onset of major disruptions is a critically important fusion energy science issue that must be resolved for advanced tokamak plasmas such as the $25B burning-plasma International Thermonuclear Experimental Reactor (ITER) experiment. While a variety of statistical methods have been used to address the problem of tokamak disrupt…

Cited by 10 publications (8 citation statements)
References 34 publications
“…As explained there, all inputs are normalized to the same order of magnitude, the variance of the input quantities is appropriately scaled, and the values submitted in real time are correlated with those trained offline. In particular, the 1D inputs are processed by a set of convolutional neural nets and then concatenated with the 0D inputs to form the input features for the long short-term memory (LSTM) network as well as for the temporal convolutional network (TCN) discussed in Reference [10]. The outputs of the LSTM and the TCN in these studies provide complementary benchmarks for the disruption score indicating proximity of the coming disruption event.…”
Section: Implementation of FRNN Deep-Learning-Based Model into DIII-D…
confidence: 99%
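The quoted passage describes a per-time-step input pipeline: 1D profile inputs go through convolutional layers, and the resulting features are concatenated with 0D scalar signals before being fed to the LSTM or TCN. A minimal NumPy sketch of that idea follows; all shapes, the kernel, and the pooling step are hypothetical illustrations, not the FRNN implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-time-step inputs: two 1D profile channels
# (e.g., electron temperature and density over 32 radial points)
# and four 0D scalar signals (e.g., plasma current).
profiles = rng.standard_normal((2, 32))   # (channels, radial points)
scalars = rng.standard_normal(4)          # 0D signals

def conv1d(x, k):
    """'Valid' 1-D convolution of a single profile channel with kernel k."""
    n = x.size - k.size + 1
    return np.array([np.dot(x[i:i + k.size], k) for i in range(n)])

kernel = rng.standard_normal(5) * 0.1

# Convolve each profile channel, apply a ReLU nonlinearity,
# and pool along the radial axis to a fixed-size feature per channel.
feat = np.array([conv1d(c, kernel) for c in profiles])
feat = np.maximum(feat, 0).mean(axis=1)   # -> (2,)

# Concatenate profile features with the 0D scalars: this combined vector
# is what a downstream LSTM / TCN would consume at each time step.
step_input = np.concatenate([feat, scalars])
print(step_input.shape)  # (6,)
```

Stacking one such vector per time step yields the (time, features) sequence that either a recurrent or a temporal-convolutional model can process.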
“…[6,7] It is particularly noteworthy that in addressing this long-standing challenge, neural networks, which have been considered for decades, [8] have recently taken a dramatic step forward with the advent of more powerful artificial intelligence approaches enabled by rapid advances in high-performance computing technology at major supercomputing centers. For example, deep learning models based on the long short-term memory (LSTM) recurrent neural network (RNN) [9] and temporal convolutional networks (TCN) [10] have achieved breakthrough results for cross-machine predictions with the aid of leadership-class high-performance computing (HPC) facilities. [11] The state-of-the-art deep learning disruption prediction models based on the Fusion Recurrent Neural Network (FRNN) [9] have been further improved.…”
Section: Introduction
confidence: 99%
“…Inference at large scale is increasingly in demand, given that for large models the inference would also be distributed [4]. For smaller models, when inference latency matters in an application, inference could also be distributed (e.g., real-time prediction of tokamak disruptions in magnetically confined thermonuclear plasma experiments [11]). Some of the limitations and bottlenecks of distributed training discussed previously also appear in distributed inference (see lines marked with Y in column I of Table 6).…”
Section: Distributed
confidence: 99%
“…The goal of such predictions is to enable interventions that mitigate or avoid the detrimental effects of such disruptions on fusion reactors. The recent state-of-the-art architecture utilizes scalar inputs, including plasma current and internal inductance, as well as 1D electron temperature and density profiles [12,8]. At each time step, the 1D profiles are processed using multichannel classical convolution, followed by concatenation with the scalar signals.…”
Section: Introduction
confidence: 99%
“…The earliest work uses a long short-term memory (LSTM) architecture to propagate information from the past in order to inform the prediction [12]. A more recent iteration uses temporal convolutional networks (TCN) to gain a much larger memory capacity than recurrent models [8]. The above summary points out two ways we can insert quantum convolution: replacing spatial convolution on the 1D profiles with quantum convolution, and replacing temporal convolution with quantum temporal convolution.…”
Section: Introduction
confidence: 99%
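The larger memory capacity attributed to TCNs above comes from causal dilated convolutions: stacking layers with exponentially growing dilation makes the receptive field grow exponentially with depth, unlike the step-by-step recurrence of an LSTM. A small NumPy sketch of the mechanism, with an arbitrary all-ones kernel and sequence length chosen only for illustration:

```python
import numpy as np

def causal_dilated_conv(x, k, dilation):
    """Causal dilated 1-D convolution: the output at time t depends only on
    x[t], x[t - dilation], ..., x[t - dilation * (len(k) - 1)] (the past)."""
    pad = dilation * (len(k) - 1)
    xp = np.concatenate([np.zeros(pad), x])  # left-pad so no future leaks in
    return np.array([
        sum(k[j] * xp[t + pad - j * dilation] for j in range(len(k)))
        for t in range(len(x))
    ])

# Feed a unit impulse through layers with dilations 1, 2, 4 and kernel size 3.
# The receptive field is 1 + (k-1) * sum(dilations) = 1 + 2*(1+2+4) = 15 steps.
x = np.zeros(32)
x[0] = 1.0
k = np.ones(3)

y = x
for d in (1, 2, 4):
    y = causal_dilated_conv(y, k, d)

print(int(np.count_nonzero(y)))  # 15: time steps influenced by the impulse
```

Doubling the number of layers would double the sum of dilations and hence roughly double the memory span, at constant cost per layer, which is the capacity argument the quoted passage makes against purely recurrent models.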