Model-based plasma scenario development lies at the heart of the design and
operation of future fusion powerplants. Including turbulent transport in integrated
models is essential for delivering a successful roadmap towards operation of ITER
and the design of DEMO-class devices. Because integrated models are highly iterative,
fast machine-learning-based surrogates of turbulent transport are needed to deliver the
faster simulations that open up pulse design, optimization, and flight simulator
applications. A significant bottleneck is the generation of suitably
large training datasets covering a large volume in parameter space, which can be
prohibitively expensive to obtain for higher fidelity codes.
In this work, we propose ADEPT (Active Deep Ensembles for Plasma Turbulence),
a physics-informed, two-stage Active Learning strategy to ease this challenge. Active
Learning queries a given model by means of an acquisition function that identifies
regions where additional data would improve the surrogate model. We provide a
benchmark study using available data from the literature for the QuaLiKiz quasilinear
model. We demonstrate quantitatively that the physics-informed nature of the
proposed workflow reduces the need to perform simulations in stable regions of the
parameter space, resulting in significantly improved data efficiency compared to
non-physics-informed approaches, which treat the task as a regression problem over the whole
domain. We show a reduction of up to a factor of 20 in the training dataset size needed to
achieve the same performance as random sampling. We then validate the surrogates on
multichannel integrated modelling of ITG-dominated JET scenarios and demonstrate
that they recover the performance of QuaLiKiz to better than 10%. This matches
the performance obtained in previous work, but with two orders of magnitude fewer
training data points.
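
To make the idea of a two-stage acquisition function concrete, the following is a minimal sketch, not the ADEPT implementation. It assumes a first-stage ensemble of classifiers that outputs the probability that a candidate point is turbulently unstable, and a second-stage ensemble of regressors that predicts the flux. The function names `two_stage_acquisition` and `select_batch`, and the choice to weight the ensemble standard deviation by the mean instability probability, are illustrative assumptions rather than details given in this abstract.

```python
import numpy as np


def two_stage_acquisition(p_unstable, flux_preds):
    """Score candidate points for labelling (hypothetical scoring rule).

    p_unstable: array (n_members, n_candidates), classifier probabilities
                that each candidate is unstable (non-zero flux).
    flux_preds: array (n_members, n_candidates), regressor flux predictions.
    """
    # Stage 1: mean probability across the classifier ensemble.
    p_mean = np.mean(p_unstable, axis=0)
    # Stage 2: disagreement across the regressor ensemble.
    flux_std = np.std(flux_preds, axis=0)
    # Assumed combination: regressor uncertainty only matters where the
    # classifier expects instability, so stable regions score near zero.
    return p_mean * flux_std


def select_batch(scores, batch_size):
    """Return indices of the highest-scoring candidates, best first."""
    return np.argsort(scores)[::-1][:batch_size]


# Toy pool of three candidates and two ensemble members.
p_unstable = np.array([[0.0, 1.0, 1.0],
                       [0.0, 1.0, 1.0]])
flux_preds = np.array([[0.0, 2.0, 1.0],
                       [10.0, 2.1, 5.0]])

scores = two_stage_acquisition(p_unstable, flux_preds)
chosen = select_batch(scores, batch_size=2)
```

In this toy pool the first candidate has the largest regressor disagreement but is predicted stable, so it is not selected; this mirrors, in a simplified way, how a physics-informed first stage can avoid spending simulations in stable regions of parameter space.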