Neural Architecture Search (NAS) is a promising and rapidly evolving research area. Training a large number of neural networks requires an exceptional amount of computational power, which makes NAS unreachable for researchers with limited or no access to high-performance clusters and supercomputers. A few benchmarks with precomputed neural architecture performances have recently been introduced to overcome this problem and to ensure more reproducible experiments. However, these benchmarks cover only the computer vision domain and, thus, are built from image datasets and convolution-derived architectures. In this work, we step outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP). Our main contributions are as follows: we have provided a search space of recurrent neural networks on text datasets and trained 14k architectures within it; we have conducted both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation; finally, we have tested several NAS algorithms to demonstrate how the precomputed results can be utilized. We believe that our results hold high potential for use by both the NAS and NLP communities.
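The way such precomputed results are typically utilized can be sketched in a few lines: a tabular benchmark lets a NAS algorithm replace expensive training runs with table lookups. The lookup table and scores below are made-up placeholders, not the released 14k-model data:

```python
import random

def make_toy_benchmark(n_archs=1000, seed=0):
    """Hypothetical precomputed table: architecture id -> validation perplexity."""
    rng = random.Random(seed)
    return {i: rng.uniform(60.0, 200.0) for i in range(n_archs)}

def random_search(benchmark, budget=50, seed=1):
    """Evaluate `budget` random architectures, each via a cheap table lookup."""
    rng = random.Random(seed)
    sampled = rng.sample(sorted(benchmark), budget)
    best = min(sampled, key=benchmark.get)
    return best, benchmark[best]
```

Because every architecture's score is precomputed, even search algorithms that would normally need thousands of GPU-hours can be compared in seconds on a laptop.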
A self-consistent model of plasma polarization around an isolated micron-sized dust particle under the action of an external electric field is presented. It is shown that the quasineutrality condition is fulfilled and that the formed volume charge totally screens the dust particle. Ion focusing and wake formation behind the dust particle are demonstrated for different ion mean free paths and external electric fields. It is found that at low values of the external electric field, the trapped ions play the main role in screening the dust particle charge. At high external electric fields, the density of trapped ions decreases and the dust particle is screened mainly by free ions.
In this paper, a self-consistent numerical model that describes the behavior of plasma around an isolated, highly charged dust particle is presented. Using the developed model, self-consistent distributions of the space charge density and plasma potential in the presence of an external electric field are obtained. These distributions are thoroughly analysed through Legendre decomposition. For different dusty plasma parameters, such as the radius of the dust particle, the amplitude of the external field, and the mean free path of ions, the dipole moment of the ion cloud surrounding the dust particle is calculated. It turns out that the dependencies of the dipole moment on the external electric field obtained for different parameters collapse onto a single curve under simple scaling.
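The Legendre analysis referred to above can be written out explicitly; the notation below is a standard sketch under the assumption that the external field defines the polar axis, not the paper's own formulas. The potential and charge density are expanded in Legendre polynomials, and the dipole moment of the ion cloud follows from the $l = 1$ harmonic alone:

```latex
% Sketch (assumed notation): expansion in Legendre polynomials P_l,
% with \mu = \cos\theta and the external field along the z axis.
\begin{align*}
  \varphi(r,\theta) &= \sum_{l=0}^{\infty} \varphi_l(r)\, P_l(\cos\theta), &
  \varphi_l(r) &= \frac{2l+1}{2} \int_{-1}^{1} \varphi(r,\mu)\, P_l(\mu)\, d\mu, \\
  \rho(r,\theta)    &= \sum_{l=0}^{\infty} \rho_l(r)\, P_l(\cos\theta), &
  p_z &= \int \rho\, r\cos\theta \, d^3r
       = \frac{4\pi}{3} \int_0^{\infty} r^3 \rho_1(r)\, dr,
\end{align*}
```

where the last equality uses $\int_{-1}^{1} P_l(\mu)\,\mu\, d\mu = \tfrac{2}{3}\,\delta_{l1}$, so only the first anisotropic harmonic of the charge density contributes to the dipole moment.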
A model of the positive column of a DC glow discharge in argon with dust particles is presented. The model is based on the solution of the non-local Boltzmann equation for the electron energy distribution function (EEDF), the drift-diffusion equations for ions, and the Poisson equation for the self-consistent electric field. The radial distribution of the dust particle density in a dust cloud is calculated according to an equilibrium Boltzmann distribution in the electrostatic field and under the action of the ion drag force. It is shown that if the ion drag force exceeds the electrostatic force in the center of the discharge tube, a dust-free region (void) forms in the central part of the dust cloud. Otherwise (for a smaller ion drag force), the dust cloud forms around the axis of the discharge tube without a void. In both cases, within the dust cloud the ionization and recombination rates become almost equal to each other, and the radial component of the electric field is expelled from the dust cloud.
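The equilibrium condition described above (a Boltzmann distribution of dust in the combined electrostatic and ion-drag force field) can be sketched numerically. Everything below — the force profiles, the grain charge `Q_d`, and the effective dust temperature `T_d` — is an illustrative placeholder, not the paper's model:

```python
import numpy as np

def dust_profile(r, E_r, F_drag, Q_d, T_d):
    """Equilibrium Boltzmann density of dust grains (normalized, arbitrary units).

    r      : radial grid (m), starting at the axis
    E_r    : radial electric field on the grid (V/m), positive outward
    F_drag : outward ion drag force on a grain (N)
    Q_d    : dust grain charge (C), negative for typical grains
    T_d    : effective dust temperature (J)
    """
    # Net radial force on a grain: electrostatic Q_d*E_r plus ion drag.
    F_net = Q_d * E_r + F_drag
    # Potential energy is minus the work of the net force from the axis,
    # accumulated with the trapezoid rule.
    U = -np.concatenate(([0.0], np.cumsum(0.5 * (F_net[1:] + F_net[:-1]) * np.diff(r))))
    n = np.exp(-(U - U.min()) / T_d)
    return n / n.max()
```

With a drag force that dominates near the axis, the density minimum appears at the center, reproducing the void scenario; with no drag, the negatively charged cloud simply peaks on the axis where the confining electrostatic energy is lowest.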
Neural architecture search (NAS) aims to find the optimal architecture of a neural network for a problem or a family of problems. Evaluations of neural architectures are very time-consuming. One possible way to mitigate this issue is to use low-fidelity evaluations, namely training on part of a dataset, for fewer epochs, with fewer channels, etc. In this paper, we propose to improve low-fidelity evaluations of neural architectures by using knowledge distillation. Knowledge distillation adds a term to the loss function that forces the network to mimic a teacher network. We carry out experiments on CIFAR-100 and ImageNet and study various knowledge distillation methods. We show that training on a small part of a dataset with such a modified loss function leads to a better selection of neural architectures than training with a logistic loss. The proposed low-fidelity evaluations were incorporated into a multi-fidelity search algorithm that outperformed search based on high-fidelity evaluations only (training on the full dataset).
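The distillation term described above can be sketched as follows. This is the generic soft-target formulation (in the style of Hinton et al.), with made-up temperature and weighting values, not necessarily the exact loss variants studied in the paper:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer distributions.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """(1 - alpha) * cross-entropy with hard labels
    + alpha * T^2 * KL(teacher || student) on temperature-softened outputs."""
    n = len(labels)
    # Hard-label cross-entropy at T = 1.
    p_student = softmax(student_logits, T=1.0)
    ce = -np.mean(np.log(p_student[np.arange(n), labels] + 1e-12))
    # Soft-target term: the student mimics the softened teacher distribution.
    ps_T = softmax(student_logits, T=T)
    pt_T = softmax(teacher_logits, T=T)
    kl = np.mean(np.sum(pt_T * (np.log(pt_T + 1e-12) - np.log(ps_T + 1e-12)), axis=-1))
    return (1.0 - alpha) * ce + alpha * (T ** 2) * kl
```

The `T ** 2` factor is the usual correction that keeps the soft-target gradient magnitude comparable to the hard-label term as the temperature changes.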