Neural Architecture Search (NAS) is a promising and rapidly evolving research area. Training a large number of neural networks requires an exceptional amount of computational power, which puts NAS out of reach for researchers who have limited or no access to high-performance clusters and supercomputers. A few benchmarks with precomputed performance of neural architectures have recently been introduced to overcome this problem and ensure reproducible experiments. However, these benchmarks cover only the computer vision domain and are therefore built from image datasets and convolution-derived architectures. In this work, we step outside the computer vision domain by leveraging the language modeling task, which is at the core of natural language processing (NLP). Our main contributions are as follows: we provide a search space of recurrent neural networks on text datasets and train 14k architectures within it; we conduct both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation; finally, we test several NAS algorithms to demonstrate how the precomputed results can be utilized. We believe that the benchmark will provide more reliable empirical findings for the community and stimulate progress in developing new NAS methods well suited for recurrent architectures.
INDEX TERMS Benchmark, natural language processing, neural architecture search, recurrent neural network.
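To illustrate how precomputed results can be used by a NAS algorithm, here is a minimal sketch of random search over a lookup table. The table is a hypothetical stand-in filled with synthetic scores; the names `make_toy_benchmark` and `random_search`, the score range, and the table layout are assumptions for illustration and do not reflect the actual benchmark's format or API.

```python
import random


def make_toy_benchmark(n_archs=1000, seed=0):
    """Build a hypothetical lookup table: architecture id -> validation score.

    Scores are synthetic (lower is better); a real benchmark would store
    metrics measured by actually training each architecture.
    """
    rng = random.Random(seed)
    return {arch_id: 80.0 + 60.0 * rng.random() for arch_id in range(n_archs)}


def random_search(benchmark, budget, seed=0):
    """Query `budget` distinct architectures and return the best one found.

    Each "evaluation" is a dictionary lookup, so no training is needed.
    """
    rng = random.Random(seed)
    candidates = rng.sample(sorted(benchmark), budget)
    best = min(candidates, key=benchmark.__getitem__)
    return best, benchmark[best]


if __name__ == "__main__":
    table = make_toy_benchmark()
    arch, score = random_search(table, budget=50)
    print(f"best architecture id: {arch}, score: {score:.2f}")
```

Because every evaluation is a lookup, the same search can be repeated with many seeds at negligible cost, which is what makes comparisons between NAS algorithms reproducible.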
Neural architecture search (NAS) aims to find the optimal architecture of a neural network for a problem or a family of problems. Evaluations of neural architectures are very time-consuming. One possible way to mitigate this issue is to use low-fidelity evaluations, namely training on part of a dataset, for fewer epochs, with fewer channels, etc. In this paper, we propose a Bayesian multi-fidelity method for neural architecture search: MF-KD. The method relies on a new approach to low-fidelity evaluations of neural architectures by training for a few epochs using knowledge distillation. Knowledge distillation adds to the loss function a term that forces a network to mimic a teacher network. We carry out experiments on CIFAR-10, CIFAR-100, and ImageNet-16-120. We show that training for a few epochs with such a modified loss function leads to a better selection of neural architectures than training for a few epochs with a logistic loss. The proposed method outperforms several state-of-the-art baselines.
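The idea of adding a teacher-mimicking term to the loss can be sketched as follows. The abstract does not give the exact form of the term, so this example assumes the common temperature-softened KL formulation; the weighting `alpha`, the `temperature`, and all function names are illustrative assumptions, not the MF-KD implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard logistic (cross-entropy) loss for one example."""
    return -math.log(softmax(logits)[label])


def distillation_term(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened output to the student's."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)


def kd_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    """Cross-entropy plus a weighted term pulling the student toward the teacher.

    The temperature**2 factor is the usual scaling for softened targets
    (an assumption here, not taken from the abstract).
    """
    ce = cross_entropy(student_logits, label)
    kd = distillation_term(student_logits, teacher_logits, temperature)
    return (1 - alpha) * ce + alpha * temperature ** 2 * kd


if __name__ == "__main__":
    student = [1.0, 0.2, -0.5]
    teacher = [2.5, 0.1, -1.0]
    print(kd_loss(student, teacher, label=0))
```

With `alpha=0` the expression reduces to the plain logistic loss, which corresponds to the baseline the abstract compares against.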
Query optimization is considered to be one of the most important challenges in database management. Existing built-in query optimizers are very complex and rely on various approximations and hand-picked rules. The rise of deep learning and deep reinforcement learning has aided many scientific and industrial fields, providing an opportunity to develop a learnable query optimizer. In this paper, we analyse and improve the state-of-the-art learned query optimizer, Neo, for the JOB benchmark on two database systems: PostgreSQL and Huawei GaussDB. We describe our methods, which are based on a combination of Neo, Tree-Transformers, auxiliary tasks, and reward weighting. Combinations of these methods improve the latency of the query execution plans found. We also conduct a thorough analysis of the resulting execution plans and devise a set of decision-based rules to indicate the cases in which the learned optimizer will outperform the built-in one. We also provide source code for the proposed methods and experiments. Finally, we outline possible directions for further improvement in this field.
The problem of increasing the accuracy of determining the orientation of a spacecraft (SC) using a system of star trackers (ST) is considered. Methods are proposed that make it possible to use a joint field of view and refine the relative position of the STs to improve the accuracy of orientation determination. The use of several star trackers increases the angle between the directions to the stars within the joint field of view, which makes it possible to reduce the condition number of the matrices used in calculating the orientation parameters. The paper develops a combinatorial method for interval estimation of the SC orientation with an arbitrary number of star trackers. To calculate the ST orientation, a linear problem of interval estimation of the orthogonal orientation matrix for a sufficiently large number of stars is solved. The orientation quaternion is determined under the condition that the corresponding orientation matrix belongs to the obtained interval estimates. The case is considered in which the a priori estimate of the mutual reference of the star trackers can have an error comparable to or greater than the error in measuring the angular coordinates of stars. When the matrices of the mutual orientation of the star trackers are specified inaccurately, the errors in the mutual orientations of the STs are added to the errors in measuring the directions to the stars, which widens the uncertainty intervals of the right-hand sides of the system of linear algebraic equations used to determine the orientation parameters. A method is proposed for refining the mutual reference of the internal coordinate systems of a pair of STs as an independent task, after which the main problem of increasing the accuracy of spacecraft orientation is solved. The developed method and algorithms for solving this complex problem are based on interval estimates of orthogonal orientation matrices. To narrow the intervals further, the orthogonality of the orientation matrices is used. The numerical simulation carried out made it possible to evaluate the advantages and disadvantages of each of the proposed methods.
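The link between the angle separating star directions and the condition number can be illustrated with a deliberately simplified case. Assume only two unit direction vectors separated by an angle θ, stacked as the columns of a 3×2 matrix M. Then MᵀM = [[1, cos θ], [cos θ, 1]] has eigenvalues 1 ± cos θ, so the 2-norm condition number of M is sqrt((1 + |cos θ|) / (1 − |cos θ|)). This two-star setup and the function name are assumptions for illustration; the matrices used in the paper are not given in the abstract.

```python
import math


def condition_number(angle_deg):
    """2-norm condition number of a 3x2 matrix whose columns are two unit
    vectors separated by `angle_deg` degrees (0 < angle_deg < 180).

    Follows from the singular values sqrt(1 + |cos|) and sqrt(1 - |cos|).
    """
    c = abs(math.cos(math.radians(angle_deg)))
    return math.sqrt((1 + c) / (1 - c))


if __name__ == "__main__":
    # Narrow separations mimic stars inside a single tracker's field of view;
    # wide separations mimic stars seen by differently pointed trackers.
    for angle in (5, 10, 30, 60, 90):
        print(f"{angle:3d} deg -> condition number {condition_number(angle):.3f}")
```

In this toy setting the condition number falls monotonically as the separation grows toward 90 degrees, where it reaches 1, which matches the qualitative claim that a wider joint field of view gives better-conditioned matrices.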