Modern deep neural network training is typically based on mini-batch stochastic gradient optimization. While the use of large mini-batches increases the available computational parallelism, small batch training has been shown to provide improved generalization performance and allows a significantly smaller memory footprint, which might also be exploited to improve machine throughput. In this paper, we review common assumptions on learning rate scaling and training duration, as a basis for an experimental comparison of test performance for different mini-batch sizes. We adopt a learning rate that corresponds to a constant average weight update per gradient calculation (i.e., per unit cost of computation), and point out that this results in a variance of the weight updates that increases linearly with the mini-batch size m. The collected experimental results for the CIFAR-10, CIFAR-100 and ImageNet datasets show that increasing the mini-batch size progressively reduces the range of learning rates that provide stable convergence and acceptable test performance. On the other hand, small mini-batch sizes provide more up-to-date gradient calculations, which yields more stable and reliable training. The best performance has been consistently obtained for mini-batch sizes between m = 2 and m = 32, which contrasts with recent work advocating the use of mini-batch sizes in the thousands.
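The learning rate scaling described in this abstract can be illustrated with a small Monte Carlo sketch. It assumes hypothetical i.i.d. scalar per-example gradients with a fixed mean and standard deviation, which is a strong simplification of real training; the names `simulate_updates`, `base_lr`, `grad_mean` and `grad_std` are illustrative and do not come from the paper. Under these assumptions, scaling the learning rate linearly with m keeps the mean weight update per gradient calculation constant, while the variance of each update grows roughly in proportion to m.

```python
import numpy as np

def simulate_updates(m, base_lr=0.01, grad_mean=1.0, grad_std=2.0,
                     trials=100_000, seed=0):
    """Monte Carlo estimate of the mean and variance of one SGD weight update."""
    rng = np.random.default_rng(seed)
    # Hypothetical per-example gradients: i.i.d. scalars, purely for illustration.
    grads = rng.normal(grad_mean, grad_std, size=(trials, m))
    lr = base_lr * m                      # learning rate scaled linearly with m
    updates = -lr * grads.mean(axis=1)    # equals -base_lr * (sum of the m gradients)
    return updates.mean(), updates.var()

if __name__ == "__main__":
    for m in (1, 2, 8, 32):
        mean, var = simulate_updates(m)
        # mean / m is the average update per gradient calculation.
        print(f"m={m:3d}  mean per gradient={mean / m:+.5f}  update variance={var:.6f}")
```

In this toy setting the analytic variance is `base_lr**2 * m * grad_std**2`, so the printed variance should grow close to linearly in m while the per-gradient mean stays near `-base_lr * grad_mean`.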
Department of Aerospace Engineering, University of Bristol
A comprehensive review of aerofoil shape parameterisation methods that can be used for aerodynamic shape optimisation is presented. Seven parameterisation methods are considered for a range of design variables: CSTs; B-Splines; Hicks-Henne bump functions; a Radial Basis Function (RBF) domain element approach; Bézier surfaces; a singular value decomposition modal extraction method (SVD); and the PARSEC method. Due to the large range of variables involved, the most effective way to implement each method is first investigated. Their performance is then analysed by considering the geometric shape recovery of over 2000 aerofoils using a range of design variables, testing the efficiency of design space coverage with respect to a given tolerance. It is shown that, for all the methods, between 20 and 25 design variables are needed to cover the full design space to within a geometric tolerance, with the SVD method doing this most efficiently. A set of transonic aerofoil case studies is also presented, with geometric error and convergence of the resulting aerodynamic properties explored. These results show a strong relationship between geometric error and aerodynamic convergence and demonstrate that between 38 and 66 design variables may be needed to ensure aerodynamic convergence to within one drag and one lift count.
This paper presents a review of aerofoil shape parameterisation methods that can be used for aerodynamic shape optimisation. Six parameterisation methods are considered for a range of design variables: Class function/Shape function Transformations (CST); B-splines; Hicks-Henne bump functions; a domain element approach using Radial Basis Functions (RBF); Bézier surfaces; and a singular value decomposition (SVD) modal extraction method; plus the PARSEC method. The performance of each method is analysed by considering geometric shape recovery on over 1000 aerofoils using a range of design variables, testing the efficiency of design space coverage. A more in-depth analysis is then presented for three aerofoils, NACA4412, RAE2822 and ONERA M6 (D section), with geometric error and convergence of the resulting aerodynamic properties presented. In the large-scale test it is shown that, for all the methods, a large number of design variables is needed to achieve significant design space coverage. For example, at least 25 design variables are needed to cover 80% of the design space regardless of the method used; this is often higher than is desired for two-dimensional studies, suggesting that further work may be required to reduce the number of design variables needed.
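The geometric shape recovery test mentioned in the aerofoil abstracts above can be sketched for a single method and a single aerofoil. The example below is only an illustration under stated assumptions, not the authors' procedure: it fits a CST parameterisation (class function exponents 0.5 and 1.0 with a Bernstein polynomial shape function, the usual choice for round-nosed aerofoils) to the half-thickness of a symmetric NACA 0012 by linear least squares and reports the RMS geometric error. The target aerofoil, the RMS metric, the cosine point spacing and all function names are assumptions made for this sketch.

```python
import math
import numpy as np

def naca4_thickness(x, t=0.12):
    """Half-thickness of a symmetric NACA 4-digit aerofoil (closed trailing edge)."""
    return 5.0 * t * (0.2969 * np.sqrt(x) - 0.1260 * x - 0.3516 * x**2
                      + 0.2843 * x**3 - 0.1036 * x**4)

def cst_basis(x, n_vars, n1=0.5, n2=1.0):
    """Matrix whose columns are the class function times each Bernstein polynomial."""
    n = n_vars - 1                                  # Bernstein polynomial degree
    class_fn = x**n1 * (1.0 - x)**n2                # round nose, sharp trailing edge
    cols = [math.comb(n, i) * x**i * (1.0 - x)**(n - i) for i in range(n_vars)]
    return class_fn[:, None] * np.stack(cols, axis=1)

def recovery_rms_error(n_vars, n_points=201):
    """Least-squares CST fit to the target surface; returns the RMS geometric error."""
    beta = np.linspace(0.0, np.pi, n_points)
    x = 0.5 * (1.0 - np.cos(beta))                  # cosine spacing clusters points at the ends
    target = naca4_thickness(x)
    basis = cst_basis(x, n_vars)
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return float(np.sqrt(np.mean((basis @ coeffs - target) ** 2)))

if __name__ == "__main__":
    for n_vars in (2, 4, 6, 8):
        print(f"{n_vars} design variables: RMS error = {recovery_rms_error(n_vars):.2e}")
```

Because the Bernstein bases of increasing degree span nested polynomial spaces, the least-squares RMS error cannot increase as design variables are added, which mirrors the design space coverage trend the abstracts describe. A single smooth aerofoil is far easier to recover than a large and varied library, so the number of variables needed here says little about the figures quoted above.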
Subdivision curves are defined as the limit of a recursive application of a subdivision rule to an initial set of control points. This intrinsically provides a hierarchical set of control polygons that can be used to provide surface control at varying levels of fidelity. This work presents a shape parameterisation method based on this principle and investigates its application to aerodynamic optimisation. The subdivision curves are used to construct a multi-level aerofoil parameterisation that allows an optimisation to be initialised with a small number of design variables, with the resolution then increased periodically as the optimisation proceeds. This brings the benefits of a low-fidelity optimisation (high convergence rate, increased robustness, low-cost finite-difference gradients) while still allowing the final results to be from a high-dimensional design space. In this work the multi-level subdivision parameterisation is tested on a variety of optimisation problems and compared to a control group of single-level subdivision schemes. For all the optimisation cases, the multi-level schemes provided robust and reliable results, in contrast to the single-level methods, which often experienced difficulties with large numbers of design variables. As a result, the multi-level methods exploited the high-dimensional design spaces better and consequently produced better overall results.
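The idea of a hierarchy of control polygons produced by repeated subdivision can be sketched as follows. The abstract does not state which subdivision rule is used, so this example substitutes Chaikin's corner-cutting rule on a closed polygon purely as a stand-in; the function names and the square starting polygon are hypothetical.

```python
import numpy as np

def chaikin_closed(points):
    """One round of Chaikin corner cutting on a closed control polygon."""
    p = np.asarray(points, dtype=float)
    nxt = np.roll(p, -1, axis=0)          # next vertex, wrapping around at the end
    q = 0.75 * p + 0.25 * nxt             # point a quarter of the way along each edge
    r = 0.25 * p + 0.75 * nxt             # point three quarters of the way along each edge
    out = np.empty((2 * len(p), p.shape[1]))
    out[0::2] = q
    out[1::2] = r
    return out

def control_hierarchy(points, levels):
    """Control polygons from coarsest (level 0) to finest (level `levels`)."""
    hierarchy = [np.asarray(points, dtype=float)]
    for _ in range(levels):
        hierarchy.append(chaikin_closed(hierarchy[-1]))
    return hierarchy

if __name__ == "__main__":
    square = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    for level, poly in enumerate(control_hierarchy(square, 3)):
        print(f"level {level}: {len(poly)} control points")
```

Each level doubles the number of control points, so in a multi-level setting an optimiser could start by moving the few points of a coarse level and later switch to a finer level of the same curve. How the paper maps design variables between levels is not described in the abstract and is not modelled here.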