Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it provides an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several common benchmarks.
The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU. This restriction hampers a researcher's flexibility to study different machine learning algorithms, forcing them to either use a less desirable network architecture or parallelize the processing across multiple GPUs. We propose a runtime memory manager that virtualizes the memory usage of DNNs such that both GPU and CPU memory can simultaneously be utilized for training larger DNNs. Our virtualized DNN (vDNN) reduces the average GPU memory usage of AlexNet by up to 89%, OverFeat by 91%, and GoogLeNet by 95%, a significant reduction in the memory requirements of DNNs. Similar experiments on VGG-16, one of the deepest and most memory-hungry DNNs to date, demonstrate the memory efficiency of our proposal. vDNN enables VGG-16 with batch size 256 (requiring 28 GB of memory) to be trained on a single NVIDIA Titan X GPU card containing 12 GB of memory, with 18% performance loss compared to a hypothetical, oracular GPU with enough memory to hold the entire DNN.
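The idea of keeping only part of a network's state on the GPU can be illustrated with a toy calculation. The sketch below is not the vDNN algorithm from the abstract. It assumes a purely sequential network, hypothetical feature-map sizes, and a simplified policy in which only the current layer's input and output stay on the GPU during the forward pass while earlier feature maps are parked in host memory. The names `Layer`, `peak_resident_all`, and `peak_offloaded` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    feature_map_mb: float  # hypothetical size of the layer's output feature map


def peak_resident_all(layers):
    """Baseline: every feature map stays on the GPU until the backward pass."""
    return sum(layer.feature_map_mb for layer in layers)


def peak_offloaded(layers):
    """Toy policy (an assumption, not the paper's method): during the forward
    pass only the current layer's input and output are resident on the GPU;
    older feature maps are assumed to have been copied to host memory."""
    peak = 0.0
    previous_output_mb = 0.0  # the network input is ignored in this toy model
    for layer in layers:
        resident = previous_output_mb + layer.feature_map_mb
        peak = max(peak, resident)
        previous_output_mb = layer.feature_map_mb
    return peak


# Hypothetical four-layer network; the sizes are made up for illustration.
net = [
    Layer("conv1", 400.0),
    Layer("conv2", 300.0),
    Layer("conv3", 200.0),
    Layer("fc", 100.0),
]

baseline_mb = peak_resident_all(net)
offloaded_mb = peak_offloaded(net)
print(f"baseline peak:  {baseline_mb} MB")
print(f"offloaded peak: {offloaded_mb} MB")
```

In this toy setting the peak is bounded by the largest pair of adjacent feature maps instead of their total, which hints at why offloading can let a network larger than GPU memory be trained. The cost of the host transfers, which the abstract reports as a performance loss, is not modeled here.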
A hybrid Lagrangian-Eulerian approach is used to examine the properties of water clusters formed in neon-water vapor mixtures expanding through microscale conical nozzles. Experimental size distributions were reliably determined by the sodium doping technique in a molecular beam machine. The comparison of computed size distributions and experimental data shows satisfactory agreement, especially for (H2O)n clusters with n larger than 50. The simulations thus validated provide size-selected cluster temperature profiles inside and outside the nozzle. This information is used for an in-depth analysis of the crystallization and water cluster aggregation dynamics of recently reported supersonic jet expansion experiments.
An exact relationship is developed that connects the vibrational relaxation number, ZvDSMC, used in the direct simulation Monte Carlo method with that employed in continuum simulations. An approximate expression for ZvDSMC is also derived that is cost-effective and applicable when the translational temperature is larger than the vibrational temperature.
The radiometric force on a 0.12 m circular vane is studied experimentally and numerically over a wide range of pressures that cover the flow regimes from near free molecular to near continuum. In the experiment, the vane is resistively heated to about 419 K on one side and 394 K on the other side, and immersed in a rarefied argon gas. The radiometric force is then measured on a nano-Newton thrust stand in a 3 m vacuum chamber and compared with the present numerical predictions and analytical predictions proposed by various authors. The computational modelling is conducted with a kinetic approach based on the solution of the ellipsoidal statistical Bhatnagar–Gross–Krook (ES-BGK) equation. Numerical modelling showed the importance of regions with elevated pressure observed near the edges of the vane for the radiometric force production. A simple empirical expression is proposed for the radiometric force as a function of pressure that is found to be in good agreement with the experimental data. The shear force on the lateral side of the vane was found to decrease the total radiometric force.
Gas and ion transport in the capillary-skimmer subatmospheric interface of a mass spectrometer, which is typically utilized to separate unevaporated micro-droplets from ions, was studied numerically using a two-step approach spanning multiple gas dynamic regimes. The gas flow in the heated capillary and in the interface