The analysis of biomolecular computer simulations has become a challenge because the amount of output data is now routinely in the terabyte range. We evaluated whether this challenge can be met by a parallel map-reduce approach, using the Dask parallel computing library for task-graph based computing coupled with our MDAnalysis Python library for the analysis of molecular dynamics (MD) simulations. We performed a representative performance evaluation, taking into account the highly heterogeneous computing environments that researchers typically work in, together with the diversity of existing file formats for MD trajectory data. We found that the underlying storage system (solid-state drives, parallel file systems, or simple spinning-platter disks) can be a deciding performance factor that leads to data ingestion becoming the primary bottleneck in the analysis workflow. However, the choice of data file format can mitigate the effect of the storage system; in particular, the commonly used Gromacs XTC trajectory format, which is highly compressed, can exhibit close-to-ideal strong scaling because it trades a decrease in global storage access load for an increase in local, per-core CPU-intensive decompression. Scaling was tested on single nodes and across multiple nodes on national and local supercomputing resources as well as typical workstations. Although very good strong scaling could be achieved on single nodes, good scaling across multiple nodes was hindered by the persistent occurrence of "stragglers", tasks that take much longer than all other tasks and whose ultimate cause could not be completely ascertained. In summary, we show that, due to the focus on interoperability in the scientific Python ecosystem, it is straightforward to implement map-reduce with Dask in MDAnalysis, and we provide an in-depth analysis of the considerations required to obtain good parallel performance on HPC resources.
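As a rough illustration of the split-apply-combine pattern described in this abstract, the sketch below distributes contiguous blocks of trajectory frames over Dask tasks with MDAnalysis; the per-frame RMSD workload, the file names, and the block count are illustrative assumptions, not the benchmark used in the paper.

```python
import numpy as np
import dask
import MDAnalysis as mda
from MDAnalysis.analysis import rms

TOPOLOGY, TRAJECTORY = "system.pdb", "system.xtc"   # placeholder file names
N_BLOCKS = 8                                         # e.g. one block per core

def rmsd_block(start, stop):
    """Per-frame RMSD to frame 0 for the frame range [start, stop)."""
    # each task opens its own Universe so that tasks stay independent
    u = mda.Universe(TOPOLOGY, TRAJECTORY)
    protein = u.select_atoms("protein")
    u.trajectory[0]
    ref = protein.positions.copy()
    return np.array([rms.rmsd(protein.positions, ref, superposition=True)
                     for ts in u.trajectory[start:stop]])

u = mda.Universe(TOPOLOGY, TRAJECTORY)
n_frames = len(u.trajectory)
# "map": one delayed task per contiguous block of frames
bounds = np.linspace(0, n_frames, N_BLOCKS + 1, dtype=int)
tasks = [dask.delayed(rmsd_block)(int(b), int(e))
         for b, e in zip(bounds[:-1], bounds[1:])]
# "reduce": gather the per-block arrays and concatenate them in frame order
rmsd_series = np.concatenate(dask.compute(*tasks))
```

Because each block re-opens the trajectory independently, the same sketch runs unchanged on a laptop's thread scheduler or on a distributed Dask cluster, which is where the storage-system effects discussed above become visible.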
Different parallel frameworks for implementing data analysis applications have been proposed by the HPC and Big Data communities. In this paper, we investigate three task-parallel frameworks, Spark, Dask, and RADICAL-Pilot, with respect to their ability to support data analytics on HPC resources, and compare them to MPI. We investigate the data analysis requirements of molecular dynamics (MD) simulations, which are significant consumers of supercomputing cycles and produce immense amounts of data: a typical large-scale MD simulation of a physical system of O(100k) atoms over microseconds can produce from O(10) GB to O(1000) GB of data. We propose and evaluate different approaches to the parallelization of a representative set of MD trajectory analysis algorithms, in particular the computation of path similarity and leaflet identification. We evaluate Spark, Dask, and RADICAL-Pilot with respect to the abstractions and runtime engine capabilities they offer to support these algorithms. We provide a conceptual basis for comparing and understanding the different frameworks, enabling users to select the optimal system for each application, and we also provide a quantitative performance analysis of the different algorithms across the three frameworks.
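To make one possible task-parallel decomposition concrete, the following sketch computes the pairwise Hausdorff distances that underlie path similarity analysis, with one task per pair of paths dispatched through Dask; the toy random paths and the use of SciPy's directed_hausdorff are illustrative assumptions, not the implementation evaluated in the paper, and the same per-pair task structure could equally be mapped onto Spark or RADICAL-Pilot.

```python
import itertools
import numpy as np
import dask
from scipy.spatial.distance import directed_hausdorff

def hausdorff(P, Q):
    """Symmetric Hausdorff distance between two paths (2-D coordinate arrays)."""
    return max(directed_hausdorff(P, Q)[0], directed_hausdorff(Q, P)[0])

# toy stand-in for paths extracted from MD trajectories:
# each path is an (n_frames, 3 * n_atoms) array of flattened coordinates
rng = np.random.default_rng(0)
paths = [rng.normal(size=(100, 30)) for _ in range(6)]

# one independent task per pair of paths -- the unit of parallel work
pairs = list(itertools.combinations(range(len(paths)), 2))
tasks = [dask.delayed(hausdorff)(paths[i], paths[j]) for i, j in pairs]
distances = dask.compute(*tasks)

# assemble the symmetric all-pairs distance matrix
D = np.zeros((len(paths), len(paths)))
for (i, j), d in zip(pairs, distances):
    D[i, j] = D[j, i] = d
```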
The performance of biomolecular molecular dynamics (MD) simulations has steadily increased on modern high performance computing (HPC) resources, but acceleration of the analysis of the output trajectories has lagged behind, so that analyzing simulations is becoming a bottleneck. To close this gap, we studied the performance of parallel trajectory analysis with MPI and the Python MDAnalysis library on three different XSEDE supercomputers, where trajectories were read from a Lustre parallel file system. Strong scaling performance was impeded by stragglers, MPI processes that were slower than the typical process. Stragglers were less prevalent for compute-bound workloads, pointing to file reading as a crucial bottleneck for scaling. However, a more complicated picture emerged in which both the computation and the ingestion of data exhibited close-to-ideal strong scaling behavior, whereas stragglers were primarily caused by either large MPI communication costs or long times to open the single shared trajectory file. We improved overall strong scaling performance either by subfiling (splitting the trajectory into separate files) or by using MPI-IO with Parallel HDF5 trajectory files. We obtained near-ideal strong scaling on up to 384 cores (16 nodes), thus reducing trajectory analysis times by two orders of magnitude compared to the serial approach.
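The following is a minimal sketch of the rank-based frame decomposition described above, assuming mpi4py together with MDAnalysis, a per-frame radius-of-gyration workload, and placeholder file names; it illustrates contiguous per-rank frame blocks read from a single shared trajectory and a gather on rank 0, not the authors' benchmark script or the subfiling/Parallel HDF5 variants.

```python
# run with, e.g., mpiexec -n 4 python analyze.py
import numpy as np
from mpi4py import MPI
import MDAnalysis as mda

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# every rank opens the same shared trajectory file
u = mda.Universe("system.pdb", "system.xtc")   # placeholder file names
protein = u.select_atoms("protein")

# contiguous block of frames assigned to this rank
bounds = np.linspace(0, len(u.trajectory), size + 1, dtype=int)
start, stop = int(bounds[rank]), int(bounds[rank + 1])

local = np.array([protein.radius_of_gyration()
                  for ts in u.trajectory[start:stop]])

# gather the per-rank result blocks on rank 0 and concatenate in rank order
blocks = comm.gather(local, root=0)
if rank == 0:
    rgyr = np.concatenate(blocks)
    print(rgyr.shape)
```

In this layout the file-open and read costs discussed in the abstract occur once per rank, which is exactly where straggler behavior would show up in timings of the per-rank blocks.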
Pin-fin arrays are known to enhance heat transfer from heated surfaces and have important industrial applications, such as increasing internal heat transfer in turbine blades or solar receivers. Several studies on the heat transfer characteristics of various pin-fin arrangements and on the effects of geometrical parameters on heat transfer have been performed in the past. The present paper aims to address the main aspects of fluid flow and heat transfer interactions in a pin-fin array with the help of high-fidelity numerical simulations, and focuses on three issues. The first is to evaluate the effect of three-dimensional flow physics, such as horseshoe vortices and periodic unsteadiness from vortex shedding, on the spatial variation of heat transfer. The second is to analyze the effect of free-end clearance in finite-height pin-fin arrays, which adds flow complexity relative to wall-bounded pin-fin arrays, and thereby to provide a comprehensive picture of the flow physics introduced by free ends. The third is to provide a general guideline for the numerical simulation of flows through pin-fin arrays by comparing simulations on reduced span-wise domains with the full multi-row pin configuration, in order to elucidate the significance of wall effects. In addition, a comparison of the flow characteristics at different stream-wise row locations is performed to establish the domain length at which self-similarity might occur with inflow/outflow conditions. All simulations are conducted for low Mach number incompressible flow with temperature as a passive scalar. The formulation assumes that variations in temperature have no effect on the fluid motion, achieved by choosing thermal boundary conditions that remain within the realistic parameter range for turbine cooling. The flow simulations use the Large Eddy Simulation methodology. Two numerical codes, one based on a Finite Volume method and the other on a Spectral Element approach, are benchmarked against each other and validated against experiments available in the literature (Ostanek and Thole, 2012).