The present paper introduces the XcalableACC (XACC) programming model, a hybrid of the XcalableMP (XMP) Partitioned Global Address Space (PGAS) language and OpenACC. XACC defines directives that let programmers mix XMP and OpenACC directives in order to develop applications for accelerator clusters with ease. Moreover, to improve the performance of stencil applications, the Omni XACC compiler provides functions that transfer a halo region in accelerator memory via Tightly Coupled Accelerators (TCA), a proprietary network for transferring data directly among accelerators. We evaluate the productivity and the performance of XACC through implementations of the HIMENO Benchmark. In terms of productivity, XACC requires less than half the source lines of code compared to a combination of Message Passing Interface (MPI) and OpenACC, which are commonly used together as a typical programming model. In terms of performance, XACC using TCA ran up to 2.7 times faster than the combination of OpenACC and MPI using GPUDirect RDMA over InfiniBand.
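The halo transfer mentioned above can be illustrated with a minimal single-process sketch. This is not XACC, OpenACC, or TCA code: it is plain Python, the function names (`split_blocks`, `exchange_halos`, `stencil_step`, `serial_step`) are hypothetical, and a 1D 3-point average stands in for the much larger 3D stencil of the HIMENO Benchmark. It only shows why each block needs copies of its neighbors' edge cells before a stencil update.

```python
def split_blocks(data, nparts):
    """Block-distribute a list over nparts pieces (length must divide evenly)."""
    n = len(data) // nparts
    return [data[i * n:(i + 1) * n] for i in range(nparts)]


def exchange_halos(blocks, boundary=0.0):
    """Pad each block with one halo cell per side, copied from its neighbors.

    In a real cluster this copy is the communication step; here it is a
    plain in-memory read from the neighboring list.
    """
    padded = []
    for r, blk in enumerate(blocks):
        left = blocks[r - 1][-1] if r > 0 else boundary
        right = blocks[r + 1][0] if r < len(blocks) - 1 else boundary
        padded.append([left] + blk + [right])
    return padded


def stencil_step(padded):
    """Apply a 3-point average to the interior cells of every padded block."""
    return [
        [(p[i - 1] + p[i] + p[i + 1]) / 3.0 for i in range(1, len(p) - 1)]
        for p in padded
    ]


def serial_step(data, boundary=0.0):
    """Reference: the same stencil applied to the undistributed array."""
    p = [boundary] + data + [boundary]
    return [(p[i - 1] + p[i] + p[i + 1]) / 3.0 for i in range(1, len(p) - 1)]


data = [float(i) for i in range(12)]
blocks = split_blocks(data, 4)
result = [x for blk in stencil_step(exchange_halos(blocks)) for x in blk]
print(result == serial_step(data))
```

The distributed result matches the serial one only because the halo cells are refreshed before the update; that refresh is the transfer that the abstract says is routed over TCA.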
To improve productivity in developing parallel applications on high performance computing systems, the XcalableMP PGAS language has been proposed. XcalableMP supports both typical parallelization under the "global-view memory model", which uses directives, and flexible parallelization under the "local-view memory model", which uses coarray features. The goal of the present paper is to clarify XcalableMP's productivity and performance. To do so, we implement and evaluate the High Performance Computing Challenge benchmarks, namely EP STREAM Triad, High Performance Linpack, Global fast Fourier transform, and RandomAccess, on the K computer using up to 16,384 compute nodes and on a generic cluster system using up to 128 compute nodes. We found that the benchmarks were easier to implement using XcalableMP than using MPI. Moreover, most of the performance results using XcalableMP were almost the same as those using MPI.
We achieved 14.9 TFLOPS when running the plasma simulation code IMPACT-3D, parallelized with High Performance Fortran (HPF), on 512 nodes of the Earth Simulator. The theoretical peak performance of the 512 nodes is 32 TFLOPS, which means that 45% of the peak performance was obtained with HPF. IMPACT-3D is an implosion analysis code using a TVD scheme, which performs three-dimensional compressible and inviscid Eulerian fluid computation with an explicit 5-point stencil scheme for spatial differentiation and a fractional time step for time integration. The mesh size is 2048x2048x4096, and the third dimension was distributed for the parallelization. The HPF system used in the evaluation is HPF/ES, developed for the Earth Simulator by enhancing NEC HPF/SX V2, mainly in communication scalability. Shift communications were manually tuned for best performance using the HPF/JA extensions, which were designed to give users more control over sophisticated parallelization and communication optimizations.
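The 45% figure can be checked with a short calculation. The sketch below assumes a per-node peak of 64 GFLOPS (8 processors at 8 GFLOPS each), which is the commonly cited Earth Simulator specification but is not stated in the abstract; the function names are hypothetical. Under that assumption the unrounded 512-node peak is slightly above the quoted 32 TFLOPS, which is why the ratio comes out near 45% rather than the 46 to 47% that 14.9/32 would suggest.

```python
# Assumption (not from the abstract): each Earth Simulator node peaks at
# 64 GFLOPS, i.e. 8 processors x 8 GFLOPS.
NODES = 512
PEAK_PER_NODE_GFLOPS = 64.0
SUSTAINED_TFLOPS = 14.9  # measured figure quoted in the abstract


def peak_tflops(nodes, per_node_gflops):
    """Aggregate theoretical peak in TFLOPS."""
    return nodes * per_node_gflops / 1000.0


def efficiency(sustained_tflops, peak):
    """Fraction of theoretical peak that was actually sustained."""
    return sustained_tflops / peak


peak = peak_tflops(NODES, PEAK_PER_NODE_GFLOPS)
eff = efficiency(SUSTAINED_TFLOPS, peak)
print(f"peak = {peak:.3f} TFLOPS, efficiency = {eff:.1%}")
```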
In this paper we report the generation of wavelength-multiplexed polarization-entangled photon pairs in the 1.5-μm communication wavelength band by using cascaded second-order optical nonlinearities (sum-frequency generation and subsequent spontaneous parametric down-conversion, c-SFG/SPDC) in a periodically poled LiNbO₃ ridge waveguide device. The c-SFG/SPDC method makes it possible to fully use the broad spectral bandwidth of SPDC in nearly frequency-degenerate conditions, and can provide more than 50 pairs of wavelength channels for the entangled photon pairs in the 1.5-μm wavelength band, using only standard optical resources in the telecom field. Visibilities higher than 98% were clearly observed in two-photon interference fringes for all the wavelength channels under investigation (eight pairs). We further performed a detailed experimental investigation of the cross-talk characteristics and the impact of detuning the pump wavelengths.