To support massive number of parallel thread contexts, Graphics Processing Units (GPUs) use a huge register file, which is responsible for a large fraction of GPU's total power and area. The conventional belief is that a large register file is inevitable for accommodating more parallel thread contexts, and technology scaling makes it feasible to incorporate ever increasing size of register file. In this paper, we demonstrate that the register file size need not be large to accommodate more threads context. We first characterize the useful lifetime of a register and show that register lifetimes vary drastically across various registers that are allocated to a kernel. While some registers are alive for the entire duration of the kernel execution, some registers have a short lifespan. We propose GPU register file virtualization that allows multiple warps to share physical registers. Since warps may be scheduled for execution at different points in time, we propose to proactively release dead registers from one warp and re-allocate them to a different warp that may occur later in time, thereby reducing the needless demand for physical registers. By using register virtualization, we shrink the architected register space to a smaller physical register space. By under-provisioning the physical register file to be smaller than the architected register file we reduce dynamic and static power consumption. We then develop a new register throttling mechanism to run applications that exceed the size of the under-provisioned register file without any deadlock. Our evaluation shows that even after halving the architected register file size using our proposed GPU register file virtualization applications run successfully with negligible performance overhead.
Classical computing plays a critical role in the advancement of quantum frontiers in the NISQ era. In this spirit, this work uses classical simulation to bootstrap Variational Quantum Algorithms (VQAs). VQAs rely upon the iterative optimization of a parameterized unitary circuit (ansatz) with respect to an objective function. Since quantum machines are noisy and expensive resources, it is imperative to classically choose the VQA ansatz initial parameters to be as close to optimal as possible to improve VQA accuracy and accelerate their convergence on today's devices.This work tackles the problem of finding a good ansatz initialization, by proposing CAFQA, a Clifford Ansatz For Quantum Accuracy. The CAFQA ansatz is a hardware-efficient circuit built with only Clifford gates. In this ansatz, the parameters for the tunable gates are chosen by searching efficiently through the Clifford parameter space via classical simulation. The resulting initial states always equal or outperform traditional classical initialization (e.g., Hartree-Fock), and enable high-accuracy VQA estimations. CAFQA is well-suited to classical computation because: a) Clifford-only quantum circuits can be exactly simulated classically in polynomial time, and b) the discrete Clifford space is searched efficiently via Bayesian Optimization.For the Variational Quantum Eigensolver (VQE) task of molecular ground state energy estimation (up to 18 qubits), CAFQA's Clifford Ansatz achieves a mean accuracy of nearly 99% and recovers as much as 99.99% of the molecular correlation energy that is lost in Hartree-Fock initialization. CAFQA achieves mean accuracy
Quantum information processing holds great promise for pushing beyond the current frontiers in computing. Specifically, quantum computation promises to accelerate the solving of certain problems, and there are many opportunities for innovation based on applications in chemistry, engineering, and finance. To harness the full potential of quantum computing, however, we must not only place emphasis on manufacturing better qubits, advancing our algorithms, and developing quantum software. To scale devices to the fault tolerant regime, we must refine device-level quantum control.On May 17-18, 2021, the Chicago Quantum Exchange (CQE) partnered with IBM Quantum and Super.tech to host the Pulselevel Quantum Control Workshop. At the workshop, representatives from academia, national labs, and industry addressed the importance of fine-tuning quantum processing at the physical layer. The purpose of this report is to summarize the topics of this meeting for the quantum community at large.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.