In this paper, we propose a cross-layer, energy-efficient resource allocation and remote radio head (RRH) selection algorithm for heterogeneous traffic in power domain non-orthogonal multiple access (PD-NOMA) based heterogeneous cloud radio access networks (H-CRANs). The main aim is to maximize the energy efficiency (EE) of the elastic users subject to the average delay constraint of the streaming users and to constraints on RRH selection, subcarrier assignment, transmit power, and successive interference cancellation. The resulting optimization problem is non-convex, NP-hard, and intractable. To solve it, we first transform the fractional objective function into a subtractive form and then apply a successive convex approximation approach. Moreover, to increase the processing speed, we introduce a low-complexity framework that accelerates the successive convex approximation with the Lagrangian method on a graphics processing unit. Furthermore, to show the optimality gap of the proposed successive convex approximation approach, we also solve the optimization problem with an optimal method based on monotonic optimization. A study of different scenarios shows that using the PD-NOMA technique together with H-CRAN improves the system energy efficiency.

Index Terms-Heterogeneous traffic, PD-NOMA, remote radio head selection, graphics processing unit.

(RRHs) where one of the RRHs is a high power node (HPN) and the others are low power nodes (LPNs). Instead of distributing the processing across the base stations (BSs) as in the HCN, centralized signal processing is applied in the BBU pool, which reduces the manufacturing and operating costs. Moreover, centralized signal processing permits cooperation between different RRHs, which improves spectrum efficiency and link reliability. The RRHs compress the signals received from the users and forward them to the BBU pool via high-bandwidth, low-latency fiber links [2].
Therefore, H-CRANs improve the users' quality of service (QoS) and the spectral efficiency (SE) of the system, and increase the flexibility of the network architecture. Moreover, H-CRANs decrease the power consumption of the system, and the PD-NOMA technique improves the system throughput, SE, and energy efficiency (EE) of fifth generation (5G) cellular communication systems. To obtain the advantages of H-CRAN and PD-NOMA at the same time, we consider a PD-NOMA based H-CRAN system.

Due to the enormous increase in mobile data traffic and the complexity of the proposed technologies, including PD-NOMA and H-CRAN, a large amount of computational processing is needed, which conventional methods cannot handle. Therefore, we seek a new processing method that reduces the processing time. A graphics processing unit (GPU), owing to its massive number of cores and its parallelism directives, handles workloads with parallel data [3]-[7]. Accelerating applications and simulations using GPUs has become increasingly popular since 2006 [8]. OpenACC is an open GPU d...