Heterogeneous networks are considered a key technology for exploiting available unused spectrum. Implementing a heterogeneous network improves energy utilization through effective spectral efficiency and higher throughput. However, heterogeneous network environments face the challenge of latency. To realize the full potential of heterogeneous networks, machine learning algorithms must be adopted for dynamic environments. In practical scenarios, reducing network latency is difficult owing to the complex nature of the network. To overcome these limitations, this paper proposes Q-learning Reinforcement Learning (QleaRL) for reducing latency. The proposed QleaRL utilizes cooperative Q-learning based on states, actions, and rewards; the optimal policy is computed from the Q-values. The performance of the proposed QleaRL is evaluated for latency, and the simulation is examined through numerical analysis. The proposed QleaRL exhibits superior performance to fixed power allocation (FPA) and tabular Q-learning.
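As a point of reference for the state/action/reward formulation the abstract describes, the following is a minimal sketch of the tabular Q-learning update that such an approach builds on. The state space, action space, and latency-style reward below are illustrative assumptions, not the paper's actual network model or its cooperative extension.

```python
import random

def q_learning(n_states, n_actions, reward_fn, transition_fn,
               episodes=500, steps=20, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning with an epsilon-greedy policy (generic sketch)."""
    # Q-table: one row per state, one column per action.
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = random.randrange(n_states)
        for _ in range(steps):
            # Epsilon-greedy action selection: explore with prob. epsilon.
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s_next = transition_fn(s, a)
            r = reward_fn(s, a)
            # Q-learning update:
            # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

# Toy example (hypothetical): 4 states, 2 actions; action 1 always yields
# reward 1, standing in for a latency-reducing choice.
random.seed(0)
Q = q_learning(4, 2,
               reward_fn=lambda s, a: 1.0 if a == 1 else 0.0,
               transition_fn=lambda s, a: (s + 1) % 4)
best = [max(range(2), key=lambda a: Q[s][a]) for s in range(4)]
print(best)  # greedy action chosen in each state
```

A cooperative variant, as named in the abstract, would additionally let multiple agents share or merge their Q-tables; the single-agent update rule above remains the core computation.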