The uplink of a cell-free massive multiple-input multiple-output with maximum-ratio combining (MRC) and zero-forcing (ZF) schemes are investigated. A power allocation optimization problem is considered, where two conflicting metrics, namely the sum rate and fairness, are jointly optimized. As there is no closed-form expression for the achievable rate in terms of the large scale-fading (LSF) components, the sum rate fairness trade-off optimization problem cannot be solved by using known convex optimization methods. To alleviate this problem, we propose two new approaches. For the first approach, a use-and-then-forget scheme is utilized to derive a closed-form expression for the achievable rate. Then, the fairness optimization problem is iteratively solved through the proposed sequential convex approximation (SCA) scheme. For the second approach, we exploit LSF coefficients as inputs of a twin delayed deep deterministic policy gradient (TD3), which efficiently solves the non-convex sum rate fairness trade-off optimization problem. Next, the complexity and convergence properties of the proposed schemes are analyzed. Numerical results demonstrate the superiority of the proposed approaches over conventional power control algorithms in terms of the sum rate and minimum user rate for both the ZF and MRC receivers. Moreover, the proposed TD3-based power control achieves better performance than the proposed SCA-based approach as well as the fractional power scheme. Index Terms-Cell-free massive MIMO, deep reinforcement learning, fairness, power control, sequential convex approximation.
I. IntroductionThe vast development of mobile communication networks and the number of supported devices have imposed increasing demands for much higher data-rate mobile communication. Cell-free massive multiple-input multipleoutput (mMIMO) is a key enabling wireless network technology as it greatly increases coverage probability and the data rate [2]. In cell-free mMIMO, a large number of