Hybrid NOMA/OMA-Based Dynamic Power Allocation Scheme Using Deep Reinforcement Learning in 5G Networks
2020 | DOI: 10.3390/app10124236

Abstract: Non-orthogonal multiple access (NOMA) is considered a promising technique for fifth-generation (5G) networks. Nevertheless, applying NOMA in a massive-access scenario is relatively complex. Thus, in this paper, a hybrid NOMA/OMA scheme is considered for uplink wireless transmission systems in which multiple cognitive users (CUs) can simultaneously transmit their data to a cognitive base station (CBS). We adopt a user-pairing algorithm in which the CUs are grouped into multiple pairs, and each group is assigned to a…
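The truncated abstract does not show the pairing criterion, but a common user-pairing heuristic in NOMA matches the strongest remaining user with the weakest, giving each pair a large channel-gain gap for SIC. A minimal sketch of that idea (hypothetical function name and example gains; not necessarily the paper's algorithm):

```python
import numpy as np

def pair_users(channel_gains):
    """Pair users by matching the strongest remaining user with the
    weakest (a common NOMA heuristic; the paper's exact criterion is
    not shown in the truncated abstract)."""
    order = np.argsort(channel_gains)            # weakest -> strongest
    n = len(order)
    return [(order[i], order[n - 1 - i]) for i in range(n // 2)]

gains = np.array([0.3, 2.1, 0.9, 5.4, 1.2, 0.1])  # assumed example gains
print(pair_users(gains))  # each pair would share one subchannel
```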

Cited by 20 publications (13 citation statements) | References: 43 publications

Citation statements (ordered by relevance):
“…RL is built between the base station (BS) and the user based on contract theory in heterogeneous uplink NOMA with imperfect CSI [17]. An actor–critic algorithm is used to control downlink NOMA and maximize the users' sum rate [18]. Yet, the actor–critic network suffers from convergence problems.…”
Section: Related Work
confidence: 99%
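As a rough illustration of the actor–critic idea behind [18] (a Gaussian policy updated by a policy gradient, with a learned value baseline as the critic), here is a toy one-step power-control sketch; the channel gain, cost weight, and learning rates are assumptions, and this is not the cited paper's deep network:

```python
import numpy as np

# Toy actor-critic for one-step transmit-power control (illustrative
# only). The actor is a Gaussian policy over power; the critic is a
# scalar value baseline. All parameter values are assumptions.
rng = np.random.default_rng(1)
mu, std, V = 0.1, 0.2, 0.0            # actor mean, fixed std, critic
lr_actor, lr_critic = 0.01, 0.1
gain, lam = 4.0, 2.0                  # assumed channel gain, power cost

for episode in range(5000):
    p = float(np.clip(rng.normal(mu, std), 0.0, 1.0))  # sampled power
    r = np.log2(1 + gain * p) - lam * p                # rate - cost
    adv = r - V                        # advantage w.r.t. the baseline
    V += lr_critic * adv               # critic (value) update
    mu += lr_actor * adv * (p - mu) / std**2           # policy gradient
    mu = float(np.clip(mu, 0.0, 1.0))

print(f"learned mean power: {mu:.2f} (analytic optimum ~0.47)")
```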
“…Due to the sigmoid activation of the actor network, the output power cannot exceed the maximum power constraint, but the minimum power constraint derived from the QoS requirement is not guaranteed. The literature [17, 18, 19] adopts a stepwise reward: a reward is granted when the constraints are met, and the reward is zero or a constant when they are not. However, simulation verification shows that this setting makes convergence difficult, so a penalty-function formulation is used instead to make the reward function more continuous and easier to converge.…”
Section: Algorithm
confidence: 99%
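A minimal sketch of the two reward designs this statement contrasts (hypothetical function names, thresholds, and penalty weight): the stepwise reward is sparse, while the penalty-shaped reward stays continuous, so learning receives a useful signal even when constraints are violated:

```python
def stepwise_reward(rate, p, p_min, p_max, rate_min):
    """Stepwise design as in [17-19]: reward only when every
    constraint holds, zero otherwise. Sparse feedback like this
    is what makes convergence hard."""
    if p_min <= p <= p_max and rate >= rate_min:
        return rate
    return 0.0

def penalty_reward(rate, p, p_min, rate_min, lam=10.0):
    """Continuous alternative: violations subtract a term proportional
    to their size (max power is already enforced by the sigmoid, so
    only the minimum-power and QoS-rate constraints are penalized)."""
    violation = max(0.0, p_min - p) + max(0.0, rate_min - rate)
    return rate - lam * violation
```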
“…Many theoretical analyses and experiments have shown that NOMA can achieve a higher sum rate than the orthogonal multiple access (OMA) adopted in fourth-generation (4G) wireless networks [7][8][9][10][11][12]. In NOMA, multiple users with different channel gains can be multiplexed on the same subchannel and decoded at the receivers through successive interference cancellation (SIC) [13][14][15][16][17][18]. In this mechanism, the power domain is exploited to serve multiple users simultaneously at different power levels, whereby spectrum efficiency can be significantly improved.…”
Section: Introduction
confidence: 99%
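A worked two-user downlink example of this mechanism (assumed channel gains and power split; not from the paper): the strong user cancels the weak user's signal via SIC, the weak user treats the strong user's signal as noise, and the NOMA sum rate exceeds an OMA baseline that splits the resource orthogonally:

```python
import numpy as np

# Assumed example values: unit noise power, one "strong" and one "weak"
# user, and a fixed power split favoring the weak user, as is typical
# in downlink NOMA.
P, N0 = 1.0, 1.0
g_strong, g_weak = 10.0, 1.0
a_weak, a_strong = 0.8, 0.2

# NOMA with SIC: the strong user decodes and cancels the weak user's
# signal; the weak user decodes directly, treating the rest as noise.
r_strong = np.log2(1 + a_strong * P * g_strong / N0)
r_weak = np.log2(1 + a_weak * P * g_weak / (a_strong * P * g_weak + N0))

# OMA baseline: each user gets an orthogonal half of the resource.
r_strong_oma = 0.5 * np.log2(1 + P * g_strong / N0)
r_weak_oma = 0.5 * np.log2(1 + P * g_weak / N0)

print(f"NOMA sum rate: {r_strong + r_weak:.2f} bit/s/Hz")          # ~2.32
print(f"OMA sum rate:  {r_strong_oma + r_weak_oma:.2f} bit/s/Hz")  # ~2.23
```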
“…By employing actor–critic RL, Zhang et al. [21] proposed a dynamic power allocation scheme. Zhang et al. [22] and Giang et al. [23] used DRL to obtain suboptimal solutions to the power allocation problem in uplink multi-carrier NOMA (UL MC-NOMA) systems. He et al. [24] used a DRL framework to solve the joint power allocation and channel assignment problem in a perfect two-user NOMA system.…”
Section: Introduction
confidence: 99%
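To make the shared structure of these DRL power-allocation schemes concrete, here is a minimal sketch with tabular Q-learning standing in for the deep networks of [21-24]; the fading model, discretization, and reward are all assumptions:

```python
import numpy as np

# Toy stand-in for a DRL power-allocation loop (tabular Q-learning
# instead of a deep network, purely to show the structure).
rng = np.random.default_rng(0)
power_levels = np.linspace(0.1, 1.0, 10)    # discrete transmit powers
channel_states = np.linspace(0.5, 2.0, 8)   # quantized channel gains
Q = np.zeros((len(channel_states), len(power_levels)))
alpha, gamma, eps = 0.1, 0.9, 0.1

def reward(g, p, lam=1.0):
    """Achievable rate minus a power cost (encourages efficiency)."""
    return np.log2(1 + g * p) - lam * p

s = rng.integers(len(channel_states))
for step in range(20000):
    # epsilon-greedy action selection over power levels
    a = rng.integers(len(power_levels)) if rng.random() < eps else int(Q[s].argmax())
    r = reward(channel_states[s], power_levels[a])
    s_next = rng.integers(len(channel_states))  # i.i.d. block fading
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

# Learned policy: more power on stronger channels (water-filling-like).
for i, g in enumerate(channel_states):
    print(f"gain {g:.2f} -> power {power_levels[int(Q[i].argmax())]:.2f}")
```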