In this paper, optimal filter design for generalized frequency-division
multiplexing (GFDM) is considered under two design criteria: rate maximization
and out-of-band (OOB) emission minimization. First, the problem of GFDM filter
optimization for rate maximization is formulated by expressing the transmission
rate of GFDM as a function of GFDM filter coefficients. It is shown that
Dirichlet filters are rate-optimal in additive white Gaussian noise (AWGN)
channels with no carrier frequency offset (CFO) under linear zero-forcing (ZF)
or minimum mean-square error (MMSE) receivers, but in general channels
perturbed by CFO, a properly designed nontrivial GFDM filter can outperform
Dirichlet filters by adapting the subcarrier waveform to the channel-induced
CFO. Next, the problem of GFDM filter design for OOB
emission minimization is formulated by expressing the power spectral density
(PSD) of the GFDM transmit signal as a function of GFDM filter coefficients,
and it is shown that the OOB emission can be reduced significantly by designing
the GFDM filter properly. Finally, joint design of GFDM filter and window for
the two design criteria is considered.

Comment: 13 pages, 7 figures, submitted to IEEE Transactions on Signal
Processing
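As background for the abstract above, a small hedged sketch (not from the paper's code; sizes K and M are illustrative choices of ours): in GFDM a prototype filter with K subcarriers and M subsymbols has N = KM coefficients, and the Dirichlet filter is the one whose DFT is an indicator on M consecutive frequency bins, i.e., a Dirichlet kernel (periodic sinc) in time.

```python
import numpy as np

# Illustrative GFDM dimensions (our choice, not from the paper):
K, M = 4, 8          # K subcarriers, M subsymbols
N = K * M            # prototype filter length

# Dirichlet prototype filter: rectangular spectrum over M frequency bins.
G = np.zeros(N)
G[:M] = 1.0                 # M ones in the frequency domain
g = np.fft.ifft(G)          # time-domain Dirichlet (periodic-sinc) filter
g /= np.linalg.norm(g)      # unit-energy normalization

# The filter's magnitude spectrum is flat on its M occupied bins
# and exactly zero elsewhere, which is what makes it "Dirichlet".
spec = np.abs(np.fft.fft(g))
```

This frequency-domain indicator construction is what the abstract refers to when it says Dirichlet filters are rate-optimal in AWGN channels without CFO.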
Policy entropy regularization is commonly used for better exploration in deep reinforcement learning (RL). However, policy entropy regularization is sample-inefficient in off-policy learning since it does not take into account the distribution of previous samples stored in the replay buffer. To exploit the previous sample distribution from the replay buffer for sample-efficient exploration, we propose sample-aware entropy regularization, which maximizes the entropy of the weighted sum of the policy action distribution and the sample action distribution from the replay buffer. We formulate the problem of sample-aware entropy-regularized policy iteration, prove its convergence, and provide a practical algorithm named diversity actor-critic (DAC), which is a generalization of soft actor-critic (SAC). Numerical results show that DAC outperforms SAC and other state-of-the-art RL algorithms.

Preprint. Under review.
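The core quantity in the abstract above can be illustrated with a minimal sketch for discrete action distributions (the function name and mixing weight alpha are our own for illustration, not the paper's API): sample-aware entropy is the entropy of a weighted mixture of the policy distribution pi and the replay-buffer sample distribution q.

```python
import numpy as np

def sample_aware_entropy(pi, q, alpha):
    """Entropy H(alpha*pi + (1-alpha)*q) of the weighted mixture of the
    policy action distribution pi and the buffer action distribution q."""
    mix = alpha * pi + (1.0 - alpha) * q
    return -np.sum(mix * np.log(mix + 1e-12))  # small eps avoids log(0)

pi = np.array([0.7, 0.2, 0.1])   # toy policy over 3 actions
q  = np.array([0.1, 0.3, 0.6])   # toy empirical buffer action distribution
h  = sample_aware_entropy(pi, q, alpha=0.5)
```

With alpha = 1 the mixture reduces to pi alone, recovering the plain policy entropy used by SAC, which is consistent with DAC generalizing SAC.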
In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome the limitation of the maximum entropy RL framework in model-free sample-based learning. Whereas the maximum entropy RL framework guides learning for policies to reach states with high entropy in the future, the proposed max-min entropy framework aims to learn to visit states with low entropy and maximize the entropy of these low-entropy states to promote exploration. For general Markov decision processes (MDPs), an efficient algorithm is constructed under the proposed max-min entropy framework based on disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields drastic performance improvement over the current state-of-the-art RL algorithms.

Preprint. Under review.
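The contrast drawn in the abstract above can be made concrete with a toy sketch (entirely our own illustration, not the paper's algorithm): maximum entropy RL cares about the average policy entropy over visited states, while the max-min view targets the lowest-entropy visited state.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (nats)."""
    return -np.sum(p * np.log(p + 1e-12))

# Per-state action distributions of a toy policy over 3 states.
policy = {
    "s0": np.array([0.98, 0.01, 0.01]),   # nearly deterministic: low entropy
    "s1": np.array([0.4, 0.3, 0.3]),
    "s2": np.array([1/3, 1/3, 1/3]),      # uniform: maximal entropy
}

entropies = {s: entropy(p) for s, p in policy.items()}
max_ent_view = np.mean(list(entropies.values()))  # maximum-entropy objective
max_min_view = min(entropies.values())            # max-min entropy objective
```

Here the max-min objective is dominated by the near-deterministic state s0, so improving it forces exploration exactly where the policy's entropy is lowest.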