This paper considers the design of optimal resource allocation policies in wireless communication systems, generically modeled as a functional optimization problem with stochastic constraints. These problems have the structure of learning problems in which the statistical loss appears as a constraint, motivating the development of learning methodologies for their solution. To handle the stochastic constraints, training is undertaken in the dual domain. It is shown that this can be done with a small loss of optimality when using near-universal learning parameterizations. In particular, since deep neural networks (DNNs) are near-universal, their use is advocated and explored. DNNs are trained here with a model-free primal-dual method that simultaneously learns a DNN parameterization of the resource allocation policy and optimizes the primal and dual variables. Numerical simulations demonstrate the strong performance of the proposed approach on a number of common wireless resource allocation problems.
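To make the primal-dual mechanism concrete, the following is a minimal Python sketch of the idea on a toy power-allocation problem: maximize an expected rate subject to an average power budget. The scalar softplus parameterization and the analytic gradients are illustrative assumptions standing in for the paper's DNN and model-free updates.

    import numpy as np

    rng = np.random.default_rng(0)

    P_max = 1.0                    # average power budget (assumed)
    theta = np.array([0.5, 0.0])   # [w, b] of p_theta(h) = softplus(w h + b)
    lam = 0.0                      # dual variable for E[p] <= P_max
    eta_p, eta_d = 0.05, 0.05      # primal and dual step sizes

    for it in range(2000):
        h = rng.exponential(1.0, size=32)      # batch of channel gains
        z = theta[0] * h + theta[1]
        p = np.log1p(np.exp(z))                # softplus keeps power >= 0
        sig = 1.0 / (1.0 + np.exp(-z))         # d softplus / dz
        # Gradient of the sampled Lagrangian  log(1 + h p) - lam p  w.r.t. theta
        dL_dp = h / (1.0 + h * p) - lam
        grad = np.array([(dL_dp * sig * h).mean(), (dL_dp * sig).mean()])
        theta += eta_p * grad                              # primal ascent
        lam = max(0.0, lam + eta_d * (p.mean() - P_max))   # projected dual step

The dual variable rises whenever the sampled average power exceeds the budget, penalizing the primal update until the constraint is met on average.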
Graphons are infinite-dimensional objects that represent the limit of convergent sequences of discrete graphs. This paper derives a theory of Graphon Signal Processing centered on the notions of the graphon Fourier transform and linear shift invariant graphon filters. These two objects are graphon counterparts of graph Fourier transforms and graph filters. It is shown that in convergent sequences of graphs and associated graph signals: (i) The graph Fourier transform converges to the graphon Fourier transform when considering graphon bandlimited signals. (ii) The spectral and vertex responses of graph filters converge to the spectral and vertex responses of graphon filters with the same coefficients. These theorems imply that for graphs that belong to certain families, in the sense that they are part of sequences that converge to a certain graphon, graph Fourier analysis and graph filter design have well-defined limits. In turn, these facts extend the applicability of graph signal processing to graphs with a large number of nodes, because we can transfer designs from limit graphons to finite graphs, and to dynamic graphs, because we can transfer designs to different graphs drawn from the same graphon.
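For reference, the finite-graph objects whose limits the paper studies admit a compact numerical sketch. The Python snippet below (with an assumed random graph and arbitrary filter taps) computes the graph Fourier transform as a projection onto the eigenvectors of a shift operator and verifies that a polynomial graph filter has matching vertex and spectral responses.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 50
    A = rng.random((n, n)) < 0.2
    S = np.triu(A, 1).astype(float)
    S = S + S.T                       # symmetric adjacency as the shift operator

    eigvals, V = np.linalg.eigh(S)    # spectral decomposition S = V diag(l) V^T

    x = rng.standard_normal(n)        # a graph signal
    x_hat = V.T @ x                   # graph Fourier transform (GFT)

    h = np.array([1.0, 0.5, 0.25])    # filter taps h_k
    # Vertex-domain response: y = sum_k h_k S^k x
    y = sum(hk * np.linalg.matrix_power(S, k) @ x for k, hk in enumerate(h))
    # Spectral response: multiply x_hat by sum_k h_k l^k, then invert the GFT
    y_hat = np.polyval(h[::-1], eigvals) * x_hat
    assert np.allclose(y, V @ y_hat)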
Parallel combinations of adaptive filters have been effectively used to improve the performance of adaptive algorithms and to address well-known trade-offs, such as convergence rate vs. steady-state error. Nevertheless, typical combinations suffer from a convergence stagnation issue because the component filters run independently. Solutions to this issue usually involve conditional transfers of coefficients between filters, which, although effective, are hard to generalize to combinations with more filters or to cases in which no component filter is clearly faster. In this work, a more natural solution is proposed: cyclically feeding back the combined coefficient vector to all component filters. Besides coping with convergence stagnation, this new topology improves tracking and supervisor stability, and it bridges an important conceptual gap between combinations of adaptive filters and variable step size schemes. We analyze the steady-state, tracking, and transient performance of this topology for LMS component filters and supervisors with generic activation functions. Numerical examples illustrate how coefficient feedback can improve the performance of parallel combinations at a small computational overhead.
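A minimal Python sketch of the topology follows, under assumed settings (system identification, a sigmoidal supervisor, feedback every K iterations): two LMS filters with different step sizes are combined by an adaptive convex supervisor, and the combined coefficient vector is periodically fed back to both components.

    import numpy as np

    rng = np.random.default_rng(0)
    M, N, K = 8, 5000, 100
    w_true = rng.standard_normal(M)     # unknown system to identify

    w1, w2 = np.zeros(M), np.zeros(M)   # fast and slow component filters
    mu1, mu2 = 0.05, 0.005
    a, mu_a = 0.0, 50.0                 # supervisor parameter and its step size

    for n in range(N):
        u = rng.standard_normal(M)
        d = u @ w_true + 0.01 * rng.standard_normal()
        lam = 1.0 / (1.0 + np.exp(-a))  # mixing weight lambda(n)
        e1, e2 = d - u @ w1, d - u @ w2
        e = lam * e1 + (1 - lam) * e2   # combined output error
        w1 += mu1 * e1 * u              # independent LMS updates
        w2 += mu2 * e2 * u
        a += mu_a * e * (e2 - e1) * lam * (1 - lam)  # supervisor adaptation
        if (n + 1) % K == 0:            # cyclic feedback of combined coefficients
            w = lam * w1 + (1 - lam) * w2
            w1, w2 = w.copy(), w.copy()

The feedback step is what distinguishes this topology from a standard convex combination: the slow filter restarts from the combined estimate instead of lagging behind, which is where the stagnation relief comes from.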
This work introduces a new data reuse algorithm based on the incremental combination of LMS filters. It is able to outperform the Affine Projection Algorithm (APA) in its standard form, another well-known data reuse adaptive filter. First, the so-called true gradient data reuse LMS, sometimes referred to simply as data reuse LMS, is shown to be a limiting case of the regularized APA. Afterwards, an incremental counterpart of its recursion is derived, inspired by distributed optimization and adaptive network scenarios. Simulations in different scenarios show the efficiency of the proposed data reuse algorithm, which is able to match and even outperform the APA in the mean-square sense at a lower computational complexity.
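The following is a minimal Python sketch (with an assumed system identification setup and arbitrary hyperparameters) of the incremental data reuse idea: at each iteration the coefficient vector is refined by cycling through the L most recent data pairs, each micro-update starting from the previous one, in the spirit of incremental adaptive networks.

    import numpy as np

    rng = np.random.default_rng(0)
    M, N, L, mu = 8, 3000, 4, 0.02
    w_true = rng.standard_normal(M)   # unknown system to identify

    w = np.zeros(M)
    U = np.zeros((L, M))              # buffer of the L most recent regressors
    d = np.zeros(L)                   # and desired samples

    for n in range(N):
        U = np.roll(U, 1, axis=0); d = np.roll(d, 1)
        U[0] = rng.standard_normal(M)
        d[0] = U[0] @ w_true + 0.01 * rng.standard_normal()
        # Incremental pass: each reuse step starts from the previous update
        for l in range(L):
            w += mu * (d[l] - U[l] @ w) * U[l]

Unlike the APA, this recursion needs no matrix inversion: its cost per iteration is L rank-one LMS updates.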
In this paper, we study the learning of safe policies in the setting of reinforcement learning problems. That is, we aim to control a Markov Decision Process (MDP) whose transition probabilities are unknown, but for which we have access to sample trajectories through experience. We define safety as the agent remaining in a desired safe set with high probability during the operation time. We therefore consider a constrained MDP where the constraints are probabilistic. Since there is no straightforward way to optimize the policy with respect to the probabilistic constraint in a reinforcement learning framework, we propose an ergodic relaxation of the problem. The advantages of the proposed relaxation are threefold. (i) The safety guarantees are maintained in the case of episodic tasks, and they are kept up to a given time horizon for continuing tasks. (ii) The constrained optimization problem, despite its non-convexity, has an arbitrarily small duality gap if the parameterization of the policy is rich enough. (iii) The gradients of the Lagrangian associated with the safe-learning problem can be easily computed using standard policy gradient results and stochastic approximation tools. Leveraging these advantages, we establish that primal-dual algorithms are able to find policies that are both safe and optimal. We test the proposed approach on a navigation task in a continuous domain. The numerical results show that our algorithm is capable of dynamically adapting the policy to the environment and the required safety levels.
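A minimal Python sketch of the primal-dual approach follows; the one-dimensional dynamics, linear policy features, safety threshold, and tolerance delta are all illustrative assumptions. REINFORCE ascends the sampled Lagrangian return R - lam * C, where C counts visits outside the safe set, while a projected dual step adapts lam.

    import numpy as np

    rng = np.random.default_rng(0)
    T, episodes, delta = 50, 3000, 0.05   # delta: tolerated unsafe fraction
    theta = np.zeros(2)                   # linear policy: P(right) = sigma(theta . [1, s])
    lam, eta_p, eta_d = 0.0, 0.001, 0.01

    for ep in range(episodes):
        s, grad, R, C = 0.0, np.zeros(2), 0.0, 0.0
        for t in range(T):
            feat = np.array([1.0, s])
            pr = 1.0 / (1.0 + np.exp(-theta @ feat))   # prob. of stepping right
            a = float(rng.random() < pr)
            grad += (a - pr) * feat                    # grad log pi(a|s)
            s += (1.0 if a else -1.0) + 0.1 * rng.standard_normal()
            R += s                                     # reward favors drifting right
            C += float(abs(s) > 3.0)                   # unsafe outside |s| <= 3
        theta += eta_p * (R - lam * C) * grad          # primal ascent on Lagrangian
        lam = max(0.0, lam + eta_d * (C / T - delta))  # projected dual ascent

The reward pushes the agent toward the boundary of the safe set, so the learned trade-off is governed entirely by lam, which grows until the empirical unsafe fraction drops below delta.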