2024
DOI: 10.1109/tnnls.2022.3207346

Deep Reinforcement Learning: A Survey

Cited by 115 publications (51 citation statements)
References 78 publications
“…The difference between Q^π(s_t, a_t) and V^π(s_t) is a lower-variance alternative to the action-value function known as the advantage function A^π(s_t, a_t) since it represents how advantageous it is to take action a_t as compared with the average performance we would expect from state s_t. These quantities are used throughout the RL field, which is conventionally subdivided into three classes of methods: dynamic programming, model free, and model based [31], [32]. Dynamic programming has its origins in optimal control [33] and may be used to compute an optimal policy based on a known MDP.…”
Section: B Reinforcement Learning
confidence: 99%
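A minimal numerical sketch (not from the survey itself; all values are invented) of the relationship this excerpt describes, A^π(s, a) = Q^π(s, a) − V^π(s), where V^π(s) is the policy-weighted average of Q^π(s, ·):

```python
# Illustrative sketch: the advantage A^pi(s, a) = Q^pi(s, a) - V^pi(s)
# measures how much better action a is than the policy's average in state s.
import numpy as np

# Hypothetical tabular estimates for a tiny MDP with 3 states and 2 actions.
q_values = np.array([[1.0, 0.5],
                     [0.2, 0.9],
                     [0.0, 0.3]])          # Q^pi(s, a)
policy = np.array([[0.6, 0.4],
                   [0.5, 0.5],
                   [1.0, 0.0]])            # pi(a | s)

# V^pi(s) is the policy-weighted average of Q^pi(s, a).
v_values = (policy * q_values).sum(axis=1)

# Advantage: positive entries mark actions better than the policy's average.
advantage = q_values - v_values[:, None]
print(advantage)
```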
“…Finally, in model-based methods, we attempt to learn a model of the MDP, which can then be used for planning or to learn a policy by sampling from the MDP and training with a model-free approach (e.g., Dyna-based methods [34]). For a more comprehensive review of the RL field, we refer the reader to Arulkumaran et al.'s [31] or Wang et al.'s [32] deep RL survey.…”
Section: B Reinforcement Learning
confidence: 99%
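For illustration only, a minimal tabular Dyna-Q sketch of the model-based idea mentioned in this excerpt: learn a model of the MDP from real transitions, then replay simulated transitions from that model alongside ordinary model-free updates. A gym-style environment with the classic `reset()`/`step()` 4-tuple API and discrete, hashable states is assumed:

```python
# Minimal Dyna-Q sketch (illustrative; [34] covers the Dyna family in general).
import random
from collections import defaultdict

def dyna_q(env, episodes=50, planning_steps=10, alpha=0.1, gamma=0.99, eps=0.1):
    q = defaultdict(float)          # Q[(state, action)]
    model = {}                      # learned model: (state, action) -> (reward, next_state)
    actions = list(range(env.action_space.n))   # assumes a gym-like discrete action space

    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda a_: q[(s, a_)]))
            s2, r, done, _ = env.step(a)

            # model-free (Q-learning) update from the real transition
            target = r + gamma * max(q[(s2, a_)] for a_ in actions)
            q[(s, a)] += alpha * (target - q[(s, a)])

            # learn the model, then plan with simulated transitions
            model[(s, a)] = (r, s2)
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                ptarget = pr + gamma * max(q[(ps2, a_)] for a_ in actions)
                q[(ps, pa)] += alpha * (ptarget - q[(ps, pa)])
            s = s2
    return q
```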
“…A fundamental condition for the acceptance of the construct of NR is the existence of a pre-given world. Reenacting such an agent/world setting corresponds to a large field of DL called deep reinforcement learning (DRL) (Mnih et al., 2015; Silver et al., 2016; Eppe et al., 2022; Wang et al., 2022). DRL trains an ANN, considered as an agent, to select the right actions based on the observations (or states) of an external environment in order to maximize potential reward (Fig.…”
Section: What Is a Representation?
confidence: 99%
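A small sketch of the agent/environment setting this excerpt describes: a network maps environment observations to action scores, and the highest-scoring action is selected. The observation dimension, action count, and use of PyTorch are assumptions for illustration, not details from the cited works:

```python
# Illustrative sketch: a tiny network (the "agent") maps observations to
# action preferences; acting greedily on these preferences is one way to
# select actions aimed at maximizing reward.
import torch
import torch.nn as nn

obs_dim, n_actions = 4, 2            # hypothetical CartPole-like environment

policy = nn.Sequential(               # observation -> action scores
    nn.Linear(obs_dim, 64),
    nn.ReLU(),
    nn.Linear(64, n_actions),
)

def select_action(observation):
    """Pick the action the network currently scores highest (greedy)."""
    with torch.no_grad():
        scores = policy(torch.as_tensor(observation, dtype=torch.float32))
    return int(scores.argmax())
```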
“…It estimates how good it is for the agent to be in a particular state or to take a specific action. The value function can improve the policy by finding the action that leads to the maximum value in each state [89]. There are two main types of value functions: the state-value function and the action-value function.…”
Section: E Deep Reinforcement Learning For Bc
confidence: 99%
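A toy sketch (made-up numbers) of the two value-function types and the greedy improvement step this excerpt refers to: given an action-value table Q(s, a), the state value of the greedy policy equals the maximum action value in each state, and the improved policy simply picks that maximizing action:

```python
# Illustrative only: greedy policy improvement from a hypothetical Q-table.
import numpy as np

# Action-value function Q(s, a) for 3 states and 2 actions (invented values).
Q = np.array([[0.4, 1.2],
              [0.8, 0.3],
              [0.0, 0.5]])

# State-value function of the greedy policy: V(s) = max_a Q(s, a).
V = Q.max(axis=1)

# Policy improvement: in every state, pick the action with the maximum value.
improved_policy = Q.argmax(axis=1)
print(V, improved_policy)
```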