2020
DOI: 10.1155/2020/1596385

Deep Q-Network with Predictive State Models in Partially Observable Domains

Abstract: While deep reinforcement learning (DRL) has achieved great success in some large domains, most of the related algorithms assume that the state of the underlying system is fully observable. However, many real-world problems are only partially observable. For systems with continuous observations, most related algorithms, e.g., the deep Q-network (DQN) and the deep recurrent Q-network (DRQN), use history observations to represent the state; however, they are often computationally expensive and ignore the informat…
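The abstract refers to representing the state with a window of recent observations, as DQN and DRQN do in partially observable settings. The sketch below illustrates that idea only; the class name, window size, and flat-vector layout are illustrative assumptions, not details from the paper.

```python
from collections import deque

import numpy as np


class HistoryState:
    """Represent the agent's state as the k most recent observations.

    Illustrative sketch of the history-based state representation the
    abstract attributes to DQN/DRQN; names and defaults are assumptions.
    """

    def __init__(self, window_size: int = 4):
        self.window_size = window_size
        self.buffer = deque(maxlen=window_size)

    def reset(self, first_obs: np.ndarray) -> np.ndarray:
        # Pad the window with the first observation at episode start.
        self.buffer.clear()
        for _ in range(self.window_size):
            self.buffer.append(first_obs)
        return self.state()

    def append(self, obs: np.ndarray) -> np.ndarray:
        # Push the newest observation; the oldest one falls out of the window.
        self.buffer.append(obs)
        return self.state()

    def state(self) -> np.ndarray:
        # Concatenate the window into one flat vector fed to the Q-network.
        return np.concatenate(list(self.buffer), axis=0)
```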

Cited by 2 publications (2 citation statements)
References: 14 publications
“…To put it another way, DQN uses a neural network to estimate the action value. The most recent four frames of data are fed into the first layer of the DQN to determine the current state, which is then transformed into a vector of action values by the fully connected layers [39]. The rewards of the DQN are shown in Figure 3.…”
Section: DQNs
confidence: 99%
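The statement describes a network that maps the four most recent frames to a vector of action values. The sketch below follows that description; the 84x84 input resolution and layer widths follow the standard Atari DQN and are assumptions here, not details taken from the citing paper.

```python
import torch
import torch.nn as nn


class DQN(nn.Module):
    """Map a stack of the four most recent frames to a vector of action values."""

    def __init__(self, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4),  # last four frames in
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512),  # fully connected layers turn features
            nn.ReLU(),                   # into one Q-value per action
            nn.Linear(512, n_actions),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 4, 84, 84) float tensor of stacked observations
        return self.head(self.features(frames))
```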
“…The average reward achieved by DQN is 13.0. DQN minimizes a differentiable loss function L(θ) [39] by adjusting the network weights θ to improve the action value function.…”
Section: DQNs
confidence: 99%
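The statement says the network weights θ are adjusted to minimise a differentiable loss L(θ). A minimal sketch of the standard temporal-difference form of that loss is shown below; the batch layout and the use of a separate target network are assumptions, not details quoted from the statement.

```python
import torch
import torch.nn.functional as F


def dqn_loss(q_net, target_net, batch, gamma: float = 0.99) -> torch.Tensor:
    """Differentiable loss L(theta) minimised by adjusting the weights theta."""
    states, actions, rewards, next_states, dones = batch

    # Q(s, a; theta) for the actions that were actually taken.
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Bootstrapped target r + gamma * max_a' Q(s', a'; theta^-), held fixed.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q

    # Squared TD error averaged over the batch.
    return F.mse_loss(q_values, targets)
```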