2020
DOI: 10.1155/2020/1596385

Deep Q-Network with Predictive State Models in Partially Observable Domains

Abstract: While deep reinforcement learning (DRL) has achieved great success in some large domains, most of the related algorithms assume that the state of the underlying system is fully observable. However, many real-world problems are only partially observable. For systems with continuous observations, most related algorithms, e.g., the deep Q-network (DQN) and the deep recurrent Q-network (DRQN), use history observations to represent the state; however, they are often computationally expensive and ignore the informat…
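The abstract refers to representing the state with a window of recent observations, as DQN and DRQN do in partially observable settings. The sketch below illustrates that idea only; the class name, window size, and flat-vector layout are illustrative assumptions, not details from the paper.

```python
from collections import deque

import numpy as np


class HistoryState:
    """Represent the agent's state as the k most recent observations.

    Illustrative sketch of the history-based state representation the
    abstract attributes to DQN/DRQN; names and defaults are assumptions.
    """

    def __init__(self, window_size: int = 4):
        self.window_size = window_size
        self.buffer = deque(maxlen=window_size)

    def reset(self, first_obs: np.ndarray) -> np.ndarray:
        # Pad the window with the first observation at episode start.
        self.buffer.clear()
        for _ in range(self.window_size):
            self.buffer.append(first_obs)
        return self.state()

    def append(self, obs: np.ndarray) -> np.ndarray:
        # Push the newest observation; the oldest one falls out of the window.
        self.buffer.append(obs)
        return self.state()

    def state(self) -> np.ndarray:
        # Concatenate the window into one flat vector fed to the Q-network.
        return np.concatenate(list(self.buffer), axis=0)
```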

Cited by 2 publications (2 citation statements)
References: 14 publications
“…To put it another way, DQN uses a neural network to estimate the action value. The most recent four frames of data are fed into the first layer of the DQN to determine the current state, which is then transformed into a vector of action values by the fully connected layers [39]. The rewards of the DQN are shown in Figure 3.…”
Section: DQNs
confidence: 99%
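The statement describes a network that maps the four most recent frames to a vector of action values. The sketch below follows that description; the 84x84 input resolution and layer widths follow the standard Atari DQN and are assumptions here, not details taken from the citing paper.

```python
import torch
import torch.nn as nn


class DQN(nn.Module):
    """Map a stack of the four most recent frames to a vector of action values."""

    def __init__(self, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4),  # last four frames in
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512),  # fully connected layers turn features
            nn.ReLU(),                   # into one Q-value per action
            nn.Linear(512, n_actions),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 4, 84, 84) float tensor of stacked observations
        return self.head(self.features(frames))
```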
“…The average reward achieved by DQN is 13.0. DQN minimizes a differentiable loss function L(θ) [39] by adjusting the network weights θ to improve the action value function.…”
Section: DQNs
confidence: 99%
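The statement says the network weights θ are adjusted to minimise a differentiable loss L(θ). A minimal sketch of the standard temporal-difference form of that loss is shown below; the batch layout and the use of a separate target network are assumptions, not details quoted from the statement.

```python
import torch
import torch.nn.functional as F


def dqn_loss(q_net, target_net, batch, gamma: float = 0.99) -> torch.Tensor:
    """Differentiable loss L(theta) minimised by adjusting the weights theta."""
    states, actions, rewards, next_states, dones = batch

    # Q(s, a; theta) for the actions that were actually taken.
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Bootstrapped target r + gamma * max_a' Q(s', a'; theta^-), held fixed.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1.0 - dones) * next_q

    # Squared TD error averaged over the batch.
    return F.mse_loss(q_values, targets)
```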