2021
DOI: 10.1016/j.cogsys.2020.08.012

Hierarchical Deep Q-Network from imperfect demonstrations in Minecraft

Citations: cited by 16 publications (11 citation statements)
References: 2 publications (2 reference statements)
“…However, the application of emotional model was limited because it adopted symbolic cognitive state and decision-making behavior. In literature [22], the author proposed an improved emotional interaction model, but it still did not take into account the influence of individual personality characteristics on the emotional model, so it was still limited when it was used to solve decision-making problems. Khare and Bajaj [23] put forward a multimodal emotion recognition model based on visual signals and physiological signals, which combines facial expression features and ECG features in series at feature level to form multimodal features.…”
Section: Related Work (mentioning)
confidence: 99%
“…There are lots of relevant studies. For example, HDQfD [31] utilizes the hierarchical structure of expert trajectories, presenting a structured task-dependent replay buffer and an adaptive prioritizing technique to gradually erase poor-quality expert data from the buffer. HDQfD won the first place in 2019.…”
Section: Related Work (mentioning)
confidence: 99%
“…Otherwise, the agent would waste training time by constantly switching between moving forward and backward, effectively jittering in place rather than exploring the environment. Removed actions may be set to "always on", which was a popular transformation in the Minecraft MineRL competition, where always executing "attack" helped the agents to learn gathering resources [10], [27], [28].…”
Section: Action Space Shaping In Video Game Environments (mentioning)
confidence: 99%
“…These actions are often discretized, either by splitting them into a set of bins, or by defining three discrete choices: negative, zero and positive. This is especially common with camera rotation, where agents can only choose to turn the camera left/right and up/down at a fixed rate per step [10], [27], [28], [36]. A downside is that this turning rate is a hyper-parameter, which requires tuning.…”
Section: Action Space Shaping In Video Game Environments (mentioning)
confidence: 99%
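
The two action-space transformations described in the statements above (three discrete choices per camera axis at a fixed turning rate, and a removed action set to "always on") can be sketched roughly as follows. This is an illustrative sketch only, not code from any of the cited papers: the turning rate of 10 degrees per step, the dictionary keys `camera` and `attack`, and the helper name `to_env_action` are all assumptions made for the example.

```python
from itertools import product

# Hypothetical turning rate in degrees per step. As the quoted statement
# notes, this is a hyper-parameter that requires tuning.
TURN_RATE_DEG = 10.0

# Three discrete choices per camera axis: negative, zero, positive.
CHOICES = (-1, 0, 1)

# Every discrete action index maps to one (pitch, yaw) direction pair,
# giving 3 x 3 = 9 camera actions in total.
CAMERA_ACTIONS = list(product(CHOICES, CHOICES))


def to_env_action(index, turn_rate=TURN_RATE_DEG):
    """Map a discrete action index to a dict-style environment action.

    The keys "camera" and "attack" are assumed names for illustration.
    """
    pitch_dir, yaw_dir = CAMERA_ACTIONS[index]
    return {
        # Continuous camera rotation replaced by a fixed-rate turn.
        "camera": (pitch_dir * turn_rate, yaw_dir * turn_rate),
        # "attack" is removed from the agent's choices and always on.
        "attack": 1,
    }
```

With this mapping the agent chooses among nine integers instead of a continuous two-dimensional rotation, at the cost of having to tune the turning rate.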