2022
DOI: 10.48550/arxiv.2206.00152
Preprint

Human-AI Shared Control via Policy Dissection

Abstract: Human-AI shared control allows humans to interact and collaborate with AI to accomplish control tasks in complex environments. Previous reinforcement learning (RL) methods attempt goal-conditioned designs to achieve human-controllable policies at the cost of redesigning the reward function and training paradigm. Inspired by the neuroscience approach to investigating the motor cortex in primates, we develop a simple yet effective frequency-based approach called Policy Dissection to align the intermediate repres…
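The abstract is truncated here, but it describes a frequency-based analysis of a policy's intermediate representation. As a rough illustration only, below is a minimal, hypothetical sketch of what one form of frequency-based unit-to-behavior alignment could look like: record hidden-unit activations over a rollout, compute each unit's dominant frequency with an FFT, and flag units whose dominant frequency matches that of a kinematic signal. None of the names, signatures, or data formats below come from the paper; they are assumptions made for illustration.

```python
# Hypothetical sketch of a frequency-based alignment step, based only on the
# abstract's description of Policy Dissection. All names (rollout arrays,
# kinematic signal, tolerance) are assumptions, not the paper's API.
import numpy as np

def dominant_frequency(signal: np.ndarray, dt: float) -> float:
    """Return the highest-power nonzero frequency of a 1-D signal."""
    signal = signal - signal.mean()            # remove the DC component
    power = np.abs(np.fft.rfft(signal)) ** 2   # power spectrum
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    return freqs[1:][np.argmax(power[1:])]     # skip the 0 Hz bin

def match_units_to_behavior(activations: np.ndarray,
                            kinematic_signal: np.ndarray,
                            dt: float,
                            tol_hz: float = 0.1) -> list[int]:
    """activations: (T, n_units) hidden states recorded over one rollout.
    kinematic_signal: (T,) trace of a behavior, e.g. a joint angle.
    Returns indices of units whose dominant frequency matches the behavior's."""
    target = dominant_frequency(kinematic_signal, dt)
    matched = []
    for unit in range(activations.shape[1]):
        f = dominant_frequency(activations[:, unit], dt)
        if abs(f - target) < tol_hz:
            matched.append(unit)
    return matched
```

Units identified this way could then serve as candidate "motor primitives" whose activations a human might perturb at control time; again, this is an interpretation of the abstract, not the paper's implementation.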

Cited by 1 publication (1 citation statement) · References 59 publications
“…Since it is challenging in general to learn a single policy with RL to perform various tasks [28], many prior works focus on learning a single-goal policy [6,42,49,74] for legged robots, such as just forward walking at a constant speed [16,31,69]. There have been efforts to obtain more versatile policies, such as walking at different velocities using different gaits, while following different commands [17,18,35,54], which requires more extensive tuning due to the lack of a gait prior. Providing the robot with different reference motions for different goals can be helpful, but requires additional parameterization of the reference motions (e.g., a gait library) [3,24,27,37], policy distillation [70], or a motion prior [15,50,67].…”
Section: Related Work
confidence: 99%