2019 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2019.8794025
Uncertainty-Aware Data Aggregation for Deep Imitation Learning

Abstract: Estimating statistical uncertainties allows autonomous agents to communicate their confidence during task execution and is important for applications in safety-critical domains such as autonomous driving. In this work, we present the uncertainty-aware imitation learning (UAIL) algorithm for improving end-to-end control systems via data aggregation. UAIL applies Monte Carlo Dropout to estimate uncertainty in the control output of end-to-end systems, using states where it is uncertain to selectively acquire new …

Cited by 20 publications (24 citation statements) · References 22 publications
“…It is desired that the actual system output achieve the response of a typical second-order system at a damping ratio ξ = 0.707, so the reward function is defined as follows: where s(k) denotes the k-th tracking error of the actual system, e(k) denotes the k-th datum of the ideal dataset Er, and ρ serves to adjust the convergence rate of the algorithm. From equation (10), it can be seen that the closer the actual position tracking error is to the ideal tracking error, the closer the reward value r approaches 1; otherwise the reward value r is close to 0. With the RL method, the coach is trained and the network weights are updated.…”
Section: Stage Ⅱ Expert Model Evolution Through the Training Of The F...
confidence: 99%
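Equation (10) itself is elided in the excerpt, but the quoted behavior (r approaches 1 as the actual tracking error s(k) nears the ideal error e(k), and approaches 0 as they diverge, with ρ setting the convergence rate) is consistent with an exponential reward shape. A minimal sketch under that assumption — the exponential form and the default ρ are illustrative, not the cited authors' equation:

```python
import math

# Hypothetical reward shape consistent with the quoted description;
# the exponential form is an assumption, since equation (10) is elided.
def reward(s_k, e_k, rho=10.0):
    """r -> 1 as the actual tracking error s_k approaches the ideal
    tracking error e_k; r -> 0 as they diverge. rho sharpens the decay."""
    return math.exp(-rho * (s_k - e_k) ** 2)

print(reward(0.1, 0.1))   # identical errors give maximal reward
print(reward(1.0, 0.1))   # large deviation drives the reward toward 0
```

Larger ρ makes the reward drop off faster around the ideal trajectory, which is one way a single scalar can "adjust the convergence rate" as the quote describes.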
“…Second, traditional training methods for machine-learning-based intelligent controllers are generally used in isolation, so a different method must be reselected for each usage scenario. The effectiveness of imitation learning [9,10] under dataset limitations depends on the size and features of the dataset, and models trained by imitation learning predict only with a certain confidence level and may not make the best decisions. In reinforcement learning [11], the complex relationship between the cost or reward function and the optimal decision must be determined, and the ideal cost function is difficult to implement in practice.…”
Section: Introduction
confidence: 99%
“…• Uncertainty-Aware Data Aggregation for Deep Imitation Learning (UAIL) [39]: UAIL gathers training data by estimating the uncertainty of the control output at sub-optimal states. Monte Carlo Dropout is used for uncertainty estimation: the output distribution is computed using multiple dropout masks at each layer, the statistics of this distribution yield an uncertainty score, and that score is compared to an uncertainty threshold.…”
Section: Uncertainty Detection and Data Aggregation
confidence: 99%
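The quoted procedure — keep dropout active at inference, sample the control output under multiple dropout masks, take the spread of the samples as an uncertainty score, and compare it to a threshold to decide whether to query the expert — can be sketched as follows. The tiny two-layer network, its random weights, and the threshold value are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

# Hypothetical sketch of the Monte Carlo Dropout uncertainty check described
# above. The network, weights, and threshold are illustrative assumptions.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((32, 4)), np.zeros(32)   # hidden layer
W2, b2 = rng.standard_normal((1, 32)), np.zeros(1)    # control output head
P_DROP = 0.2  # dropout rate, deliberately kept active at inference time

def forward_with_dropout(state, rng):
    h = np.maximum(W1 @ state + b1, 0.0)       # ReLU hidden activations
    mask = rng.random(h.shape) >= P_DROP       # fresh dropout mask per pass
    h = h * mask / (1.0 - P_DROP)              # inverted-dropout scaling
    return W2 @ h + b2                         # control output

def mc_dropout_uncertainty(state, n_samples=100, rng=rng):
    """Sample the control output under different dropout masks; the std of
    the samples serves as the uncertainty score."""
    outs = np.array([forward_with_dropout(state, rng) for _ in range(n_samples)])
    return outs.mean(axis=0), outs.std(axis=0)

state = rng.standard_normal(4)
mean_action, score = mc_dropout_uncertainty(state)

THRESHOLD = 0.5  # assumed value; in UAIL this threshold is tuned
if score[0] > THRESHOLD:
    print("uncertain state: query the expert and aggregate its label")
else:
    print("confident: execute the policy's own action")
```

States whose score exceeds the threshold are exactly the "uncertain" states the abstract says UAIL uses to selectively acquire new expert-labeled data.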
“…(Table entry: fuzzy neural network; trajectory tracking under varied dynamics; effective for less complex DNNs [39].) The online learning of a pretrained deep fuzzy neural-network-based controller improves control of nonlinear systems under diverse and varied operating conditions (i.e., different payloads, heights, and speeds).…”
Section: Fast Adaptation
confidence: 99%
“…In their approach, they gathered human demonstrations for grasping the sheet and for failure detection, utilizing pre-trained YOLO features to facilitate the learning of deep neural network policies. Other works on cloth-folding execution can be found in [199]-[203]. Instead of improving synthetic objects to be indistinguishable from real objects, Abolghasemi and Bölöni [204] trained the vision system to accept synthetic objects as real.…”
Section: Robots Learning From Demonstration
confidence: 99%