2022
DOI: 10.48550/arxiv.2204.12026
Preprint

BATS: Best Action Trajectory Stitching

Abstract: The problem of offline reinforcement learning focuses on learning a good policy from a log of environment interactions. Past efforts for developing algorithms in this area have revolved around introducing constraints to online reinforcement learning algorithms to ensure the actions of the learned policy are constrained to the logged data. In this work, we explore an alternative approach by planning on the fixed dataset directly. Specifically, we introduce an algorithm which forms a tabular Markov Decision Process…
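The abstract only sketches the approach, so here is a minimal toy sketch of what "planning on the fixed dataset directly" via a tabular MDP could look like: logged transitions become edges between logged states, nearby states are "stitched" with extra edges, and value iteration plans over the augmented graph. The toy dataset, the distance-threshold stitching rule, and all names below are illustrative assumptions, not the BATS implementation.

```python
# Minimal sketch: build a tabular MDP from logged transitions, add "stitch" edges
# between nearby states, and plan on the result with value iteration.
# Everything here (dataset, stitch_radius, zero-reward stitches) is an assumption
# made for illustration only.
import numpy as np

# Toy logged dataset: (s, a, r, s') tuples from two short trajectories.
logged = [
    (0.0, +1, 0.0, 1.0), (1.0, +1, 0.0, 2.0),   # trajectory A ends without reward
    (2.1, +1, 0.0, 3.0), (3.0, +1, 1.0, 4.0),   # trajectory B reaches a reward
]

# 1. The distinct logged states are the nodes of the tabular MDP.
states = sorted({s for s, _, _, _ in logged} | {sp for _, _, _, sp in logged})
idx = {s: i for i, s in enumerate(states)}

# 2. Logged transitions become edges (from-state, to-state, reward).
edges = [(idx[s], idx[sp], r) for s, _a, r, sp in logged]

# 3. "Stitch": connect states that are close enough that a short planned rollout
#    could plausibly bridge them (a distance threshold stands in for a learned model).
stitch_radius = 0.2
for s in states:
    for sp in states:
        if s != sp and abs(s - sp) <= stitch_radius:
            edges.append((idx[s], idx[sp], 0.0))  # assumed zero reward for the stitch

# 4. Plan on the augmented tabular MDP with value iteration.
gamma, V = 0.95, np.zeros(len(states))
for _ in range(100):
    Q = {i: [] for i in range(len(states))}
    for i, j, r in edges:
        Q[i].append(r + gamma * V[j])
    V = np.array([max(q) if q else 0.0 for q in Q.values()])

# States in trajectory A now inherit value through the stitched edge into trajectory B.
print({s: round(float(v), 2) for s, v in zip(states, V)})
```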

Cited by 1 publication (2 citation statements)
References: 7 publications
“…Taking a different approach, [19] utilizes a model-based data augmentation strategy, stitching together parts of historical demonstrations to create superior trajectories. Similarly, the Best Action Trajectory Stitching (BATS) [9] algorithm forms a tabular Markov Decision Process over logged data, adding new transitions using short planned trajectories. BATS not only aids in identifying advantageous trajectories but also provides theoretical bounds on the value function.…”
Section: Related Work (confidence: 99%)
“…DT utilizes a Transformer architecture to model and reproduce sequences from demonstrations, integrating a goal-conditioned policy to convert Offline RL into a supervised learning task. Despite its competitive performance in Offline RL tasks, the DT falls short in achieving trajectory stitching, a desirable property in Offline RL that refers to creating an optimal trajectory by combining parts of sub-optimal trajectories [19,9,57]. This limitation stems from the DT's inability to generate superior sequences, thus curbing its potential to learn optimal policies from sub-optimal trajectories (Figure 1).…”
Confidence: 99%
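Since trajectory stitching is the property the quoted statement turns on, the following is a toy illustration of the idea: two sub-optimal trajectories that pass through a common state are cut at that state and recombined into a trajectory with a higher return than either original. The states and rewards are invented for this example and do not come from any of the cited papers.

```python
# Toy illustration of trajectory stitching: combine the prefix of one sub-optimal
# trajectory with the suffix of another at a shared state. All values are made up.

# Each trajectory is a list of (state, reward) pairs.
traj_a = [("s0", 0), ("s1", 0), ("s2", 0), ("s3", 1)]    # fine start, weak ending
traj_b = [("s4", -3), ("s2", 0), ("s5", 0), ("s6", 5)]   # costly start, strong ending

def stitch(prefix_traj, suffix_traj, join_state):
    """Keep the prefix up to and including join_state, then the suffix after it."""
    cut_a = next(i for i, (s, _) in enumerate(prefix_traj) if s == join_state)
    cut_b = next(i for i, (s, _) in enumerate(suffix_traj) if s == join_state)
    return prefix_traj[:cut_a + 1] + suffix_traj[cut_b + 1:]

stitched = stitch(traj_a, traj_b, join_state="s2")
returns = {name: sum(r for _, r in t)
           for name, t in [("A", traj_a), ("B", traj_b), ("stitched", stitched)]}
print(stitched)   # [('s0', 0), ('s1', 0), ('s2', 0), ('s5', 0), ('s6', 5)]
print(returns)    # stitched return (5) beats both A (1) and B (2)
```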