“…We also build on the growing body of work that fine-tunes models with human feedback. This has been applied in many domains including summarization (Böhm et al., 2019; Ziegler et al., 2019; Stiennon et al., 2020), dialogue (Jaques et al., 2019; Yi et al., 2019; Hancock et al., 2019), translation (Kreutzer et al., 2018; Bahdanau et al., 2016), semantic parsing (Lawrence and Riezler, 2018), story generation (Zhou and Xu, 2020), review generation (Cho et al., 2018), evidence extraction (Perez et al., 2019), and agents in simulated environments (Christiano et al., 2017; Ibarz et al., 2018).…”