“…Language-based interaction has been studied in the context of visual question answering (de Vries et al., 2017; Chattopadhyay et al., 2017; Lee et al., 2019; Shukla et al., 2019), SQL generation (Gur et al., 2018; Yao et al., 2019), information retrieval (Chung et al., 2018; Aliannejadi et al., 2019) and multi-turn text-based question answering (Rao and Daumé III, 2018; Reddy et al., 2019; Choi et al., 2018). Most methods require learning from recorded dialogues (Hu et al., 2018; Rao and Daumé III, 2018) or conducting Wizard-of-Oz dialog annotations (Kelley, 1984; Wen et al., 2017). Instead, we limit the interaction to multiple-choice and binary questions.…”