“…Synsets Figure 4 shows the counts of examples per synset in the training and development sets. Image Pair Reasoning We use a 200-sentence subset of the sentences analyzed in Table 5 (3) existential and (4) universal quantifiers; (5) coordination; (6) coreference; (7) spatial relations; (8) presupposition; (9) preposition attachment ambiguity VQA1.0 (Antol et al, 2015), VQA-CP (Agrawal et al, 2017), VQA2.0 (Goyal et al, 2017) Visual Question Answering Referring Expression Generation VQA (Abstract) (Zitnick and Parikh, 2013) Visual Question Answering ReferItGame (Kazemzadeh et al, 2014) Referring Expression Resolution SHAPES (Andreas et al, 2016) Visual Question Answering Bisk et al (2016) Instruction Following MSCOCO (Chen et al, 2016) Caption Generation Google RefExp (Mao et al, 2016) Referring Expression Resolution ROOM-TO-ROOM (Anderson et al, 2018) Instruction Following Visual Dialog (Das et al, 2017) Dialogue Visual Question Answering CLEVR (Johnson et al, 2017a) Visual Question Answering CLEVR-Humans (Johnson et al, 2017b) Visual Question Answering TDIUC (Kafle and Kanan, 2017) Visual Question Answering ShapeWorld (Kuhnle and Copestake, 2017) Binary Sentence Classification FigureQA (Kahou et al, 2018) Visual Question Answering TVQA (Lei et al, 2018) Video Question Answering LANI & CHAI (Misra et al, 2018) Instruction Following Talk the Walk (de Vries et al, 2018) Dialogue Instruction Following COG (Yang et al, 2018) Visual Question Answering; Instruction Following VCR (Zellers et al, 2019) Visual Question Answering TallyQA (Acharya et al, 2019) Visual Question Answering What to avoid…”