A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions

Udagawa, Takuma; Yamazaki, Takato; Aizawa, Akiko

doi:10.18653/v1/2020.findings-emnlp.67

Cited by 7 publications

(5 citation statements)

References 55 publications


“…Spatial reasoning is a cognitive process based on the construction of mental representations for spatial objects, relations, and transformations (Clements and Battista, 1992), which is necessary for many natural language understanding (NLU) tasks such as natural language navigation (Roman Roman et al, 2020; Kim et al, 2020), human-machine interaction (Landsiedel et al, 2017; Roman Roman et al, 2020), dialogue systems (Udagawa et al, 2020), and clinical analysis (Datta and Roberts, 2020).…”

Section: Introduction (mentioning)

confidence: 99%

SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning

Mirzaee, Faghihi, Ning, et al. 2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua


This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text which contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art language models (LM). We propose a distant supervision method to improve on this task. Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. Experiments show that further pretraining LMs on these automatically generated data significantly improves LMs' capability on spatial understanding, which in turn helps to better solve two external datasets, bAbI, and boolQ. We hope that this work can foster investigations into more sophisticated models for spatial reasoning over text.


“…Referring expressions (e.g., the red one) often can only be resolved in a visual context, and deictic expressions, like English here, there, this and that, are frequently used in language to individuate referents in their immediate context, relying on mutual knowledge of what the speaker and listener can see (Clark and Marshall, 1981). Reference interpretation can also be affected by the location of the speaker and hearer in the world (Birner, 2012), and can involve physical analogues of implicature (e.g., the black one might be a good description for a dark grey object if all other visible objects are lighter) (Golland et al, 2010; Udagawa et al, 2020).…”

Section: A2 Multimodal Context (mentioning)

confidence: 99%

Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches

Fried, Tomlin, et al. 2023

Findings of the Association for Computational Linguistics: EMNLP 2023


People rely heavily on context to enrich meaning beyond what is literally said, enabling concise but effective communication. To interact successfully and naturally with people, LLMs and other user-facing NLP systems will require similar skills in pragmatics: relying on various types of context (from shared linguistic goals and conventions to the visual and embodied world) to use language effectively. We survey existing grounded settings and pragmatic modeling approaches and analyze how the task goals, environmental contexts, and communicative affordances in each work enrich linguistic meaning. We present recommendations for future grounded task design to naturally elicit pragmatic phenomena, and suggest directions that focus on a broader range of communicative contexts and affordances.


“…Existing works on spatial semantics have focused on natural language navigation (Chen et al, 2019; Kim et al, 2020), human-machine interaction (Landsiedel et al, 2017; Roman Roman et al, 2020), dialogue systems (Udagawa et al, 2020), and clinical analysis (Kordjamshidi et al, 2015; Datta and Roberts, 2020). Works on geocoding (Gritta et al, 2018; Kulkarni et al, 2020) map spatial mentions to coordinates, which can be applied to our work for finer geolocation mapping.…”

Section: Related Work (mentioning)

confidence: 99%

A Meta-framework for Spatiotemporal Quantity Extraction from Text

Niu, Zhou, Wu, et al. 2022

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)


News events are often associated with quantities (e.g., the number of COVID-19 patients or the number of arrests in a protest), and it is often important to extract their type, time, and location from unstructured text in order to analyze these quantity events. This paper thus formulates the NLP problem of spatiotemporal quantity extraction, and proposes the first meta-framework for solving it. This meta-framework contains a formalism that decomposes the problem into several information extraction tasks, a shareable crowdsourcing pipeline, and transformer-based baseline models. We demonstrate the meta-framework in three domains (the COVID-19 pandemic, Black Lives Matter protests, and 2020 California wildfires) to show that the formalism is general and extensible, the crowdsourcing pipeline facilitates fast and high-quality data annotation, and the baseline system can handle spatiotemporal quantity extraction well enough to be practically useful. We release all resources for future research on this topic (https://github.com/steqe).
