Existing multimodal conversation agents have shown impressive abilities to locate absolute positions or retrieve attributes in simple scenarios, but they perform poorly when complex relative positions and information alignment are involved, which creates a bottleneck in response quality. In this paper, we propose a Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph (SPRING), which can reason over multi-hop spatial relations and connect them with visual attributes in crowded situated scenarios. Specifically, we design two types of Multimodal Question Answering (MQA) tasks to pretrain the agent. All QA pairs used during pretraining are generated from novel Incremental Layout Graphs (ILG). Difficulty labels for the QA pairs, annotated automatically by ILG, are used to drive MQA-based Curriculum Learning. Experimental results verify SPRING's effectiveness, showing that it significantly outperforms state-of-the-art approaches on both the SIMMC 1.0 and SIMMC 2.0 datasets. We release our code and data at https://github.com/LYX0501/SPRING.
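As a rough illustration of how difficulty labels can drive curriculum learning, the sketch below orders QA pairs from easy to hard and builds cumulative training stages. It is not the SPRING implementation: the `difficulty` field, the `curriculum_stages` function, and the cumulative staging scheme are all assumptions made for this example.

```python
from typing import Dict, List


def curriculum_stages(qa_pairs: List[Dict], num_stages: int) -> List[List[Dict]]:
    """Return cumulative training stages ordered from easy to hard.

    Each QA pair is assumed (hypothetically) to carry an integer
    "difficulty" label. Stage k contains the easiest k/num_stages
    fraction of the data, so the last stage covers every pair.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Stable sort keeps the original order among equally difficult pairs.
    ordered = sorted(qa_pairs, key=lambda pair: pair["difficulty"])
    stages = []
    for stage in range(1, num_stages + 1):
        # Ceiling division so early stages are never empty for non-empty data.
        cutoff = (len(ordered) * stage + num_stages - 1) // num_stages
        stages.append(ordered[:cutoff])
    return stages


if __name__ == "__main__":
    qa = [
        {"question": "What is left of the red jacket?", "difficulty": 3},
        {"question": "What color is the jacket?", "difficulty": 1},
        {"question": "What is next to the shelf?", "difficulty": 2},
    ]
    for index, stage in enumerate(curriculum_stages(qa, 3), start=1):
        print(index, [pair["difficulty"] for pair in stage])
```

Cumulative stages keep easy examples in later training rounds; a design that trains on each difficulty band only once would be an equally valid alternative.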
Cross-domain slot filling focuses on using labeled data from source domains to train a slot filling model for target domains. It is of great significance for transferring a dialogue system to new domains. Most existing work has focused on building a cross-domain transfer model. From the perspective of the slots themselves, this paper proposes a model-agnostic Slot Transferability Measure (STM) for evaluating the transferability from a source slot to a target slot, that is, the degree to which labeled data of the source slot helps train the slot filling model for the target slot. We also give an STM-based method that lets a model select helpful source slots and their labeled data for a given target slot. Experimental results on multiple existing models and datasets show that our method significantly outperforms state-of-the-art baselines in cross-domain slot filling.
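The selection step can be pictured with a minimal sketch that assumes transferability scores for each source slot have already been computed. The abstract does not define how STM is calculated, so the scores here are plain inputs, and the `select_source_slots` function, its `threshold` and `max_slots` parameters, and the example slot names are all hypothetical.

```python
from typing import Dict, List, Optional


def select_source_slots(
    scores: Dict[str, float],
    threshold: float,
    max_slots: Optional[int] = None,
) -> List[str]:
    """Pick source slots whose transferability score meets a threshold.

    `scores` maps a source slot name to a precomputed transferability
    score toward one target slot (higher means more helpful). The scoring
    function itself is outside this sketch.
    """
    kept = [(name, value) for name, value in scores.items() if value >= threshold]
    # Highest score first; ties are broken alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    names = [name for name, _ in kept]
    return names if max_slots is None else names[:max_slots]


if __name__ == "__main__":
    # Made-up scores toward a hypothetical target slot.
    example_scores = {"city": 0.8, "artist": 0.1, "state": 0.6}
    print(select_source_slots(example_scores, threshold=0.5))
```

Only the labeled data of the returned slots would then be passed to whichever slot filling model is being trained, which is what keeps such a selection step independent of the model.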