Libo Qin scite author profile

A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding

Intent detection and slot filling are the two main tasks in building a spoken language understanding (SLU) system. The two tasks are closely tied, and the slots often depend heavily on the intent. In this paper, we propose a novel framework for SLU that better incorporates the intent information, which in turn guides slot filling. In our framework, we adopt a joint model with Stack-Propagation, which can directly use the intent information as input for slot filling and thus capture the intent semantic knowledge. In addition, to further alleviate error propagation, we perform token-level intent detection within the Stack-Propagation framework. Experiments on two publicly available datasets show that our model achieves state-of-the-art performance and outperforms previous methods by a large margin. Finally, we use the Bidirectional Encoder Representations from Transformers (BERT) model in our framework, which further boosts our performance on the SLU task.
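
As a rough illustration of why token-level intent detection can soften error propagation, the sketch below aggregates per-token intent predictions into a single utterance-level intent. Majority voting is an assumption made for this example (the abstract does not state the aggregation rule), and the function name and intent labels are hypothetical.

```python
from collections import Counter


def vote_utterance_intent(token_intents):
    """Return the most frequent token-level intent label.

    Majority voting is an assumed aggregation rule for this sketch,
    not something the abstract specifies.
    """
    if not token_intents:
        raise ValueError("need at least one token-level prediction")
    return Counter(token_intents).most_common(1)[0][0]


# Hypothetical per-token predictions for a five-token utterance.
# One token is mislabeled, but the utterance-level result is unaffected.
predictions = ["PlayMusic", "PlayMusic", "AddToPlaylist", "PlayMusic", "PlayMusic"]
print(vote_utterance_intent(predictions))
```

With a single utterance-level prediction, one wrong intent would affect every slot; with per-token predictions, a few wrong tokens can be outvoted.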

AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling

Qin, Xiao, Che et al. 2020

In real-world scenarios, users usually have multiple intents in the same utterance. Unfortunately, most spoken language understanding (SLU) models either focus mainly on the single-intent scenario or simply incorporate an overall intent context vector for all tokens, ignoring fine-grained integration of multiple-intent information for token-level slot prediction. In this paper, we propose an Adaptive Graph-Interactive Framework (AGIF) for joint multiple intent detection and slot filling, in which we introduce an intent-slot graph interaction layer to model the strong correlation between the slot and intents. This interaction layer is applied to each token adaptively, which allows it to automatically extract the relevant intent information and thus integrate fine-grained intent information into token-level slot prediction. Experimental results on three multi-intent datasets show that our framework obtains substantial improvements and achieves state-of-the-art performance. In addition, our framework achieves new state-of-the-art performance on two single-intent datasets.

DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification

Qin, Che et al. 2020

AAAI

In dialog systems, dialog act recognition and sentiment classification are two correlated tasks for capturing speakers' intentions, where dialog act and sentiment indicate the explicit and the implicit intentions, respectively (Kim and Kim 2018). Most existing systems either treat them as separate tasks or jointly model the two tasks only by sharing parameters implicitly, without explicitly modeling their mutual interaction and relation. To address this problem, we propose a Deep Co-Interactive Relation Network (DCR-Net) that explicitly considers the cross-impact and models the interaction between the two tasks by introducing a co-interactive relation layer. In addition, the proposed relation layer can be stacked to gradually capture mutual knowledge through multiple steps of interaction. In particular, we thoroughly study different relation layers and their effects. Experimental results on two public datasets (Mastodon and Dailydialog) show that our model outperforms the state-of-the-art joint model by 4.3% and 3.4% in F1 score on the dialog act recognition task, and by 5.7% and 12.4% on sentiment classification, respectively. Comprehensive analysis empirically verifies the effectiveness of explicitly modeling the relation between the two tasks and of the multi-step interaction mechanism. Finally, we employ Bidirectional Encoder Representations from Transformers (BERT) in our framework, which further boosts our performance on both tasks.

Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog

Qin, Xiao, Che et al. 2020

Recent studies have shown remarkable success in end-to-end task-oriented dialog systems. However, most neural models rely on large training data, which are only available for a certain number of task domains, such as navigation and scheduling. This makes it difficult to scale to a new domain with limited labeled data. Moreover, there has been relatively little research on how to effectively use data from all domains to improve the performance of each domain and also of unseen domains. To this end, we investigate methods that make explicit use of domain knowledge and introduce a shared-private network to learn shared and specific knowledge. In addition, we propose a novel Dynamic Fusion Network (DF-Net), which automatically exploits the relevance between the target domain and each domain. Results show that our model outperforms existing methods on multi-domain dialogue, giving the state of the art in the literature. Moreover, with little training data, we show its transferability by outperforming the prior best model by 13.9% on average.

Knowledge Base (KB):
Address | Distance | POI type | POI | Traffic info
5672 barringer street | 5 miles | certain address | 5672 barringer street | no traffic
200 Alester Ave | 2 miles | gas station | Valero | road block nearby
899 Ames Ct | 5 miles | hospital | Stanford Childrens Health | moderate traffic
481 Amaranta Ave | 1 miles | parking garage | Palo Alto Garage R | moderate traffic

Dialogue:
Driver: Address to the gas station.
Car: Valero is located at 200 Alester Ave.
Driver: OK, please give me directions via a route that avoids all heavy traffic.
Car: Since there is a road block nearby, I found another route for you and I sent it on your screen.

CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP

Qin, Zhang et al. 2020

Multi-lingual contextualized embeddings, such as multilingual-BERT (mBERT), have shown success in a variety of zero-shot cross-lingual tasks. However, these models are limited by having inconsistent contextualized representations of subwords across different languages. Existing work addresses this issue with bilingual projection and fine-tuning techniques. We propose a data augmentation framework that generates multi-lingual code-switching data to fine-tune mBERT, which encourages the model to align representations from the source and multiple target languages at once by mixing their context information. Compared with existing work, our method does not rely on bilingual sentences for training and requires only one training process for multiple target languages. Experimental results on five tasks with 19 languages show that our method leads to significantly improved performance on all tasks compared with mBERT.
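
To make the idea of multi-lingual code-switching augmentation concrete, here is a minimal sketch that swaps source-language tokens for dictionary translations in randomly chosen target languages. It is a simplified illustration, not the authors' exact procedure: the function name, the `replace_prob` parameter, and the toy dictionaries are all assumptions introduced for this example.

```python
import random


def code_switch(tokens, dictionaries, replace_prob=0.5, rng=None):
    """Replace each token, with probability replace_prob, by a translation
    from a randomly chosen target-language dictionary.

    Tokens with no entry in the chosen dictionary are kept unchanged.
    """
    rng = rng or random.Random()
    languages = sorted(dictionaries)
    switched = []
    for token in tokens:
        if languages and rng.random() < replace_prob:
            lang = rng.choice(languages)
            candidates = dictionaries[lang].get(token.lower())
            if candidates:
                switched.append(rng.choice(candidates))
                continue
        switched.append(token)
    return switched


# Toy bilingual dictionaries (hypothetical data for illustration only).
TOY_DICTS = {
    "de": {"good": ["gut"], "morning": ["Morgen"]},
    "es": {"good": ["bueno"], "morning": ["mañana"]},
}

augmented = code_switch(
    ["good", "morning", "everyone"], TOY_DICTS, replace_prob=1.0, rng=random.Random(0)
)
print(augmented)
```

Because the dictionaries cover several target languages at once, a single augmented training set mixes all of them, which matches the abstract's point that only one training process is needed.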

Entity-Consistent End-to-end Task-Oriented Dialogue System with KB Retriever

Qin, Liu, Che et al. 2019

Querying the knowledge base (KB) has long been a challenge in end-to-end task-oriented dialogue systems. Previous sequence-to-sequence (Seq2Seq) dialogue generation work treats the KB query as an attention over the entire KB, without any guarantee that the generated entities are consistent with each other. In this paper, we propose a novel framework that queries the KB in two steps to improve the consistency of generated entities. In the first step, inspired by the observation that a response can usually be supported by a single KB row, we introduce a KB retrieval component that explicitly returns the most relevant KB row given a dialogue history. The retrieval result is then used to filter out irrelevant entities in a Seq2Seq response generation model, improving the consistency among the output entities. In the second step, we further apply an attention mechanism to attend to the most correlated KB column. Two methods are proposed to make training feasible without labeled retrieval data: distant supervision and the Gumbel-Softmax technique. Experiments on two publicly available task-oriented dialog datasets show the effectiveness of our model, which outperforms the baseline systems and produces entity-consistent responses.
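
The first step described above (retrieve one KB row, then restrict generated entities to that row) can be sketched as follows. This is only an illustration under stated assumptions: simple word overlap stands in for the learned retrieval component, and the function names and the toy KB are hypothetical.

```python
def retrieve_row(history, kb_rows):
    """Return the KB row that shares the most words with the dialogue history.

    Word overlap is an assumed stand-in for the learned retriever.
    """
    history_words = set(history.lower().split())

    def overlap(row):
        row_words = set(" ".join(row.values()).lower().split())
        return len(history_words & row_words)

    return max(kb_rows, key=overlap)


def filter_entities(candidates, row):
    """Keep only candidate entities that appear in the retrieved row."""
    allowed = set(row.values())
    return [entity for entity in candidates if entity in allowed]


# Toy KB (hypothetical rows for illustration only).
KB = [
    {"poi": "Valero", "poi_type": "gas station", "address": "200 Alester Ave"},
    {"poi": "Stanford Childrens Health", "poi_type": "hospital", "address": "899 Ames Ct"},
]

row = retrieve_row("address to the gas station", KB)
kept = filter_entities(["Valero", "899 Ames Ct", "200 Alester Ave"], row)
print(row["poi"], kept)
```

Restricting candidates to a single row is what prevents a response from pairing one point of interest with another row's address.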

GL-GIN: Fast and Accurate Non-Autoregressive Model for Joint Multiple Intent Detection and Slot Filling

Qin, Wei, Xie et al. 2021

Multi-intent SLU can handle multiple intents in an utterance and has attracted increasing attention. However, state-of-the-art joint models rely heavily on autoregressive approaches, resulting in two issues: slow inference speed and information leakage. In this paper, we explore a non-autoregressive model for joint multiple intent detection and slot filling that is both faster and more accurate. Specifically, we propose a Global-Locally Graph Interaction Network (GL-GIN), in which a local slot-aware graph interaction layer models slot dependency to alleviate the uncoordinated slots problem, while a global intent-slot graph interaction layer models the interaction between multiple intents and all slots in the utterance. Experimental results on two public datasets show that our framework achieves state-of-the-art performance while being 11.5 times faster.

A Co-Interactive Transformer for Joint Slot Filling and Intent Detection

Qin, Liu, Che et al. 2021

Intent detection and slot filling are the two main tasks in building a spoken language understanding (SLU) system. The two tasks are closely related, and the information from one task can benefit the other. Previous studies either implicitly model the two tasks with a multi-task framework or explicitly consider only the single information flow from intent to slot. None of the prior approaches models the bidirectional connection between the two tasks simultaneously in a unified framework. In this paper, we propose a Co-Interactive Transformer that considers the cross-impact between the two tasks. Instead of adopting the self-attention mechanism of the vanilla Transformer, we propose a co-interactive module that considers the cross-impact by building a bidirectional connection between the two related tasks, in which slot and intent can attend to the corresponding mutual information. The experimental results on two public datasets show that our model achieves state-of-the-art performance.
