Abstract:This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices. The embedded inference is fast and accurate while enforcing privacy by design, as no personal user data is ever collected. Focusing on Automatic Speech Recognition and Natural Language Understanding, we detail our approach to training high-performance Machine Learning models that are small enough to run in real-time on small d… Show more
In this work, we focus on a more challenging few-shot intent detection scenario where many intents are fine-grained and semantically similar. We present a simple yet effective fewshot intent detection schema via contrastive pre-training and fine-tuning. Specifically, we first conduct self-supervised contrastive pretraining on collected intent datasets, which implicitly learns to discriminate semantically similar utterances without using any labels. We then perform few-shot intent detection together with supervised contrastive learning, which explicitly pulls utterances from the same intent closer and pushes utterances across different intents farther. Experimental results show that our proposed method achieves state-of-the-art performance on three challenging intent detection datasets under 5shot and 10-shot settings.
In this work, we focus on a more challenging few-shot intent detection scenario where many intents are fine-grained and semantically similar. We present a simple yet effective fewshot intent detection schema via contrastive pre-training and fine-tuning. Specifically, we first conduct self-supervised contrastive pretraining on collected intent datasets, which implicitly learns to discriminate semantically similar utterances without using any labels. We then perform few-shot intent detection together with supervised contrastive learning, which explicitly pulls utterances from the same intent closer and pushes utterances across different intents farther. Experimental results show that our proposed method achieves state-of-the-art performance on three challenging intent detection datasets under 5shot and 10-shot settings.
“…Our platform supports standard benchmark datasets for intent recognition, including CLINC (Larson et al, 2019), BANKING (Casanueva et al, 2020), SNIPS (Coucke et al, 2018), and StackOverflow (Xu et al, 2015). They are all split into training, evaluation and test sets.…”
TEXTOIR is the first integrated and visualized platform for text open intent recognition. It is composed of two main modules: open intent detection and open intent discovery. Each module integrates most of the state-of-the-art algorithms and benchmark intent datasets. It also contains an overall framework connecting the two modules in a pipeline scheme. In addition, this platform has visualized tools for data and model management, training, evaluation and analysis of the performance from different aspects. TEXTOIR provides useful toolkits and convenient visualized interfaces for each sub-module 1 , and designs a framework to implement a complete process to both identify known intents and discover open intents 2 .
“…Dataset We conduct experiments on two public datasets: SNIPS [3] (in English) and Few-Joint [10] (in Chinese). For SNIPS, we use the data split 3 of 5-shot setting without intent classification task.…”
Data sparsity problem is a key challenge of Natural Language Understanding (NLU), especially for a new target domain. By training an NLU model in source domains and applying the model to an arbitrary target domain directly (even without fine-tuning), few-shot NLU becomes crucial to mitigate the data scarcity issue. In this paper, we propose to improve prototypical networks with vector projection distance and abstract triangular Conditional Random Field (CRF) for the few-shot NLU. The vector projection distance exploits projections of contextual word embeddings on label vectors as word-label similarities, which is equivalent to a normalized linear model. The abstract triangular CRF learns domain-agnostic label transitions for joint intent classification and slot filling tasks. Extensive experiments demonstrate that our proposed methods can significantly surpass strong baselines. Specifically, our approach can achieve a new state-of-the-art on two few-shot NLU benchmarks (Few-Joint and SNIPS) in Chinese and English without fine-tuning on target domains.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.