Data-Efficient Paraphrase Generation to Bootstrap Intent Classification and Slot Labeling for New Features in Task-Oriented Dialog Systems

Jolly, Shailza; Falke, Tobias; Tırkaz, Çağlar; Sorokin, Daniil

doi:10.18653/v1/2020.coling-industry.2

Cited by 9 publications

(13 citation statements)

References 20 publications

“…Our approach is focused towards generating utterances in the dialog domain that can generate utterances from a sequence of slots conditioned on both intent and language. Jolly et al (2020) showed that an interpretation-to-text model can be used with shuffling-based sampling techniques to generate diverse and novel paraphrases from small amounts of seed data, that improve accuracy when augmenting to the existing training data. Our approach is different as our model can generate the slot annotations along with the utterance, which are necessary for the slot labeling task.…”

Section: Related Work (mentioning)

confidence: 99%

“…However, labeled examples for the new feature are typically limited to a small set of seed examples, as the collection of more annotations would make feature expansion costly and slow. As a possible solution, previous work explored the automatic generation of paraphrases to augment the seed data (Malandrakis et al, 2019; Cho et al, 2019; Jolly et al, 2020).…”

Section: Introduction (mentioning)

confidence: 99%

“…To address this setup, we follow the recent work of Jolly et al (2020), which proposes to use an encoder-decoder model that maps from structured meaning representations to corresponding utterances. Because such an input is language-agnostic, it is particularly well-suited for the multilingual setup.…”

Section: Introduction (mentioning)

confidence: 99%

Multilingual Paraphrase Generation For Bootstrapping New Features in Task-Oriented Dialog Systems

Panda, Tırkaz, Falke et al. 2021

Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI

Self Cite

The lack of labeled training data for new features is a common problem in rapidly changing real-world dialog systems. As a solution, we propose a multilingual paraphrase generation model that can be used to generate novel utterances for a target feature and target language. The generated utterances can be used to augment existing training data to improve intent classification and slot labeling models. We evaluate the quality of generated utterances using intrinsic evaluation metrics and by conducting downstream evaluation experiments with English as the source language and nine different target languages. Our method shows promise across languages, even in a zero-shot setting where no seed data is available.

“…Intent classification and slot labeling are two fundamental tasks in spoken language understanding, dating back to early 90's (Price, 1990). With the rise of task-oriented personal assistants, the two tasks got more attention and progress has been made by applying various deep learning techniques (Abujabal and Gaspers, 2019; Goo et al, 2018; Jolly et al, 2020; Mesnil et al, 2013; Zhang and Wang, 2016). While we focus on resolving annotation conflicts for NLU with linear labeling, i.e., intent and slot labels, our approach can be still used for other more complex tree-based labeling, e.g., labeling dependency parses or ontology trees (Chen and Manning, 2014), with the minor change of replacing the task-specific neural LSTM-based classification model.…”

Section: Related Work (mentioning)

confidence: 99%

“…PlayMusic), and (2) A slot labeling (SL) model, which classifies tokens into slot types, out of a predefined set (e.g. SongName) (Goo et al, 2018; Jolly et al, 2020). An example utterance is shown in Figure 1, with two conflicting annotations.…”

Section: Introduction (mentioning)

confidence: 99%

Identifying and Resolving Annotation Changes for Natural Language Understanding

Ramas, Pessot, Abujabal et al. 2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

Annotation conflict resolution is crucial towards building machine learning models with acceptable performance. Past work on annotation conflict resolution had assumed that data is collected at once, with a fixed set of annotators and fixed annotation guidelines. Moreover, previous work dealt with atomic labeling tasks. In this paper, we address annotation conflict resolution for Natural Language Understanding (NLU), a structured prediction task, in a real-world setting of commercial voice-controlled personal assistants, where (1) regular data collections are needed to support new and existing functionalities, (2) annotation guidelines evolve over time, and (3) the pool of annotators changes across data collections. We devise an approach combining information-theoretic measures and a supervised neural model to resolve conflicts in data annotation. We evaluate our approach both intrinsically and extrinsically on a real-world dataset with 3.5M utterances of a commercial dialog system in German. Our approach leads to dramatic improvements over a majority baseline especially in contentious cases. On the NLU task, our approach achieves 2.75% error reduction over a no-resolution baseline.

A Contrastive learning-based Task Adaptation model for few-shot intent recognition

Zhang, Cai et al. 2022

Information Processing & Management
