Proceedings of the Second Workshop on Data Science With Human in the Loop: Language Advances 2021
DOI: 10.18653/v1/2021.dash-1.2

It is better to Verify: Semi-Supervised Learning with a human in the loop for large-scale NLU models

Abstract: When an NLU model is updated, new utterances must be annotated to be included in training. However, manual annotation is very costly. We evaluate a semi-supervised learning workflow with a human in the loop in a production environment. The previous NLU model predicts the annotation of the new utterances; a human then reviews the predicted annotation. Only when the NLU prediction is assessed as incorrect is the utterance sent for human annotation. Experimental results show that the proposed workflow boosts the …
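The abstract describes a verify-then-annotate loop. Below is a minimal sketch of that workflow, assuming a previous model that proposes annotations and callbacks for human review and fallback manual annotation; all names are hypothetical placeholders, not the authors' production system.

```python
# Sketch of the verify-then-annotate workflow from the abstract (assumed interfaces).
from typing import Callable, List, Tuple

Annotation = str  # e.g. an intent label; a real NLU system would also carry slots


def build_training_data(
    utterances: List[str],
    predict: Callable[[str], Annotation],                 # previous NLU model (assumed)
    reviewer_accepts: Callable[[str, Annotation], bool],  # human verification (assumed)
    annotate_manually: Callable[[str], Annotation],       # full human annotation (assumed)
) -> List[Tuple[str, Annotation]]:
    data = []
    for utt in utterances:
        predicted = predict(utt)
        if reviewer_accepts(utt, predicted):
            # Verified predictions are reused directly, which is cheaper than
            # annotating the utterance from scratch.
            data.append((utt, predicted))
        else:
            # Only predictions judged incorrect fall back to manual annotation.
            data.append((utt, annotate_manually(utt)))
    return data
```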

Cited by 4 publications (5 citation statements) | References 14 publications

“…For instance, Promptsource (Bach et al., 2022) is a framework designed to try out a diverse set of prompts that can be used in in-context learning or instruction tuning (Sanh et al., 2021; Min et al., 2021; Ye et al., 2022; Jang et al., 2023). Other human-in-the-loop annotation toolkits (Weber et al., 2021) provide functionality for annotators to verify the neural model's predictions instead of manually creating them. Compared to these toolkits, CoTEver provides additional features specifically designed for gathering explanation data, such as retrieving evidence documents and supporting different Chain of Thought prompts.…”
Section: Tool-kits For Data Annotation (mentioning)
Confidence: 99%
“…In this paper, we address the question: can we gather explanation data in a more efficient manner? Inspired by human-in-the-loop methods, we ask annotators to verify machine-generated explanations instead of manually writing them (Weber et al., 2021). In other words, annotators get to check whether the underlying language model hallucinates (i.e., generates explanations that are factually incorrect) (Shuster et al., 2021; Lin et al., 2022a).…”
Section: Introduction (mentioning)
Confidence: 99%
“…For instance, Promptsource (Bach et al., 2022) is a framework designed to try out a diverse set of prompts that can be used in in-context learning or instruction tuning (Sanh et al., 2021; Wei et al., 2021; Min et al., 2021; Ye et al., 2022; Jang et al., 2023). Other human-in-the-loop annotation toolkits (Wallace et al., 2019; Weber et al., 2021) provide functionality for annotators to verify the neural model's predictions instead of manually creating them. Compared to these toolkits, CoTEver provides additional features specifically designed for gathering explanation data, such as retrieving evidence documents and supporting different Chain of Thought prompts.…”
Section: Tool-kits For Data Annotation (mentioning)
Confidence: 99%
“…In this paper, we address the question: can we gather explanation data in a more efficient manner? Inspired by human-in-the-loop methods, we ask annotators to verify machine-generated explanations instead of manually writing them (Wallace et al., 2019; Weber et al., 2021). In other words, annotators get to check whether the underlying language model hallucinates (i.e., generates explanations that are factually incorrect) (Shuster et al., 2021; Lin et al., 2022a).…”
Section: Introduction (mentioning)
Confidence: 99%
“…For example, in the medical domain, it is essential that machine learning models forward x-ray images that are difficult to classify (i.e., the model is uncertain about the prediction) to physicians for manual inspection. In the literature, several setups exist in which human experts augment and complement ML models: HITL systems are employed in supervised learning (e.g., Wang et al. 2016, Kamar 2016, Wu, Xiao, Sun, Zhang, Ma & He 2021), semi-supervised learning (e.g., Wrede & Hellander 2019, Weber et al. 2021), and reinforcement learning (e.g., Wu et al. 2022, Liang et al. 2017, Elmalaki 2021). However, these approaches generally require repetitive human effort that grows with the number of unknown instances and the inaccuracy in detecting such instances.…”
Section: Human-in-the-loop Systems (mentioning)
Confidence: 99%
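The statement above describes forwarding uncertain predictions to a human expert. A minimal sketch of that deferral pattern, assuming a classifier that exposes class probabilities; the threshold value and function names are illustrative, not taken from any of the cited systems.

```python
# Sketch of confidence-based deferral to a human expert (assumed interfaces).
from typing import Callable, Sequence


def predict_or_defer(
    probabilities: Sequence[float],   # model's class probabilities for one input
    labels: Sequence[str],
    expert_label: Callable[[], str],  # manual inspection by a human expert (assumed)
    threshold: float = 0.9,           # illustrative confidence cut-off
) -> str:
    # Pick the most probable class.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] >= threshold:
        # Confident predictions are kept automatically.
        return labels[best]
    # Uncertain cases are forwarded to the human expert, as in the x-ray example.
    return expert_label()
```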