It is better to Verify: Semi-Supervised Learning with a human in the loop for large-scale NLU models

Weber, Verena; Piovano, Enrico; Bradford, Melanie

doi:10.18653/v1/2021.dash-1.2

Cited by 4 publications

(5 citation statements)

References 14 publications


“…For instance, Promptsource (Bach et al, 2022), is a framework designed to try out diverse set of prompts that can be used in in-context learning, or instruction tuning (Sanh et al, 2021; Min et al, 2021; Ye et al, 2022; Jang et al, 2023). Other human-in-the-loop annotation toolkits (Weber et al, 2021) provides functionality for annotators to verify the neural model's prediction instead of manually creating them. Compared to these toolkits, CoTEver provides additional features specifically designed for gathering explanation data such as retrieving evidence documents and supporting different Chain of Thought prompts.…”

Section: Tool-kits For Data Annotation (mentioning)

confidence: 99%

“…In this paper, we address the question: can we gather explanation data in a more efficient manner? Inspired by human-in-the-loop methods, we ask annotators to verify a machine generated explanation instead of manually writing them (Weber et al, 2021). In other words, annotators get to check whether the underlying language model hallucinate (i.e., generate explanations that are factually incorrect) (Shuster et al, 2021; Lin et al, 2022a).…”

Section: Introduction (mentioning)

confidence: 99%


Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

2023


Open-retrieval question answering systems are generally trained and tested on large datasets in well-established domains. However, low-resource settings such as new and emerging domains would especially benefit from reliable question answering systems. Furthermore, multilingual and cross-lingual resources in emergent domains are scarce, leading to few or no such systems. In this paper, we demonstrate a cross-lingual open-retrieval question answering system for the emergent domain of COVID-19. Our system adopts a corpus of scientific articles to ensure that retrieved documents are reliable. To address the scarcity of cross-lingual training data in emergent domains, we present a method utilizing automatic translation, alignment, and filtering to produce English-to-all datasets. We show that a deep semantic retriever greatly benefits from training on our English-to-all data and significantly outperforms a BM25 baseline in the cross-lingual setting. We illustrate the capabilities of our system with examples and release all code necessary to train and deploy such a system.


“…For instance, Promptsource (Bach et al, 2022), is a framework designed to try out diverse set of prompts that can be used in in-context learning, or instruction tuning (Sanh et al, 2021; Wei et al, 2021; Min et al, 2021; Ye et al, 2022; Jang et al, 2023). Other human-in-the-loop annotation toolkits (Wallace et al, 2019; Weber et al, 2021) provides functionality for annotators to verify the neural model's prediction instead of manually creating them. Compared to these toolkits, CoTEver provides additional features specifically designed for gathering explanation data such as retrieving evidence documents and supporting different Chain of Thought prompts.…”

Section: Tool-kits For Data Annotation (mentioning)

confidence: 99%

“…In this paper, we address the question: can we gather explanation data in a more efficient manner? Inspired by human-in-the-loop methods, we ask annotators to verify a machine generated explanation instead of manually writing them (Wallace et al, 2019; Weber et al, 2021). In other words, annotators get to check whether the underlying language model hallucinate (i.e., generate explanations that are factually incorrect) (Shuster et al, 2021; Lin et al, 2022a).…”

Section: Introduction (mentioning)

confidence: 99%

CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification

Kim, Joo, Jang et al. 2023

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations


Chain-of-thought (CoT) prompting enables large language models (LLMs) to solve complex reasoning tasks by generating an explanation before the final prediction. Despite its promising ability, a critical downside of CoT prompting is that the performance is greatly affected by the factuality of the generated explanation. To improve the correctness of the explanations, fine-tuning language models with explanation data is needed. However, there exist only a few datasets that can be used for such approaches, and no data collection tool for building them. Thus, we introduce CoTEVer, a tool-kit for annotating the factual correctness of generated explanations and collecting revision data of wrong explanations. Furthermore, we suggest several use cases where the data collected with CoTEVer can be utilized for enhancing the faithfulness of explanations. Our toolkit is publicly available at https://github.com/SeungoneKim/CoTEVer.


“…For example, in the medical domain, it is essential that machine learning models forward x-ray images that are difficult to classify (i.e., the model is uncertain about the prediction) to physicians for manual inspection. In the literature, several setups exist in which human experts augment and complement ML models: HITL systems are employed in supervised learning (e.g., Wang et al 2016, Kamar 2016, Wu, Xiao, Sun, Zhang, Ma & He 2021), semi-supervised learning (e.g., Wrede & Hellander 2019, Weber et al 2021), and reinforcement learning (e.g., Wu et al 2022, Liang et al 2017, Elmalaki 2021). However, these approaches generally require repetitive human effort that is growing with the number of unknown instances and the inaccuracy in detecting such instances.…”

Section: Human-in-the-loop Systems (mentioning)

confidence: 99%

Designing a Human-in-the-Loop System for Object Detection in Floor Plans

Jakubik, Hemmer, Vössing et al. 2022

AAAI


In recent years, companies in the Architecture, Engineering, and Construction (AEC) industry have started exploring how artificial intelligence (AI) can reduce time-consuming and repetitive tasks. One use case that can benefit from the adoption of AI is the determination of quantities in floor plans. This information is required for several planning and construction steps. Currently, the task requires companies to invest a significant amount of manual effort. Either digital floor plans are not available for existing buildings, or the formats cannot be processed due to lack of standardization. In this paper, we therefore propose a human-in-the-loop approach for the detection and classification of symbols in floor plans. The developed system calculates a measure of uncertainty for each detected symbol which is used to acquire the knowledge of human experts for those symbols that are difficult to classify. We evaluate our approach with a real-world dataset provided by an industry partner and find that the selective acquisition of human expert knowledge enhances the model’s performance by up to 10.5%—resulting in an overall prediction accuracy of 92.1% on average. We further design a pipeline for the generation of synthetic training data that allows the systems to be adapted to new construction projects with minimal manual effort. Overall, our work supports professionals in the AEC industry on their journey to the data-driven generation of business value.
