Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.305
COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval

Abstract: We present a large, challenging dataset, COUGH, for COVID-19 FAQ retrieval. Similar to a standard FAQ dataset, COUGH consists of three parts: FAQ Bank, Query Bank and Relevance Set. The FAQ Bank contains ∼16K FAQ items scraped from 55 credible websites (e.g., CDC and WHO). For evaluation, we introduce Query Bank and Relevance Set, where the former contains 1,236 human-paraphrased queries while the latter contains ∼32 human-annotated FAQ items for each query. We analyze COUGH by testing different FAQ retrieval m…

Cited by 10 publications (5 citation statements)
References 19 publications
“…Traditional security threats have prompted significant exploration into areas such as membership inference attacks (Shi et al, 2023b), backdoor attacks (Shi et al, 2023a; Xu et al, 2023), and others (Wan et al, 2023; Shi et al, 2024). A multitude of studies have extensively examined the trustworthiness of LLMs, including alignment (Wang et al, 2023b; Liu et al, 2023a), truthfulness (e.g., misinformation (Huang and Sun, 2023; Chen and Shu, 2023b,a) and hallucination (Xu et al, 2024; Tonmoy et al, 2024; Huang et al, 2023a)), accountability (He et al, 2024), and fairness (Wang et al, 2023a; Huang et al, 2023c; Bi et al, 2023).…”
Section: Related Work
confidence: 99%
“…A multitude of studies have extensively examined the trustworthiness of LLMs, including alignment (Wang et al, 2023b; Liu et al, 2023a), truthfulness (e.g., misinformation (Huang and Sun, 2023; Chen and Shu, 2023b,a) and hallucination (Xu et al, 2024; Tonmoy et al, 2024; Huang et al, 2023a)), accountability (He et al, 2024), and fairness (Wang et al, 2023a; Huang et al, 2023c; Bi et al, 2023).…”
Section: Related Work
confidence: 99%
“…Downstream Task Dataset Size. While the downstream task datasets may seem small, recent high-quality manually annotated datasets had similar sizes: the COUGH dataset (1,236 labeled sentences) (Zhang et al, 2021) and the YASO dataset (2,215 labeled sentences) (Orbach et al, 2021). Thus, the current size is comparable to that of its contemporaries.…”
Section: Xlnet5g
confidence: 98%
“…COUGH: This is another open English dataset [30], constructed by scraping data from 55 websites (e.g., CDC and WHO), containing user queries and FAQs about Covid-19.…”
Section: Datasets
confidence: 99%