2022
DOI: 10.1109/access.2022.3157289

PersianQuAD: The Native Question Answering Dataset for the Persian Language

Abstract: Developing Question Answering systems (QA) is one of the main goals in Artificial Intelligence. With the advent of Deep Learning (DL) techniques, QA systems have witnessed significant advances. Although DL performs very well on QA, it requires a considerable amount of annotated data for training. Many annotated datasets have been built for the QA task; most of them are exclusively in English. In order to address the need for a high-quality QA dataset in the Persian language, we present PersianQuAD, the native …

citations
Cited by 9 publications
(4 citation statements)
references
References 32 publications
(54 reference statements)
“…Manual creation of questions and marking of answers to relevant paragraphs was usually done by crowdsourcing by trained workers. To create answers to the questions, either own annotation tools were used, such as AddQA [45], PI-AFanno [46], SAJAD [47], or already existing web crowdsourcing platforms such as Amazon Mechanical Turk [28], Toloka AI [40], or Prolific [48].…”
Section: Monolingual Question Answering Datasets (mentioning)
confidence: 99%
“…There are Persian datasets for NLP tasks like question-answering [12], [13], [14], language modeling [19], or sentiment analysis [20]. However, there is no Persian benchmark dataset for the NLU task.…”
Section: Description Of Persian Dataset (mentioning)
confidence: 99%
“…There are some machine reading comprehension datasets for Persian [66, 67]. We build PASD by using the PersianQuAD dataset [67]. PersianQuAD is the first large-scale native machine reading comprehension dataset for question answering for the Persian language.…”
Section: Dataset (mentioning)
confidence: 99%