2021
DOI: 10.1007/978-3-030-91608-4_20

New Arabic Medical Dataset for Diseases Classification

Abstract: The Arabic language suffers from a great shortage of datasets suitable for training deep learning models, and the existing ones include general non-specialized classifications. In this work, we introduce a new Arab medical dataset, which includes two thousand medical documents collected from several Arabic medical websites, in addition to the Arab Medical Encyclopedia. The dataset was built for the task of classifying texts and includes 10 classes (Blood, Bone, Cardiovascular, Ear, Endocrine, Eye, Gastrointest…

Citations: cited by 9 publications (6 citation statements)
References: 43 publications (40 reference statements)
“…Unfortunately, many corpora in LoE remain unavailable to the public for various reasons (ethics, data sensitivity, company policy, etc.). Nevertheless, they are often featured in publications that carry detailed and valuable information on the specificities of a particular LoE (Ukrainian [35]); the resource selection (Arabic [36]); the annotation process (Tibetan [37]), or the evaluation of different machine learning methods (French [38]).…”
Section: New Multilingual Resources and Monolingual Datasets In LoE (mentioning)
confidence: 99%
“…In addition, (Abdelhay et al, 2023) tackled the challenges of implementing medical bots in Arabic with the introduction of the MAQA dataset, highlighting the effectiveness of Transformer models. (Hammoud et al, 2020) fine-tuned neural networks for medical entity recognition in Arabic medical texts, while (Hammoud et al, 2021) presented a novel dataset for disease classification, emphasizing the potential of pre-trained models. Finally, (Samy et al, 2012) compared strategies for medical term extraction, revealing the advantages of using Arabic equivalents of Latin prefixes and suffixes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Hammoud et al [12] presented a new Arabic medical dataset for text classification. The dataset included 2,000 articles over 10 classes (blood, bone, cardiovascular, ear, endocrine, eye, gastrointestinal, immune, liver, and nephrological) of disease.…”
Section: Related Work (mentioning)
confidence: 99%