2020
DOI: 10.48550/arxiv.2012.15029
Preprint

VinDr-CXR: An open dataset of chest X-rays with radiologist's annotations

Abstract: Most of the existing chest X-ray datasets include labels from a list of findings without specifying their locations on the radiographs. This limits the development of machine learning algorithms for the detection and localization of chest abnormalities. In this work, we describe a dataset of more than 100,000 chest X-ray scans that were retrospectively collected from two major hospitals in Vietnam. Out of this raw data, we release 18,000 images that were manually annotated by a total of 17 experienced radiolog…

Cited by 25 publications (40 citation statements)
References 16 publications
“…We illustrate FCL on anterior and posterior chest X-rays (CXRs). We use three public large-scale CXR datasets as the unlabeled pre-training data to simulate the federated environment, namely CheXpert [11], ChestX-ray8 [26], and VinDr-CXR [17] (see Table 1). [Table 1 fragment: columns dataset, size, # of classes, multi-label, multi-view, balanced, resolution; row CheXpert [11]: 371920, 14, 390 × 320] Three datasets are collected and annotated from different sources independently and express a large variety in data modalities (see Fig.…”
Section: Methods (mentioning)
confidence: 99%
“…Computer-aided diagnosis (CAD) systems for identification of lung abnormality in adult CXRs have recently achieved great success thanks to the availability of large labeled datasets [6][7][8][9][10]. Many large-scale CXR datasets of adult patients such as ChestX-ray14 [6], Padchest [7], CheXpert [8], MIMIC-CXR [9] and VinDr-CXR [10] have been established and released in recent years. These datasets boosted new advances in exploring new machine learning-based approaches in the interpretation of CXR in adults [8][11][12][13][14][15][16].…”
Section: Background and Summary (mentioning)
confidence: 99%
“…Earlier, NIH Chest X-ray 14 [30] proposed by Wang et al. contains 112,120 front-view images of 14 disease categories, among which there are 880 images of 8 categories containing box annotations. Lately, Nguyen et al. proposed VinDr-CXR [19] which contains 18,000 images that were manually annotated with 22 classes of rectangles surrounding abnormalities and 6 global labels of suspected diseases. There also exist some datasets that focus on a single disease, such as the Pneumonia detection dataset¹, Tuberculosis detection dataset [18] and Pneumothorax segmentation dataset², etc.…”
Section: Automatic Chest X-ray Analysis (mentioning)
confidence: 99%