2023
DOI: 10.1148/ryai.220047

The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.4 Million Screening and Diagnostic Mammographic Images

Cited by 22 publications
(9 citation statements)
References 24 publications
“…Manually labeling large data cohorts is impractical; hence, NLP tools allow researchers to build data cohorts by automatically extracting outcomes that serve as labels for training deep learning models at scale [50]. This has the potential to accelerate the discovery of predictive biomarkers for prognosis, therapeutic response, or the risk of adverse events.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…FFDM images in raw DICOM format were obtained from five clinical centers in South Korea and database of United States. Five Korean centers include National Cancer Center (NCC), Samsung Seoul Medical Center (SMC), Seoul Asan Medical Center (AMC), Seoul Boramae Medical Center (BMC), and Inje University Ilsan Paik Hospital (IPH), and the Emory Breast Imaging Dataset (EMBED) includes four hospitals mammographic examination data for eight years (Jeong et al, 2023). The images from the Korean Hospitals were acquired during clinical evaluations, thus following a case-control study design; EMBED is a mixture of longitudinal screening data and hospital cases.…”
Section: Data Source
Citation type: mentioning
confidence: 99%
“…This standardization and established methodology for determining and tracking results has facilitated the creation of multiple large data sets which are a prerequisite for the development of high-performing AI algorithms. There are currently multiple large mammography data sets, some of which contain more than 1 million mammograms with associated patient factors and known clinical outcomes [11][12][13][14]. Many of the available data sets come from various sources including different practice locations, practice types, and multiple mammography vendors.…”
Section: Opportunities in Breast Imaging for AI Applications
Citation type: mentioning
confidence: 99%