2020
DOI: 10.14778/3415478.3415562

Data collection and quality challenges for deep learning

Abstract: Software 2.0 is a fundamental shift in software engineering where machine learning becomes the new software, powered by big data and computing infrastructure. As a result, software engineering needs to be re-thought where data becomes a first-class citizen on par with code. One striking observation is that 80-90% of the machine learning process is spent on data preparation. Without good data, even the best machine learning algorithms cannot perform well. As a result, data-centric AI practices are now becoming …

Cited by 74 publications (27 citation statements)
References 55 publications (85 reference statements)
“…There is some guidance 71 on discretion referring to IMDRF policy. 72 Thus, the use of the MLMD application in a "safe" context may support a lower risk classification. Yet many MLMD devices may end up as high-risk devices that will be covered by mandatory regulation.…”
Section: The Institute Of Electrical and Electronics Engineers (Ieee) | mentioning
confidence: 99%
“…71 https://ec.europa.eu/health/system/files/2020-09/md_mdcg_2019_11_guidance_qualification_classification_software_en_0.pdf accessed 27.4.2022. 72 check additional requirements according to "General Features of Medical Device Regulation" section 10. Their questions are reflected by the AI-guideline of the Johner Institute.…”
Section: The Institute Of Electrical and Electronics Engineers (Ieee) | mentioning
confidence: 99%
“…This type of study focuses on the analysis of the population involved in the research study (Xie et al 2018). Surveys conducted with the involvement of the cross-sectional study are always descriptive and trends to provide an advantage to the researchers in the smooth conduction of the research (Whang and Lee, 2020). A major focus is to do data collection depending on the different types of situations that may occur.…”
Section: Time Horizon | mentioning
confidence: 99%
“…Most of the effort in the application of machine learning algorithms is spent on data preparation [11]. The performance of a machine learning algorithm is directly related to the quality of the data sets employed for the learning task.…”
Section: Data Generation and Management | mentioning
confidence: 99%