2020
DOI: 10.3390/rs12061034

Accounting for Training Data Error in Machine Learning Applied to Earth Observations

Abstract: Remote sensing, or Earth Observation (EO), is increasingly used to understand Earth system dynamics and create continuous and categorical maps of biophysical properties and land cover, especially based on recent advances in machine learning (ML). ML models typically require large, spatially explicit training datasets to make accurate predictions. Training data (TD) are typically generated by digitizing polygons on high spatial-resolution imagery, by collecting in situ data, or by using pre-existing datasets. T…

Cited by 57 publications (40 citation statements)
References 203 publications
“…It shows that we achieved an accuracy rate of 83% with all variables. A skill value of 0.78 (Table 3a-c) is above the advised numbers suggested by the software developer (see tutorial [48,62]). The skill measure is the difference between the measured accuracy and the accuracy expected by chance.…”
Section: Transition Model (mentioning)
confidence: 86%
“…We must leverage these important advances, while remaining vigilant of the "black box" nature of some algorithms, so that we get the right answers for the right reasons (Kirchner, 2006). As the power of machine learning algorithms is limited by the availability of appropriate training data as well as explicitly addressing the physical processes, a critical problem is how to develop training data for approaches based on multidisciplinary, multisensor remote sensing (Elmes et al, 2020), particularly those that accurately characterize extreme events. Indeed, observational errors in training data can introduce significant bias in the resulting ML model prediction.…”
Section: Bringing Everything Together Through Data Assimilation and Cloud Computing (mentioning)
confidence: 99%
“…(1) Generating reference data through visual interpolation was the most common method, but it could contain errors and time-consuming (Elmes et al, 2019). (2) Obtaining reference data via the field survey is the most accurate method, but it cannot be conducted in all areas because of the inaccessibility issues.…”
Section: Research Developments and Challenges In Data Collection (mentioning)
confidence: 99%