2012
DOI: 10.1016/j.ecolmodel.2012.03.007

Evaluating effectiveness of down-sampling for stratified designs and unbalanced prevalence in Random Forest models of tree species distributions in Nevada

citations
Cited by 61 publications
(39 citation statements)
references
References 28 publications
“…Climatic variables, their interactions and the Heat Load Index were used as predictors. RFs have multiple applications in ecological studies (Cutler et al, 2007), being widely used particularly for the prediction of species distributions (Iverson, 2008; Attorre et al, 2011; Freeman et al, 2012). This method is generally based on fewer assumptions than classical parametrical methods (e.g.…”
Section: Predictive Vegetation Modelling
Citation type: mentioning
confidence: 99%
“…In addition, according to Kernes et al [88] tree models had a better understanding of the relationship and the boundaries, than logistic regression models for predicting the shrub cover spatial shifting. Also, according to [89], RFs are often used in very large geographical areas and when the number of samples in classes is unbalanced, the RF algorithm can be used with an acceptable level of accuracy for classification in such instances and it can be one of the factors for superiority over the SVM and k-NN. Naidoo et al [90] studied the possibility of modelling savanna tree species in the Kruger national park in South Africa, using integrated hyperspectral, light detection and ranging (LiDAR) data and the RF algorithm and their results showed that the RF model produced 87% accuracy.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Sample balance is also important in training data selection, as unbalanced training data (i.e., dramatically more or less training data for one or multiple classes) may result in rare land cover types being under-represented relative to more abundant classes, which may degrade the overall classification accuracy (Weiss & Provost, 2003; Estabrooks et al, 2004; Mellor et al, 2015). Techniques such as down-sampling of majority classes (Freeman et al, 2012), over-sampling of minority classes (Ling and Li, 1998), and a combination of over-sampling and down-sampling training classes (Chawla et al, 2002) have been explored to alleviate the problem of unbalanced training data. Outliers in the training data may also impact the training process (Radoux et al, 2014) and can be eliminated based on spectral or spatial distances.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
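
The down-sampling of majority classes mentioned in the statement above can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the procedure used by Freeman et al. (2012) or by any of the citing papers: the function name `downsample_majority`, the fixed seed, and the presence/absence label vector are all hypothetical, and every class is simply reduced at random to the size of the rarest class before a classifier such as a Random Forest would be trained.

```python
import random
from collections import defaultdict


def downsample_majority(labels, seed=0):
    """Return sorted indices of a class-balanced subsample.

    Every class is randomly reduced to the size of the rarest class.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group the row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # The rarest class sets the per-class sample size.
    n_min = min(len(idxs) for idxs in by_class.values())

    # Draw n_min indices without replacement from each class.
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


# Hypothetical presence/absence labels: 5 presences (1), 95 absences (0).
labels = [1] * 5 + [0] * 95
balanced = downsample_majority(labels)
counts = {c: sum(1 for i in balanced if labels[i] == c) for c in (0, 1)}
print(counts)  # {0: 5, 1: 5}
```

The returned indices would then select the rows of the predictor table used for training. All minority-class rows are kept, while the discarded majority-class rows are lost to that model, which is the usual trade-off of down-sampling.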