2019
DOI: 10.1101/774604
Preprint

embarcadero: Species distribution modelling with Bayesian additive regression trees in R

Abstract: 1. Classification and regression tree methods, like random forests (RF) or boosted regression trees (BRT), are one of the most popular methods of mapping species distributions. 2. Bayesian additive regression trees (BARTs) are a relatively new alternative to other popular regression tree approaches. Whereas BRT iteratively fits an ensemble of trees each explaining smaller fractions of the total variance, BART starts by fitting a sum-of-trees model and then uses Bayesian backfitting w…

Cited by 10 publications (17 citation statements)
References 29 publications
“…There exists a large suite of algorithms for modelling the distribution of species, but because there is no single 'best' algorithm some authors have reasonably concluded that niche or distribution modelling studies should begin by testing a suite of algorithms for predictive ability under the particular circumstances of the study and choose an algorithm for a particular challenge based on the results of those tests (Qiao et al, 2015). Accordingly, we assessed the relative performance of various categories of SDM algorithms: BIOCLIM (Busby, 1991;Booth et al, 2014), Generalized Linear Models (GLMs, Guisan et al, 2002), MaxLike (Royle, et al, 2012), Random forests (Breiman, 2001), Boosted Regression Trees (Elith et al, 2008), Support Vector Machines (SVMs; Vapnik, 1998), and Bayesian additive regression trees (BART, Carlson, 2020).…”
Section: Modelling Methods (mentioning)
confidence: 99%
“…In computer science, BARTs are used for everything from medical diagnostics to self-driving car algorithms, however they have yet to find widespread application in ecology and in predicting species distributions. Running SDMs with BARTs has recently been greatly facilitated by the development of an R package, 'embarcadero' (Carlson, 2020), including an automated variable selection procedure being highly effective at identifying informative subsets of predictors. Also the package includes methods for generating and plotting partial dependence curves.…”
Section: Analysis of the Environmental Niche Using BARTs (mentioning)
confidence: 99%
“…SDMs were generated by employing Bayesian additive regression trees (BART), a powerful machine learning approach. Running SDMs with BARTs has recently been greatly facilitated by the development of an R package, 'embarcadero' [13], including an automated variable selection procedure being highly effective at identifying informative subsets of predictors. Also the package includes methods for generating and plotting partial dependence curves and visualization called spatial partial dependence plots, which reclassifies predictor rasters based on their partial dependence plots, and show the relative suitability of different regions for an individual covariate.…”
Section: Methods (mentioning)
confidence: 99%
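The partial dependence curves mentioned in the statement above can be illustrated with a minimal sketch. This is plain Python, not the embarcadero API: `partial_dependence`, `toy_suitability`, and the covariate names `temperature` and `rainfall` are all hypothetical, and the toy function merely stands in for a fitted model. The idea is to fix one covariate at each grid value for every record, then average the model's predictions.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original record is untouched
            modified[feature] = value  # override only the covariate of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_suitability(row):
    """Hypothetical stand-in for a fitted model: peaks at 20 degrees, rises with rain."""
    temp_term = max(0.0, 1.0 - abs(row["temperature"] - 20.0) / 15.0)
    rain_term = min(1.0, row["rainfall"] / 1000.0)
    return temp_term * rain_term


# Hypothetical covariate records for three sites.
rows = [
    {"temperature": 12.0, "rainfall": 400.0},
    {"temperature": 18.0, "rainfall": 900.0},
    {"temperature": 25.0, "rainfall": 1500.0},
]
grid = [5.0, 20.0, 35.0]
curve = partial_dependence(toy_suitability, rows, "temperature", grid)

for value, mean_prediction in zip(grid, curve):
    print(f"temperature = {value:5.1f}  mean suitability = {mean_prediction:.3f}")
```

With a Bayesian model the same averaging could be repeated for each posterior draw, which would give a credible band around the curve rather than a single line.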
“…In particular, posterior width directly measures model uncertainty (rather than approximating it by permuting training data), and a single model can be run (instead of an ensemble trained on smaller subsets of training data), allowing the model to use the full training dataset all at once.…”
Section: Methods (mentioning)
confidence: 99%
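As a rough illustration of "posterior width" as an uncertainty measure, the sketch below computes the width of a central 95% credible interval from a set of posterior draws. It is plain Python rather than embarcadero code: the function name `credible_interval_width` and the two simulated map cells are assumptions made for the example, and the draws are random numbers, not output from a fitted BART model.

```python
import random


def credible_interval_width(draws, lower=0.025, upper=0.975):
    """Width of the central credible interval of a list of posterior draws."""
    ordered = sorted(draws)
    n = len(ordered)

    def quantile(q):
        # Linear interpolation between the two nearest order statistics.
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return quantile(upper) - quantile(lower)


def clip(p):
    """Keep a simulated suitability value inside [0, 1]."""
    return min(1.0, max(0.0, p))


# Hypothetical posterior draws of suitability for two map cells (simulated).
rng = random.Random(42)
well_sampled_cell = [clip(rng.gauss(0.8, 0.02)) for _ in range(1000)]
poorly_sampled_cell = [clip(rng.gauss(0.5, 0.20)) for _ in range(1000)]

narrow = credible_interval_width(well_sampled_cell)
wide = credible_interval_width(poorly_sampled_cell)
print(f"well-sampled cell:   width = {narrow:.3f}")
print(f"poorly sampled cell: width = {wide:.3f}")
```

A wider interval marks a cell where the posterior is less certain about suitability, and this comes from a single fitted model rather than from refitting on permuted or subsampled training data.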
“…This often produces a much more reduced model without going through a stepwise variable selection process, which can be slow and very subject to stochasticity.…”
Section: Methods (mentioning)
confidence: 99%