Johannes Heisig scite author profile

et al. 2022

This article describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., Prunus avium L., Quercus cerris L., Quercus ilex L., Quercus robur L., Quercus suber L. and Salix caprea L.) at high spatial resolution (30 m). Tree occurrence data totaling three million points were used to train different algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 305 coarse and high resolution covariates representing spectral reflectance, different biophysical conditions and biotic competition was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. Logloss and computing time were used to select the three best algorithms to tune and train an ensemble model based on stacking with a logistic regressor as a meta-learner. An ensemble model was trained for each species: probability and model uncertainty maps of realized distribution were produced for each species using a time window of 4 years for a total of six distribution maps per species, while for potential distributions only one map per species was produced. Results of spatial cross validation show that the ensemble model consistently outperformed or performed as well as the best individual model in both potential and realized distribution tasks, with potential distribution models achieving higher predictive performance (TSS = 0.898, R2logloss = 0.857) than realized distribution ones on average (TSS = 0.874, R2logloss = 0.839). Ensemble models for Q. suber achieved the best performance in both potential (TSS = 0.968, R2logloss = 0.952) and realized (TSS = 0.959, R2logloss = 0.949) distribution, while P. sylvestris (TSS = 0.731, 0.785, R2logloss = 0.585, 0.670, respectively, for potential and realized distribution) and P. nigra (TSS = 0.658, 0.686, R2logloss = 0.623, 0.664) achieved the worst. Importance of predictor variables differed across species and models: the summer green band and the fall Normalized Difference Vegetation Index (NDVI) were the most frequent and important predictors for realized distribution, and diffuse irradiation and precipitation of the driest quarter (BIO17) for potential distribution. On average, fine-resolution models outperformed coarse resolution models (250 m) for realized distribution (TSS = +6.5%, R2logloss = +7.5%). The framework shows how combining continuous and consistent Earth Observation time series data with state of the art machine learning can be used to derive dynamic distribution maps. The produced predictions can be used to quantify temporal trends of potential forest degradation and species composition change.
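The two scores quoted in this abstract can be illustrated with a minimal sketch. TSS is the standard true skill statistic (sensitivity + specificity - 1). For R2logloss, the abstract gives no formula, so the code assumes a pseudo-R² of the form 1 - logloss(model) / logloss(baseline), with the baseline predicting the observed prevalence for every point. The function names and the toy labels and probabilities are hypothetical and not taken from the paper.

```python
import math


def true_skill_statistic(y_true, y_pred):
    """TSS = sensitivity + specificity - 1 for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


def r2_logloss(y_true, p_pred):
    """ASSUMED definition: 1 - logloss(model) / logloss(prevalence baseline)."""
    prevalence = sum(y_true) / len(y_true)
    baseline = log_loss(y_true, [prevalence] * len(y_true))
    return 1.0 - log_loss(y_true, p_pred) / baseline


# Toy presence/absence labels and predicted occurrence probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
p_pred = [0.9, 0.8, 0.4, 0.1, 0.2, 0.6, 0.05, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in p_pred]  # 0.5 threshold is an assumption

tss = true_skill_statistic(y_true, y_pred)
r2 = r2_logloss(y_true, p_pred)
print(f"TSS = {tss:.3f}, R2logloss = {r2:.3f}")
```

Under this assumed definition, a model that only predicts the prevalence scores 0 and a perfect model approaches 1, which matches how the values in the abstract are reported.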

Forest tree species distribution for Europe 2000-2020: mapping potential and realized distributions using spatiotemporal Machine Learning

Bonannella, Hengl, et al. 2022

Preprint

This paper describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., Prunus avium L., Quercus cerris L., Quercus ilex L., Quercus robur L., Quercus suber L. and Salix caprea L.) at high spatial resolution (30 m). Tree occurrence data totaling 3 million points were used to train different algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 305 coarse and high resolution covariates representing spectral reflectance, different biophysical conditions and biotic competition was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. Logloss and computing time were used to select the three best algorithms to tune and train an ensemble model based on stacking with a logistic regressor as a meta-learner. An ensemble model was trained for each species: probability and model uncertainty maps of realized distribution were produced for each species using a time window of 4 years for a total of 6 distribution maps per species, while for potential distributions only one map per species was produced. Results of spatial cross validation show that the ensemble model consistently outperformed or performed as well as the best individual model in both potential and realized distribution tasks, with potential distribution models achieving higher predictive performance (TSS = 0.898, R2logloss = 0.857) than realized distribution ones on average (TSS = 0.874, R2logloss = 0.839). Ensemble models for Q. suber achieved the best performance in both potential (TSS = 0.968, R2logloss = 0.952) and realized (TSS = 0.959, R2logloss = 0.949) distribution, while P. sylvestris (TSS = 0.731, 0.785, R2logloss = 0.585, 0.670, respectively, for potential and realized distribution) and P. nigra (TSS = 0.658, 0.686, R2logloss = 0.623, 0.664) achieved the worst. Importance of predictor variables differed across species and models: the summer green band and the fall Normalized Difference Vegetation Index (NDVI) were the most frequent and important predictors for realized distribution, and diffuse irradiation and precipitation of the driest quarter (BIO17) for potential distribution. On average, fine-resolution models outperformed coarse resolution models (250 m) for realized distribution (TSS = +6.5%, R2logloss = +7.5%). The framework shows how combining continuous and consistent Earth Observation time series data with state of the art machine learning can be used to derive dynamic distribution maps. The produced predictions can be used to quantify temporal trends of potential forest degradation and species composition change.
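The stacking step named in this abstract (base learners combined by a logistic regressor as meta-learner) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the out-of-fold probabilities of three base learners are toy numbers, the meta-learner is fitted with plain gradient descent, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_meta_learner(base_probs, y, lr=0.5, epochs=2000):
    """Fit logistic regression on base-learner probabilities by gradient descent."""
    n, k = len(base_probs), len(base_probs[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, target in zip(base_probs, y):
            pred = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            err = pred - target  # gradient of the log loss w.r.t. the logit
            grad_b += err
            for j in range(k):
                grad_w[j] += err * row[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_ensemble(weights, bias, row):
    """Ensemble occurrence probability for one point."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


# Toy out-of-fold probabilities from three base learners (columns) for
# eight training points (rows), with presence (1) / absence (0) labels.
oof = [
    [0.9, 0.8, 0.7], [0.8, 0.6, 0.9], [0.7, 0.9, 0.6], [0.6, 0.7, 0.8],
    [0.3, 0.2, 0.4], [0.2, 0.4, 0.1], [0.4, 0.1, 0.3], [0.1, 0.3, 0.2],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic_meta_learner(oof, y)
ensemble = [predict_ensemble(weights, bias, row) for row in oof]
print([round(p, 3) for p in ensemble])
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for points the base learners have already seen, keeps it from simply rewarding the most overfitted base learner.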

Forest tree species distribution for Europe 2000-2020: mapping potential and realized distributions using spatiotemporal Machine Learning

Bonannella, Hengl, et al. 2022

Preprint

This paper describes a data-driven framework based on spatio-temporal ensemble machine learning to produce distribution maps for 16 forest tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., Prunus avium L., Quercus cerris L., Quercus ilex L., Quercus robur L., Quercus suber L. and Salix caprea L.) at high spatial resolution (30 m). Tree occurrence data totaling 3 million points were used to train different Machine Learning (ML) algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 585 coarse and high resolution covariates representing spectral reflectance (Landsat bands, spectral indices; time-series of seasonal composites), different biophysical conditions (e.g. temperature, precipitation, elevation, lithology) and biotic competition (other species distribution maps) was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. Logloss and computing time were used to select the three best algorithms to train an ensemble model based on stacking with a logistic regressor as a meta-learner for each species. High resolution (30 m) probability and model uncertainty maps of realized distribution were produced for each species using a time window of 4 years for a total of 6 distribution maps per species for the studied period, while for potential distributions only one map per species was produced. Results of spatial cross validation show that Olea europaea and Quercus suber achieved the best performance in both potential and realized distribution, while Pinus sylvestris and Salix caprea achieved the worst. Further analysis shows that fine-resolution models consistently outperformed coarse resolution models (250 m) for realized distribution (average decrease in logloss: +53%). Realized distribution models achieved higher predictive performance than potential distribution ones. Importance of predictor variables differed across species and models: the summer green band and the fall NDWI and NDVI were the most important and frequent predictors for realized distribution, and diffuse irradiation and precipitation of the driest quarter for potential distribution. The ensemble model outperformed or performed as well as the best individual model in all potential species distributions, while for ten species it performed worse than the best individual model in modeling realized distributions. The framework shows how combining continuous and consistent EO time series data with state of the art ML can be used to derive dynamic distribution maps. The produced time-series occurrence predictions can be used to quantify temporal trends and detect potential forest degradation.
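The abstract reports spatial cross validation but does not say how the folds were built. One common way to do this is spatial blocking, sketched below under that assumption: points are grouped into square tiles, and all points in a tile go to the same fold so that nearby points never sit on both sides of a train/test split. The tile size, coordinates and function names are hypothetical.

```python
def tile_id(x, y, tile_size):
    """Index of the square tile that contains the point (x, y)."""
    return (int(x // tile_size), int(y // tile_size))


def spatial_folds(points, tile_size, n_folds):
    """Assign each point to a fold so that points in one tile share a fold."""
    tiles = sorted({tile_id(x, y, tile_size) for x, y in points})
    fold_of_tile = {tile: i % n_folds for i, tile in enumerate(tiles)}
    return [fold_of_tile[tile_id(x, y, tile_size)] for x, y in points]


# Toy occurrence coordinates in kilometres and a hypothetical 30 km tile size.
points = [(1.0, 2.0), (5.0, 8.0), (35.0, 2.0), (40.0, 10.0), (3.0, 45.0), (70.0, 70.0)]
n_folds = 3
folds = spatial_folds(points, tile_size=30.0, n_folds=n_folds)

# Each fold serves once as the held-out test set.
splits = []
for fold in range(n_folds):
    test_idx = [i for i, f in enumerate(folds) if f == fold]
    train_idx = [i for i, f in enumerate(folds) if f != fold]
    splits.append((train_idx, test_idx))

print(folds)
```

Compared with a random split, this tends to give more conservative performance estimates, because spatially autocorrelated neighbours cannot leak from training into testing.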

Predicting Wildfire Fuels and Hazard in a Central European Temperate Forest Using Active and Passive Remote Sensing

Olson

Pebesma

2022

Fire

Climate change causes more extreme droughts and heat waves in Central Europe, affecting vegetative fuels and altering the local fire regime. Wildfire is projected to expand into the temperate zone, a region traditionally not concerned by fire. To mitigate this new threat, local forest management will require spatial fire hazard information. We present a holistic and comprehensible workflow for quantifying fuels and wildfire hazard through fire spread simulations. Surface and canopy fuels characteristics were sampled in a small managed temperate forest in Northern Germany. Custom fuel models were created for each dominant species (Pinus sylvestris, Fagus sylvatica, and Quercus rubra). Canopy cover, canopy height, and crown base height were directly derived from airborne LiDAR point clouds. Surface fuel types and crown bulk density (CBD) were predicted using random forest and ridge regression, respectively. Modeling was supported by 119 predictors extracted from LiDAR, Sentinel-1, and Sentinel-2 data. We simulated fire spread from random ignitions, considering eight environmental scenarios to calculate fire behavior and hazard. Fuel type classification scored an overall accuracy of 0.971 (Kappa = 0.967), whereas CBD regression performed notably weaker (RMSE = 0.069; R2 = 0.73). Higher fire hazard was identified for strong winds, low fuel moisture, and on slopes. Fires burned fastest and most frequently on slopes in large homogeneous pine stands. These should be the focus of preventive management actions.
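The overall accuracy and Kappa values reported for the fuel type classification are both derived from a confusion matrix. A minimal sketch of that calculation follows; the three-class matrix holds toy counts and is not the study's data.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Agreement expected by chance from the row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Toy confusion matrix for three hypothetical fuel types.
confusion = [
    [48, 1, 1],
    [2, 45, 3],
    [0, 2, 48],
]
accuracy, kappa = accuracy_and_kappa(confusion)
print(f"overall accuracy = {accuracy:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement that would occur by chance, which is why it is slightly lower than overall accuracy both here and in the abstract.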

Detecting drought effects on tree mortality in forests of Franconia (Germany)

Samimi

2020

Preprint

Central European forests face challenges with climate changing much faster than they can adapt. Extremely hot and dry summers like in 2018 deprive forests of soil moisture, leaving them with low ground water levels. While individuals with deep and well-established root systems survive, young individuals and shallow-rooted species perish. In southern Germany, die-off of single trees or small groups has recently become noticeable. Such effects of harsher conditions rarely occur over large areas, but rather in a patchy, irregular manner. This makes the phenomenon difficult to detect and its extent difficult to estimate. The share of recently deteriorated trees may be larger than expected and represent a considerable portion of forests. Therefore, we see a great need for monitoring. Remote sensing data is suitable to examine inaccessible areas at a large scale. To quantify mortality of individual trees among a majority of vital ones, sensor platforms and respective data have to fulfill certain criteria regarding spatial, temporal and spectral resolution. Dead trees can be distinguished from others due to discoloration and defoliation. This change in appearance affects the spectral response, even in pixels larger than the tree’s extent. This study aims at recommending a suitable spatial scale for space-borne multispectral imagery products to achieve this task. We evaluate commercial and free remote sensing data products and their ability to estimate fractional cover of dead vegetation. Satellite data employed in this study comes from Landsat 8 (30 m), Sentinel-2 (10 m), RapidEye (6.5 m) and PlanetScope (3 m). Classification performance is tested against high-resolution multispectral aerial imagery (17 cm) acquired with a Micasense RedEdge-M camera. High-resolution Micasense images are capable of detecting single dead trees, even after downgrading the resolution from 17 cm to 3 m. For all data products tested, the fraction of dead trees per pixel did not differ significantly among land cover types (dead vegetation, vital vegetation, pavement, open soil). This indicates that individual dead trees may not be detectable in vital forest stands. The finding even seems to be valid for a resolution of 3 m (PlanetScope), which is identical to the downgraded Micasense data. In the near future the detection of this phenomenon might profit from technical developments towards even higher spatial detail of space-borne sensors. Alternatively, high resolution images from aerial campaigns, manned or unmanned, could bridge this gap when flight time and spatial coverage are increased significantly and facilitating policies are in place.
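The idea of a fractional cover of dead vegetation per coarse pixel can be illustrated by aggregating a high-resolution dead/alive mask into larger cells. The sketch below assumes an integer aggregation factor and a toy 4 x 4 mask; the study's actual resampling from 17 cm to 3 m is not an integer ratio and is not reproduced here. The function name is hypothetical.

```python
def fractional_cover(mask, factor):
    """Aggregate a binary high-resolution mask (1 = dead) into coarse pixels.

    Each coarse pixel holds the share of dead fine pixels inside it.
    """
    rows, cols = len(mask), len(mask[0])
    if rows % factor or cols % factor:
        raise ValueError("mask dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [
                mask[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Toy classification of fine pixels: 1 = dead vegetation, 0 = anything else.
fine_mask = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
coarse = fractional_cover(fine_mask, factor=2)
print(coarse)
```

A single dead crown inside an otherwise vital coarse pixel yields only a small fraction, which illustrates why isolated dead trees become hard to separate as pixel size grows.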