2021
DOI: 10.1111/2041-210x.13594

Imputation of incomplete large‐scale monitoring count data via penalized estimation

Abstract: 1. In biodiversity monitoring, large datasets are becoming more and more widely available and are increasingly used globally to estimate species trends and conservation status. These large-scale datasets challenge existing statistical analysis methods, many of which are not adapted to their size, incompleteness and heterogeneity. The development of scalable methods to impute missing data in incomplete large-scale monitoring datasets is crucial to balance sampling in time or space and thus better inform conserv…

Cited by 5 publications (4 citation statements)
References 44 publications
“…Early approaches used chain indices or route regression (Ter Braak et al, 1992) or the Underhill index, using an expectation-maximisation algorithm (Underhill & Prŷs-Jones, 1994) designed for waterbirds (Rehfisch et al, 2003). A range of further model-based approaches have been developed that fill data gaps using mean effects of site and year, e.g., to fill annual gaps using TRIM/birdSTATs, commonly used for bird indices (Lehikoinen et al, 2016); or using splines, e.g., to fill seasonal gaps in butterfly data (Schmucki et al, 2016; Dennis et al, 2016); or using ecological covariates (Dakki et al, 2021). A Bayesian framework is especially useful for dealing with missing values in the response since they are naturally imputed with a full probability distribution during model fitting.…”
Section: Imputation (mentioning)
confidence: 99%
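
The idea of filling gaps "using mean effects of site and year" can be illustrated with a minimal sketch. It assumes a plain Poisson model with multiplicative site and year effects, fitted on the observed cells only by alternating updates. This is not TRIM, birdSTATs, or the method of the indexed paper; the function name `impute_site_year` and its parameters are hypothetical, and the sketch ignores overdispersion, serial correlation, and covariates.

```python
import numpy as np

def impute_site_year(counts, n_iter=500, tol=1e-12):
    """Fill NaN cells of a sites x years count matrix (illustrative sketch).

    Assumes E[count] = site_effect * year_effect and solves the Poisson
    likelihood equations on the observed cells by alternating updates.
    Every site and every year must have at least one observed count.
    """
    y = np.asarray(counts, dtype=float)
    observed = ~np.isnan(y)              # True where a count was recorded
    y0 = np.where(observed, y, 0.0)      # missing cells contribute nothing
    site = np.ones(y.shape[0])
    year = np.ones(y.shape[1])
    for _ in range(n_iter):
        # Row totals of observed counts must equal row totals of fitted values.
        site_new = y0.sum(axis=1) / (observed @ year)
        # Same condition for column totals, using the updated site effects.
        year_new = y0.sum(axis=0) / (site_new @ observed)
        change = max(np.abs(site_new - site).max(),
                     np.abs(year_new - year).max())
        site, year = site_new, year_new
        if change < tol:
            break
    fitted = np.outer(site, year)
    # Keep observed counts; use fitted values only where data are missing.
    return np.where(observed, y, fitted), fitted

# Toy example: 3 sites, 3 years, two missing counts.
y = np.array([[2.0, 4.0, np.nan],
              [4.0, 8.0, 12.0],
              [np.nan, 12.0, 18.0]])
filled, fitted = impute_site_year(y)
```

Because the site and year effects are only identified up to a common scale, the sketch reports their product (the fitted counts) rather than the individual effects.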
“…In previous work aiming to evaluate prediction methods for biodiversity data [2], authors of this report have faced the challenge of designing fair evaluation procedures, in order not to bias the conclusions, and thus provide sound and reliable recommendations for practitioners. Indeed, designing robust cross-validation (CV) evaluation metrics proved difficult, with different approaches yielding different results.…”
Section: Objectives of the Report (mentioning)
confidence: 99%
“…LORI: The function lori of the package lori [2] uses an imputation method whose procedure resembles that of glmnet. The main difference is that, in this model, there are interaction parameters.…”
Section: Gradient Boosting (mentioning)
confidence: 99%
“…Yet, designed for monitoring schemes with almost-yearly surveys per site by the same observer, TRIM reaches its limits with substantial turnover in survey sites between years leading to a fraction of missing values exceeding approx. 60 % (van Strien et al 2001, Bogaart et al 2020, Dakki et al 2021). Moreover, TRIM is restricted to categorical covariates, requiring climate or landscape composition covariates to be transformed into categories (Bogaart et al 2020).…”
Section: Introduction (mentioning)
confidence: 99%