N-mixture models describe count data replicated in time and across sites in terms of abundance N and detectability p. They are popular because they allow inference about N, while controlling for factors that influence p, without the need for marking animals. Using a capture-recapture perspective, we show that the loss of information that results from not marking animals is critical, making reliable statistical modeling of N and p problematic using count data alone. One cannot reliably fit a model in which detection probabilities are distinct among repeat visits, as this model is overspecified; this makes uncontrolled variation in p problematic. By counterexample, we show that even if p is constant after adjusting for covariate effects (the "constant p" assumption), scientifically plausible alternative models, in which N (or its expectation) is non-identifiable or does not even exist as a parameter, lead to data that are practically indistinguishable from data generated under an N-mixture model. This is particularly the case for the sparse data commonly seen in applications. We conclude that, under the constant p assumption, reliable inference is possible only for relative abundance, unless questionable and/or untestable assumptions are made or the data are of better quality than in typical applications. Relative-abundance models for counts can be readily fitted using Poisson regression in standard software such as R and are sufficiently flexible to allow controlling for p through the use of covariates while simultaneously modeling variation in relative abundance. If users require estimates of absolute abundance, they should collect auxiliary data that help with estimation of p.
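The identifiability issue above can be seen in a minimal simulation (illustrative only; the parameter values and setup are assumptions, not the paper's): under a basic N-mixture model with constant p, each count is marginally Poisson with mean lambda * p, so the mean count identifies only that product, i.e., relative abundance.

```python
import numpy as np

# Illustrative sketch, not the paper's analysis: simulate counts under a
# basic N-mixture model with constant detection p and check that the mean
# count recovers only the product lambda * p (relative abundance).
rng = np.random.default_rng(1)

lam, p = 5.0, 0.4            # assumed abundance rate and detection probability
n_sites, n_visits = 2000, 3

N = rng.poisson(lam, size=n_sites)                         # latent abundance per site
y = rng.binomial(N[:, None], p, size=(n_sites, n_visits))  # repeated counts per site

# Binomial thinning of a Poisson is Poisson, so each count is marginally
# Poisson(lam * p): the mean alone cannot separate lambda from p, which is
# why covariate-adjusted Poisson regression targets relative abundance.
mean_count = y.mean()        # should be close to lam * p = 2.0
```

Separating lambda from p must then come entirely from higher-order structure (the within-site dependence across visits), which is exactly the information the paper argues is too fragile in sparse data.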
N-mixture models provide an appealing alternative to mark-recapture models, in that they allow for estimation of detection probability and population size from count data, without requiring that individual animals be identified. There is, however, a cost to using N-mixture models: inference is very sensitive to the model's assumptions. We consider the effects of three violations of assumptions that might reasonably be expected in practice: double counting, unmodeled variation in population size over time, and unmodeled variation in detection probability over time. These three examples show that small violations of assumptions can lead to large biases in estimation. The violations of assumptions we consider are not only small qualitatively, but are also small in the sense that they are unlikely to be detected using goodness-of-fit tests. In cases where reliable estimates of population size are needed, we encourage investigators to allocate resources to acquiring additional data, such as recaptures of marked individuals, for estimation of detection probabilities.
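To see where the sensitivity comes from, it helps to note which moments of the data separate abundance from detection. The following sketch (my own method-of-moments illustration under assumed parameter values, not the paper's estimator) uses the identities E[y] = lambda * p and Cov(y_t, y_t') = p^2 * lambda for repeated visits to the same site: the split between lambda and p rests entirely on the within-site covariance, precisely the quantity that double counting or unmodeled temporal variation perturbs.

```python
import numpy as np

# Illustrative method-of-moments sketch (assumed values, not the paper's):
# with N ~ Poisson(lam) and y_t | N ~ Binomial(N, p) on each of two visits,
#   E[y] = lam * p   and   Cov(y1, y2) = p^2 * lam,
# so lam = E[y]^2 / Cov and p = Cov / E[y].
rng = np.random.default_rng(7)
lam, p = 10.0, 0.5
n_sites = 50_000

N = rng.poisson(lam, n_sites)     # latent site abundances
y1 = rng.binomial(N, p)           # count on visit 1
y2 = rng.binomial(N, p)           # count on visit 2 (same N, independent thinning)

mean = (y1.mean() + y2.mean()) / 2   # estimates lam * p
cov = np.cov(y1, y2)[0, 1]           # estimates p^2 * lam

lam_hat = mean**2 / cov              # should be near 10
p_hat = cov / mean                   # should be near 0.5
```

Because the entire separation of lambda from p hangs on one second-moment quantity, even a small unmodeled perturbation of that covariance propagates directly into the abundance estimate, consistent with the large biases reported above.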
Noninvasive DNA sampling has advantages for identifying animals in applications, such as mark-recapture modeling, that require unique identification of the animals in samples. Although it is possible to generate large amounts of data from noninvasive sources of DNA, a challenge is overcoming genotyping errors that can lead to incorrect identification of individuals. A major source of error is allelic dropout, the failure of DNA amplification at one or more loci, which causes heterozygous individuals to be scored as homozygotes at those loci because only one allele is detected. If errors go undetected and the genotypes are naively used in mark-recapture models, population size can be substantially overestimated. To avoid this, it is common to reject low-quality samples, but doing so may discard large amounts of data. It is preferable to retain these low-quality samples because they still contain usable information in the form of partial genotypes. Rather than trying to minimize error or discard error-prone samples, we model dropout directly in the analysis. We describe a method based on data augmentation that allows us to model data from samples that include uncertain genotypes. Application is illustrated using data from the European badger (Meles meles).
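The overestimation mechanism is easy to demonstrate in a toy simulation (the genotype, loci, and dropout rate below are hypothetical, and this is not the paper's augmentation method): repeated samples from a single individual can yield several distinct observed genotypes once heterozygous loci randomly lose an allele, and naive matching then counts one animal as many.

```python
import random

# Hypothetical sketch (illustrative genotype and dropout rate): simulate
# allelic dropout at heterozygous loci and count how many distinct observed
# genotypes arise from repeated samples of ONE individual.
random.seed(42)

true_genotype = [("A", "B"), ("C", "C"), ("D", "E"), ("F", "G")]  # 4 loci
dropout_rate = 0.2   # per-sample failure probability at a heterozygous locus

def observe(genotype):
    """Score each locus; a heterozygote may lose one allele and be
    recorded as a (false) homozygote."""
    scored = []
    for a, b in genotype:
        if a != b and random.random() < dropout_rate:
            kept = random.choice([a, b])
            scored.append((kept, kept))   # dropout: scored as homozygous
        else:
            scored.append((a, b))
    return tuple(scored)

samples = [observe(true_genotype) for _ in range(20)]
distinct = len(set(samples))
# Naive matching treats each distinct genotype as a new individual,
# so distinct > 1 here directly inflates the apparent population size.
```

Discarding the dropout-affected samples avoids this inflation but throws away the information in their correctly scored loci, which motivates modeling dropout rather than filtering it out.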
A spatial open‐population capture‐recapture model is described that extends both the non‐spatial open‐population model of Schwarz and Arnason and the spatially explicit closed‐population model of Borchers and Efford. The superpopulation of animals available for detection at some time during a study is conceived as a two‐dimensional Poisson point process. Individual probabilities of birth and death follow the conventional open‐population model. Movement between sampling times may be modeled with a dispersal kernel using a recursive Markovian algorithm. Observations arise from distance‐dependent sampling at an array of detectors. As in the closed‐population spatial model, the observed data likelihood relies on integration over the unknown animal locations; maximization of this likelihood yields estimates of the birth, death, movement, and detection parameters. The models were fitted to data from a live‐trapping study of brushtail possums (Trichosurus vulpecula) in New Zealand. Simulations confirmed that spatial modeling can greatly reduce the bias of capture‐recapture survival estimates and that there is a degree of robustness to misspecification of the dispersal kernel. An R package is available that includes various extensions.
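The distance-dependent sampling referred to above is commonly specified with a half-normal detection function, p(d) = g0 * exp(-d^2 / (2 * sigma^2)). The sketch below is a generic illustration of that idea with assumed parameter values and a hypothetical detector grid; it is not the fitted possum model or the R package's implementation.

```python
import numpy as np

# Illustrative sketch of distance-dependent detection in spatially explicit
# capture-recapture; the half-normal form and all values here are assumptions.
g0, sigma = 0.3, 25.0   # detection probability at distance 0; spatial scale (m)

# Hypothetical 5 x 5 grid of detectors spaced 50 m apart
xs, ys = np.meshgrid(np.arange(0, 250, 50), np.arange(0, 250, 50))
detectors = np.column_stack([xs.ravel(), ys.ravel()])

def p_detect(animal_xy):
    """Half-normal per-detector detection probabilities for one activity centre."""
    d = np.linalg.norm(detectors - animal_xy, axis=1)
    return g0 * np.exp(-d**2 / (2 * sigma**2))

def p_any(animal_xy):
    """Probability of capture by at least one detector on a single occasion,
    assuming detectors operate independently."""
    return 1 - np.prod(1 - p_detect(animal_xy))

# An animal centred on the grid is far more detectable than one beyond its
# edge; integrating such probabilities over the unknown activity centres is
# the integration step in the observed-data likelihood described above.
inside = p_any(np.array([100.0, 100.0]))   # on a detector
outside = p_any(np.array([300.0, 300.0]))  # beyond the grid edge
```

This location dependence is what lets the spatial model absorb edge effects that bias non-spatial survival estimates, since apparent "losses" near the array edge are explained by low detectability rather than mortality.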