2020
DOI: 10.7554/elife.54507

Predicting geographic location from genetic variation with deep neural networks

Abstract: Most organisms are more closely related to nearby than distant members of their species, creating spatial autocorrelations in genetic data. This allows us to predict the location of origin of a genetic sample by comparing it to a set of samples of known geographic origin. Here, we describe a deep learning method, which we call Locator, to accomplish this task faster and more accurately than existing approaches. In simulations, Locator infers sample location to within 4.1 generations of dispersal and runs at le…
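The idea in the abstract, predicting where a sample came from by comparing its genotype to samples of known origin, can be illustrated with a deliberately simple stand-in. The sketch below is not Locator's deep neural network; it is a k-nearest-neighbour average over a reference panel, and every name in it (`genotype_distance`, `predict_location`, the toy data) is hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple

Genotype = Sequence[int]      # assumed encoding: allele counts per site (0, 1, or 2)
Coord = Tuple[float, float]   # assumed encoding: (x, y), e.g. longitude and latitude


def genotype_distance(a: Genotype, b: Genotype) -> int:
    """Manhattan distance between two allele-count vectors."""
    if len(a) != len(b):
        raise ValueError("genotypes must cover the same sites")
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_location(
    query: Genotype,
    ref_genotypes: Sequence[Genotype],
    ref_coords: Sequence[Coord],
    k: int = 3,
) -> Coord:
    """Average the coordinates of the k reference samples closest to the query.

    This is a nearest-neighbour stand-in for illustration only; it is not
    the neural network described in the paper.
    """
    if not ref_genotypes or len(ref_genotypes) != len(ref_coords):
        raise ValueError("need one coordinate per reference genotype")
    k = max(1, min(k, len(ref_genotypes)))
    # Rank reference samples by genetic distance to the query sample.
    ranked = sorted(
        range(len(ref_genotypes)),
        key=lambda i: genotype_distance(query, ref_genotypes[i]),
    )
    nearest = ranked[:k]
    x = sum(ref_coords[i][0] for i in nearest) / k
    y = sum(ref_coords[i][1] for i in nearest) / k
    return (x, y)


# Toy reference panel: two genetically distinct groups at different places.
toy_genotypes = [[0, 0, 0, 0], [0, 0, 0, 1], [2, 2, 2, 2], [2, 2, 2, 1]]
toy_coords = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
toy_prediction = predict_location([0, 0, 1, 0], toy_genotypes, toy_coords, k=2)
```

Because nearby individuals tend to be more closely related, the genetically closest reference samples are expected to lie near the query's true origin, which is the spatial autocorrelation the abstract refers to. Like the genotype-based approach discussed in the citation statements, this sketch does not require reference samples to be sorted into populations.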

Cited by 74 publications (84 citation statements)
References 53 publications
“…Despite high regional gene flow, we managed to locate the source of 2-3 of the 4 incursions below the cordon sanitaire. Considering the absence of discrete populations within the TSI, the genotype-based method of Locator (Battey et al, 2020a) was better suited to this dataset than the allele-frequency based method of assignPOP (Chen et al, 2018), though the latter still performed well for Inc-1 and Inc-2 and both methods successfully traced samples collected 10 months after the initial collections. Although Locator may best be suited to non-clustered sampling designs that are rare in island systems, Locator may sidestep issues relating to invasion age and frequent gene flow by not requiring that reference genotypes be sorted into populations.…”
Section: Discussion (mentioning)
confidence: 99%
“…To estimate source locations of the four incursive Ae. albopictus, we used two complementary methods in the programs assignPOP (Chen et al, 2018) and Locator (Battey et al, 2020a). assignPOP treats each village as a population, and generates assignment probabilities to each hypothetical population using a Monte Carlo assignment test with a support vector machine predictive model.…”
Section: Incursions Past the Cordon Sanitaire (mentioning)
confidence: 99%
“…In general, pairwise sequence divergence was lower in Galápagos-Ecuador comparisons (average dxy = 2.3 × 10⁻³) than between Galápagos-Peru comparisons (average dxy = 3.6 × 10⁻³; Figure 2C), and samples showing low genome-wide divergence were clustered in central Ecuador (similar patterns were observed using FST; SI Appendix, Table S10). To investigate potential source localities for invasive populations at a finer scale, we implemented the software Locator (Battey et al, 2020) which uses a machine learning algorithm to predict sample origins from genotype data. Locator predictions indicated 2 to 3 source regions for Galápagos PIM, although the exact locations varied across runs and depended on which island PIM collections were considered (SI Appendix, Fig.…”
Section: Genetic Data Support An Ecuadorian Origin For Most Invasive (mentioning)
confidence: 99%