The best of both worlds: Phylogenetic eigenvector regression and mapping

Filho, Diniz; Villalobos, Fabricio; Bini, Luis Maurício

doi:10.1590/s1415-475738320140391

Cited by 14 publications

(8 citation statements)

References 26 publications

(56 reference statements)

“…New proposed methods to fill sparse databases currently concerns about their degree of imputation error, that is how much imputed values deviate from the original trait values (Guénard et al 2013; Penone et al 2014; Schrodt et al 2015). We found that single and multiple phylogenetic imputation methods can be highly accurate, resulting in small deviations between imputed and observed values, as suggested by other authors (Guénard et al 2013; Penone et al 2014; Diniz-Filho et al 2015; Schrodt et al 2015). In addition, we found that imputation error was positively correlated with estimation errors but their relationship was not linear.…”

Section: Discussion (supporting)

confidence: 84%

Challenging the Raunkiaeran shortfall and the consequences of using imputed databases

Jardim

Bini

Diniz‐Filho

et al. 2016

Preprint

Self Cite

1. Given the prevalence of missing data on species' traits (the Raunkiaeran shortfall) and its importance for theoretical and empirical investigations, several methods have been proposed to fill sparse databases. Despite its advantages, imputation of missing data can introduce biases. Here, we evaluate the bias in descriptive statistics, model parameters, and phylogenetic signal estimation from imputed databases under different missing and imputing scenarios.

2. We simulated coalescent phylogenies and traits under Brownian Motion and different Ornstein-Uhlenbeck evolutionary models. Missing values were created using three scenarios: missing completely at random, missing at random but phylogenetically structured and missing at random but correlated with some other variable. We considered four methods for handling missing data: delete missing values, imputation based on observed mean trait value, Phylogenetic Eigenvectors Maps and Multiple Imputation by Chained Equations. Finally, we assessed estimation errors of descriptive statistics (mean, variance), regression coefficient, Moran's correlogram and Blomberg's K of imputed traits.

3. We found that percentage of missing data, missing mechanisms, Ornstein-Uhlenbeck strength and handling methods were important to define estimation errors. When data were missing completely at random, descriptive statistics were well estimated but Moran's correlogram and Blomberg's K were not well estimated, depending on handling methods. We also found that handling methods performed worse when data were missing at random, but phylogenetically structured. In this case adding phylogenetic information provided better estimates. Although the error caused by imputation was correlated with estimation errors, we found that such relationship is not linear with estimation errors getting larger as the imputation error increases.

4. Imputed trait databases could bias ecological and evolutionary analyses. We advise researchers to share their raw data along with their imputed database, flagging imputed data and providing information on the imputation process. Thus, users can and should consider the pattern of missing data and then look for the best method to overcome this problem. In addition, we suggest the development of phylogenetic methods that consider imputation uncertainty, phylogenetic autocorrelation and preserve the level of phylogenetic signal of the original data.
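
Two of the ingredients named in the abstract above, deletion of values completely at random and imputation by the observed mean, can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's simulation: the trait is drawn independently for each species (no phylogeny, no Brownian Motion or Ornstein-Uhlenbeck model), and the names `delete_mcar` and `mean_impute` are hypothetical. It shows a general statistical property: mean imputation leaves the sample mean unchanged but shrinks the sample variance.

```python
import random
import statistics


def delete_mcar(values, fraction, rng):
    """Set a random subset of entries (missing completely at random) to None."""
    n_missing = int(round(fraction * len(values)))
    missing = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing else v for i, v in enumerate(values)]


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


rng = random.Random(42)
# Hypothetical trait for 200 species, drawn independently (assumption: no phylogeny).
trait = [rng.gauss(0.0, 1.0) for _ in range(200)]
sparse = delete_mcar(trait, 0.3, rng)      # 30% of the values removed
imputed = mean_impute(sparse)

observed = [v for v in sparse if v is not None]
var_observed = statistics.variance(observed)
var_imputed = statistics.variance(imputed)
mean_shift = abs(statistics.fmean(imputed) - statistics.fmean(observed))
```

Because every imputed value sits exactly at the observed mean, the sum of squared deviations is unchanged while the sample size grows, so `var_imputed` equals `var_observed` scaled by (n_observed - 1) / (n_total - 1).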

“…Subsequently, we used principal coordinates analysis (PCoA, Legendre and Legendre 2012) to derive 'taxonomic vectors' describing species taxonomic relatedness. See also Diniz-Filho et al (1998) for a similar and Diniz-Filho et al (2015) for alternative approaches in a true phylogenetic context. Similarly, we calculated trait distances between species using Gower distance with the function 'gowdis' available in the FD R package (ver.…”

Section: Discussion (mentioning)

confidence: 99%

Ecological niche features override biological traits and taxonomic relatedness as predictors of occupancy and abundance in lake littoral macroinvertebrates

Heino

Tolonen

2018

Ecography

The degree to which species ecological and biological traits determine their distribution and abundance has intrigued ecologists for a long time, and it has seen a revival in recent years. This topic is important because it provides information about the determinants of species rarity and their conservation implications. We examined the effects of niche breadth, niche position, biological traits and taxonomic relatedness on the interspecific occupancy–abundance relationship, as well as on occupancy and abundance, in lake littoral macroinvertebrates. We sampled 48 lakes in a boreal lake district, found altogether 155 species, and calculated regional occupancy (as the proportion of sites occupied) and local abundance (as mean abundance at occupied sites) for each species. We determined niche position and niche breadth for each species using the outlying mean index analysis. Also, we calculated trait vectors and taxonomic vectors describing species trait similarity and taxonomic relatedness, respectively, using principal coordinates analysis. We found a strong positive occupancy–abundance relationship that was mostly explained by among‐species variation in niche position, followed by niche breadth. Instead, trait vectors and taxonomic vectors tended to be less important in affecting occupancy and abundance than the niche features. Our results strongly suggest that niche position, a measure of habitat availability for littoral macroinvertebrates, is the chief determinant of their occupancy and abundance. This finding has important implications for ecology and conservation of species, as species with marginal niche position, a reflection of low habitat availability, are both regionally rare and locally uncommon. Such species may face double jeopardy if environmental conditions change and affect their preferred marginal habitat types.
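
The derivation of 'taxonomic vectors' by principal coordinates analysis mentioned in this entry can be sketched as follows. This is a minimal sketch of classical PCoA using NumPy, not the authors' workflow (the citing study worked in R): the function name `pcoa`, the four-species distance matrix and its coding of taxonomic ranks are all hypothetical.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical PCoA of a symmetric distance matrix; returns species scores."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (d ** 2) @ centering   # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals[:n_axes] > 1e-10                   # drop null or negative axes
    return eigvecs[:, :n_axes][:, keep] * np.sqrt(eigvals[:n_axes][keep])


# Hypothetical taxonomic distances among four species
# (assumed coding: 1 = same genus, 2 = same family, 3 = same order).
taxo_dist = np.array([
    [0, 1, 2, 3],
    [1, 0, 2, 3],
    [2, 2, 0, 3],
    [3, 3, 3, 0],
], dtype=float)

vectors = pcoa(taxo_dist, n_axes=3)   # one row per species, one column per axis
```

The columns of `vectors` play the role of the taxonomic vectors: each is a synthetic variable that can enter a regression as a predictor, and together the retained axes reproduce the original distances among species.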

“…Finally, phylogenetic eigenvector mapping (PEM) combines both evolutionary dynamics and information on topology (Guénard et al 2013). PEM shares some similarities with PVR and, as such, it was conceived to improve over PVR because it additionally considers underlying evolutionary models (Diniz-Filho et al 2015). In PEM, the topology of the phylogeny is first coded as a binary influence matrix representing ancestor-descendant relationships.…”

Section: Phylogenetic Imputation Methods For Quantitative Traits An… (mentioning)

confidence: 99%

Assessing among‐lineage variability in phylogenetic imputation of functional trait datasets

et al. 2018

Phylogenetic imputation has recently emerged as a potentially powerful tool for predicting missing data in functional traits datasets. As such, understanding the limitations of phylogenetic modelling in predicting trait values is critical if we are to use them in subsequent analyses. Previous studies have focused on the relationship between phylogenetic signal and clade‐level prediction accuracy, yet variability in prediction accuracy among individual tips of phylogenies remains largely unexplored. Here, we used simulations of trait evolution along the branches of phylogenetic trees to show how the accuracy of phylogenetic imputations is influenced by the combined effects of 1) the amount of phylogenetic signal in the traits and 2) the branch length of the tips to be imputed. Specifically, we conducted cross‐validation trials to estimate the variability in prediction accuracy among individual tips on the phylogenies (hereafter ‘tip‐level accuracy’). We found that under a Brownian motion model of evolution (BM, Pagel's λ = 1), tip‐level accuracy rapidly decreased with increasing tip branch‐lengths, and only tips of approximately 10% or less of the total height of the trees showed consistently accurate predictions (i.e. cross‐validation R‐squared >0.75). When phylogenetic signal was weak, the effect of tip branch‐length was reduced, becoming negligible for traits simulated with λ < 0.7, where accuracy was in any case low. Our study shows that variability in prediction accuracy among individual tips of the phylogeny should be considered when evaluating the reliability of phylogenetically imputed trait values. To address this challenge, we describe a Monte Carlo‐based method that allows one to estimate the expected tip‐level accuracy of phylogenetic predictions for continuous traits. Our approach identifies gaps in functional trait datasets for which phylogenetic imputation performs poorly, and will help ecologists to design more efficient trait collection campaigns by focusing resources on lineages whose trait values are more uncertain.
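
The citation statement for this entry notes that PEM first codes the topology of the phylogeny as a binary influence matrix of ancestor-descendant relationships. The sketch below illustrates that coding step only, under assumptions: the tree is a hypothetical three-tip example stored as a child-to-parent mapping, each edge is named after its child vertex, and `influence_matrix` is an illustrative helper rather than a function from any PEM software. The later PEM steps (edge weighting under an evolutionary model and extraction of eigenvectors) are not shown.

```python
def influence_matrix(parent):
    """Binary influence matrix: rows are vertices, columns are edges.

    Each edge is named after its child vertex. An entry is 1 when the
    edge lies on the path from the root to the vertex, and 0 otherwise.
    """
    edges = sorted(parent)                                  # one edge per non-root vertex
    vertices = sorted(set(parent) | set(parent.values()))   # tips, nodes and root
    rows = {}
    for vertex in vertices:
        on_path = set()
        node = vertex
        while node in parent:                               # climb until the root is reached
            on_path.add(node)
            node = parent[node]
        rows[vertex] = [1 if edge in on_path else 0 for edge in edges]
    return vertices, edges, rows


# Hypothetical three-tip tree: ((A,B)n1,C)root
parent = {"A": "n1", "B": "n1", "n1": "root", "C": "root"}
vertices, edges, rows = influence_matrix(parent)

for vertex in vertices:
    print(vertex, rows[vertex])
```

Tips A and B share the column for edge `n1`, which is how common ancestry is recorded in the matrix; tip C shares no edge with them, and the root row is all zeros.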
