John Robinson scite author profile

This paper describes a method of generalized discriminant analysis based on a dissimilarity matrix to test for differences in a priori groups of multivariate observations. Use of classical multidimensional scaling produces a low-dimensional representation of the data for which Euclidean distances approximate the original dissimilarities. The resulting scores are then analysed using discriminant analysis, giving tests based on the canonical correlations. The asymptotic distributions of these statistics under permutations of the observations are shown to be invariant to changes in the distributions of the original variables, unlike the distributions of the multi-response permutation test statistics which have been considered by other workers for testing differences among groups. This canonical method is applied to multivariate fish assemblage data, with Monte Carlo simulations to make power comparisons and to compare theoretical results and empirical distributions. The paper proposes classification based on distances. Error rates are estimated using cross-validation.
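The pipeline in this abstract (dissimilarities, then classical multidimensional scaling, then a canonical test under permutation) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the choice of the sum of squared canonical correlations as the statistic, the number of retained axes `k`, and the Monte Carlo permutation p-value are all assumptions made for the example.

```python
import numpy as np

def classical_mds(d, k):
    """Principal coordinates of an n x n dissimilarity matrix (at most k axes)."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities: B = -1/2 * J D^2 J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # Keep the k largest eigenvalues, dropping any that are not positive
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:k]
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > 1e-10
    return vecs[:, keep] * np.sqrt(vals[keep])

def trace_statistic(scores, groups):
    """Sum of squared canonical correlations between scores and group indicators."""
    groups = np.asarray(groups)
    x = scores - scores.mean(axis=0)
    labels = np.unique(groups)
    g = (groups[:, None] == labels[None, :]).astype(float)
    g = g - g.mean(axis=0)
    # Orthonormal bases of the two column spaces; the singular values of
    # their cross-product are the canonical correlations.
    qx, _ = np.linalg.qr(x)
    qg, _ = np.linalg.qr(g[:, :-1])  # centred indicators are rank-deficient by one
    s = np.linalg.svd(qx.T @ qg, compute_uv=False)
    return float(np.sum(s ** 2))

def permutation_test(d, groups, k, n_perm=199, seed=0):
    """Monte Carlo permutation p-value for group differences (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    scores = classical_mds(d, k)
    observed = trace_statistic(scores, groups)
    hits = sum(
        trace_statistic(scores, rng.permutation(groups)) >= observed - 1e-12
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)
```

Permuting the group labels leaves the scores fixed, so the scaling step only needs to be done once per data set.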


Permutation Tests for Linear Models

Anderson

Robinson

2001

Aus NZ J of Statistics

373

283


Several approximate permutation tests have been proposed for tests of partial regression coefficients in a linear model based on sample partial correlations. This paper begins with an explanation and notation for an exact test. It then compares the distributions of the test statistics under the various permutation methods proposed, and shows that the partial correlations under permutation are asymptotically jointly normal with means 0 and variances 1. The method of Freedman & Lane (1983) is found to have asymptotic correlation 1 with the exact test, and the other methods are found to have smaller correlations with this test. Under local alternatives the critical values of all the approximate permutation tests converge to the same constant, so they all have the same asymptotic power. Simulations demonstrate these theoretical results.
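The Freedman & Lane (1983) scheme mentioned in this abstract permutes residuals from the reduced model (the one without the coefficient under test) and adds them back to the reduced-model fit. The sketch below is an illustration under stated assumptions rather than the paper's code: the names `partial_corr` and `freedman_lane`, the two-sided comparison on the absolute partial correlation, and the automatically added intercept are choices made for the example.

```python
import numpy as np

def _residuals(v, z):
    """Residuals of v after least-squares regression on the columns of z."""
    coef, *_ = np.linalg.lstsq(z, v, rcond=None)
    return v - z @ coef

def partial_corr(y, x, z):
    """Sample partial correlation of y and x given the columns of z."""
    ry, rx = _residuals(y, z), _residuals(x, z)
    return float(ry @ rx / np.sqrt((ry @ ry) * (rx @ rx)))

def freedman_lane(y, x, z, n_perm=999, seed=0):
    """Approximate permutation test of the coefficient of x in y ~ 1 + z + x."""
    rng = np.random.default_rng(seed)
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    z = np.column_stack([np.ones(len(y)), z])  # assumption: intercept added here
    observed = partial_corr(y, x, z)
    # Reduced model: y regressed on z only
    resid = _residuals(y, z)
    fitted = y - resid
    count = 0
    for _ in range(n_perm):
        # Permute reduced-model residuals and rebuild a pseudo-response
        y_star = fitted + rng.permutation(resid)
        if abs(partial_corr(y_star, x, z)) >= abs(observed) - 1e-12:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```

A call such as `freedman_lane(y, x, z, n_perm=999)` returns the observed partial correlation and a Monte Carlo p-value; `z` may be a single covariate or a matrix of covariates.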


The Biasing Effect of Compositional Heterogeneity on Phylogenetic Estimates May be Underestimated

et al. 2004


Matched-pairs tests of homogeneity with applications to homologous nucleotide sequences

et al. 2006


Mixture Models of Nucleotide Sequence Evolution that Account for Heterogeneity in the Substitution Process Across Sites and Across Lineages

Jayaswal

Wong

Robinson

et al. 2014


Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise.
The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear genomes of Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. castellii, S. kluyveri, S. bayanus, and Candida albicans. Our results show that second codon sites in the ancestral genome of these species contained 49.1% invariable sites, 39.6% variable sites belonging to one rate category (V1), and 11.3% variable sites belonging to a second rate category (V2). The ancestral nucleotide content was found to differ markedly across these three sets of sites, and the evolutionary processes operating at the variable sites were found to be non-SRH and best modeled by a combination of eight edge-specific rate matrices (four for V1 and four for V2). The number of substitutions per site at the variable sites also differed markedly, with sites belonging to V1 evolving slower than those belonging to V2 along the lineages separating the seven species of Saccharomyces. Finally, sites belonging to V1 appeared to have ceased evolving along the lineages separating S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus, implying that ...



Generalized discriminant analysis based on distances
