We provide a complete description of the covariance matrices consistent with a Gaussian latent tree model for any tree. We then present techniques for using these constraints to assess whether observed data are compatible with that Gaussian latent tree model. Our method does not require first fitting such a tree. We demonstrate the usefulness of the inverse-Wishart distribution for performing preliminary assessments of tree-compatibility using semialgebraic constraints. Using results from Drton et al. (2008), we then provide the moments required for test statistics that assess adherence to the equality constraints. These test statistics are shown to be effective even for small sample sizes and can be easily adjusted to test either the entire model or only certain macrostructures hypothesized within the tree. We illustrate our exploratory tetrad analysis using a linguistic application and our confirmatory tetrad analysis using a biological application.
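As a rough illustration of the kind of equality constraint involved in a tetrad analysis, the sketch below computes the three classical tetrad differences for four observed variables. The abstract does not state the exact form of its constraints or test statistics, so the formulas, the function name `tetrads`, and the example loadings and noise variances are all assumptions made for illustration. For a star-shaped tree with a single latent parent, every off-diagonal covariance factorizes as a product of loadings, so all three tetrads vanish.

```python
import numpy as np


def tetrads(cov, i, j, k, l):
    """Return the three tetrad differences for the quartet (i, j, k, l).

    Hypothetical helper: uses the classical tetrad form, which may differ
    from the exact parametrization used in the paper.
    """
    s = np.asarray(cov, dtype=float)
    return (
        s[i, j] * s[k, l] - s[i, k] * s[j, l],
        s[i, j] * s[k, l] - s[i, l] * s[j, k],
        s[i, k] * s[j, l] - s[i, l] * s[j, k],
    )


# Assumed example: one unit-variance latent variable with four observed leaves.
loadings = np.array([0.9, 0.8, 0.7, 0.6])   # illustrative values only
noise = np.array([0.5, 0.4, 0.3, 0.2])      # illustrative values only
sigma = np.outer(loadings, loadings) + np.diag(noise)

# Off-diagonal entries are loadings[i] * loadings[j], so every tetrad is zero
# up to floating-point error.
star_tetrads = tetrads(sigma, 0, 1, 2, 3)

# Perturbing a single covariance breaks the factorization, and the tetrads
# involving that entry no longer vanish.
perturbed = sigma.copy()
perturbed[0, 1] = perturbed[1, 0] = sigma[0, 1] + 0.1
perturbed_tetrads = tetrads(perturbed, 0, 1, 2, 3)

print(star_tetrads)
print(perturbed_tetrads)
```

With sample data, the population covariance would be replaced by a sample covariance matrix. The tetrads are then only approximately zero under the model, which is why test statistics built from their moments are needed.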
Evolutionary models of languages are usually considered to take the form of trees. With the development of so-called tree constraints the plausibility of the tree model assumptions can be addressed by checking whether the moments of observed variables lie within regions consistent with trees. In our linguistic application, the data set comprises acoustic samples (audio recordings) from speakers of five Romance languages or dialects. We wish to assess these functional data for compatibility with a hereditary tree model at the language level. A novel combination of canonical function analysis (CFA) with a separable covariance structure provides a method for generating a representative basis for the data. This resulting basis is formed of components which emphasize language differences whilst maintaining the integrity of the observational language-groupings. A previously unexploited Gaussian tree constraint is then applied to component-by-component projections of the data to investigate adherence to an evolutionary tree. The results indicate that while a tree model is unlikely to be suitable for modeling all aspects of the acoustic linguistic data, certain features of the spoken Romance languages highlighted by the separable-CFA basis may indeed be suitably modeled as a tree.