Partial nucleotide sequences for the 5S and 5.8S rRNAs from the dinoflagellate Crypthecodinium cohnii have been determined, using a rapid chemical sequencing method, for the purpose of studying dinoflagellate phylogeny. The 5S RNA sequence shows the most homology (75%) with the 5S sequences of higher animals and the least homology (less than 60%) with prokaryotic sequences. In addition, it lacks certain residues which are highly conserved in prokaryotic molecules but are generally missing in eukaryotes. These findings suggest a distant relationship between dinoflagellates and the prokaryotes. Using two different sequence alignments and several different methods for selecting an optimum phylogenetic tree for a collection of 5S sequences including higher plants and animals, fungi, and bacteria in addition to the C. cohnii sequence, the dinoflagellate lineage was joined to the tree at the point of the plant-animal divergence, well above the branching point of the fungi. This result is of interest because it implies that the well-documented absence in dinoflagellates of histones and the typical nucleosomal subunit structure of eukaryotic chromatin is the result of secondary loss, and not an indication of an extremely primitive state, as was previously suggested. Computer simulations of 5S RNA evolution have been carried out in order to demonstrate that the above-mentioned phylogenetic placement is not likely to be the result of random sequence convergence. We have also constructed a phylogeny for 5.8S RNA sequences in which plants, animals, fungi and the dinoflagellates are again represented. While the order of branching on this tree is the same as in the 5S tree for the organisms represented, because it lacks prokaryotes, the 5.8S tree cannot be considered a strong independent confirmation of the 5S result. 
Moreover, 5.8S RNA appears to have experienced very different rates of evolution in different lineages, indicating that it may not be the best indicator of evolutionary relationships. We have also considered the existing biological data regarding dinoflagellate evolution in relation to our molecular phylogenetic evidence.
Evolutionary trees are usually calculated from comparisons of protein or nucleic acid sequences from present-day organisms by use of algorithms that use only the difference matrix, where the difference matrix is constructed from the sequence differences between pairs of sequences from the organisms. The difference matrix alone cannot define uniquely the correct position of the ancestor of the present-day organisms (root of the tree). Furthermore, methods using the difference matrix alone often fail to give the correct pattern of tree branching (topology) when the different sequences evolve at different rates. Only for equal rates of evolution can the difference matrix (when used with the so-called matrix method) yield exactly the correct topology and root. In this paper we present a method for calculating evolutionary trees from sequence data that uses, along with the difference matrix, the rate of evolution of the various sequences from their common ancestor. It is proven analytically that this method uniquely determines both the correct tree topology and root in theory for unequal rates of sequence evolution. How one would estimate an ancestral sequence to be used in the method is discussed in particular for the 5S RNA sequences from prokaryotes and eukaryotes and for ferredoxin sequences. The recent proliferation of protein and nucleic acid sequences, due in part to the development of new sequencing techniques (1, 2), has led to a renewed interest in the evolution of these sequences and the organisms from which they came. We wish to discuss here some problems with the commonly used method (3, 4) (the matrix method) for calculating evolutionary trees from molecular sequence data and to present an alternative method that, in theory, eliminates these problems. 
Because the analytic proof of our method depends on a knowledge of existing methods, we will briefly review what is necessary from the existing literature, then discuss the problems with the matrix method in some detail. An evolutionary tree representation of nucleic acid or protein sequence differences among several organisms should give the following information: (i) the correct pattern of branching of the various present-day organisms from one another (that is, the correct tree "topology"); (ii) the correct position on the tree of the common ancestor to all the present-day organisms (that is, the correct tree "root"); and (iii) of secondary importance, a reasonable estimate of the number of mutations along each of the tree branches (branch mutations). One type of such a representation is presented in Fig. 1a for five hypothetical present-day organisms, A, B, C, D, and E, and their common ancestor X. In this representation, the topology is shown by the order in which the branches connect: C and D are most closely related in the sense that they connect first, then CD connects to E, and A connects to B. The root of the tree with the common ancestor X is shown to be between the groups AB and CDE. This pictorial representation may also be represented...
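The difference matrix referred to in the excerpt above can be illustrated with a minimal sketch. The example assumes the sequences are already aligned to equal length and counts one difference per mismatched position; the function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def difference_matrix(seqs):
    """Build a symmetric matrix of pairwise differences.

    seqs maps an organism name to its aligned sequence.
    The result is a nested dict: matrix[a][b] is the number of differences.
    """
    names = list(seqs)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = count_differences(seqs[a], seqs[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# Toy aligned sequences (invented for illustration only).
toy = {
    "A": "GCUACGGCCA",
    "B": "GCUACGGCUA",
    "C": "GCAACGGACA",
}

for name, row in difference_matrix(toy).items():
    print(name, row)
```

Here A and B differ at one position, A and C at two, and B and C at three. As the excerpt notes, these pairwise counts on their own do not fix the position of the root.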
Three new methods for constructing evolutionary trees from molecular sequence data are presented. These methods are based on a theory for correcting for non-constant evolutionary rates (Klotz et al. 1979; Klotz and Blanken 1981). Extensive computer simulations were run to compare these new methods to the commonly used criteria of Dayhoff (1978) and Fitch and Margoliash (1967). The results of these simulations showed that two of the new methods performed as well as Dayhoff's criterion, significantly better than that of Fitch and Margoliash, and as well as a simple variation of the latter (Prager and Wilson 1978) where any topology containing negative branch mutations is discarded. However, no method yielded the correct topology all of the time, which demonstrated the need to determine confidence estimates in a particular result when evolutionary trees are determined from sequence data.
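How a negative branch value can arise from a difference matrix can be shown with a small sketch. It assumes the simplest case of three sequences joined at a single internal node, where the three branch lengths follow directly from the three pairwise differences. This is an illustration only, not the procedure of any of the cited papers, and the function names are hypothetical.

```python
def three_point_branches(d_ab, d_ac, d_bc):
    """Solve a + b = d_ab, a + c = d_ac, b + c = d_bc for the branches a, b, c."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


def has_negative_branch(branches):
    """Return True if any branch length is below zero."""
    return any(length < 0 for length in branches)


# Differences that fit a tree: all branches are non-negative.
consistent = three_point_branches(1, 2, 3)

# Differences that violate the triangle inequality (5 > 1 + 2):
# the branch leading to A comes out negative.
inconsistent = three_point_branches(1, 2, 5)

print(consistent, has_negative_branch(consistent))
print(inconsistent, has_negative_branch(inconsistent))
```

A filter of this kind only tests whether a set of branch values is physically sensible; it says nothing by itself about which topology is correct.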