The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. 
The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance.
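The stop-codon usage described above can be illustrated with a small translation sketch. It relies on background not stated in the abstract itself: in the ciliate nuclear genetic code used by Tetrahymena, UAA and UAG are read as glutamine, which is why UGA is the only stop codon. The table construction, the `translate` helper, and the toy sequence are illustrative assumptions, and the sketch does not model context-dependent selenocysteine insertion at UGA.

```python
from itertools import product

# Codons in NCBI order (first base varies slowest, bases ordered T, C, A, G).
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Standard genetic code in the same order; '*' marks a stop codon.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = dict(zip(CODONS, STANDARD_AAS))

# Ciliate nuclear code: TAA and TAG encode glutamine (Q); TGA stays a stop.
# Selenocysteine insertion at TGA is context dependent and NOT modeled here.
CILIATE = dict(STANDARD, TAA="Q", TAG="Q")


def translate(dna, table):
    """Translate a DNA coding sequence in frame 0 until the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequence: ATG TAA CAG TGA
toy = "ATGTAACAGTGA"
standard_protein = translate(toy, STANDARD)
ciliate_protein = translate(toy, CILIATE)
ciliate_stops = [codon for codon, aa in CILIATE.items() if aa == "*"]
```

Under the standard table the toy sequence terminates at the in-frame TAA, whereas under the ciliate table translation continues to the TGA. This is one reason gene prediction for this genome needs a ciliate-specific codon table.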
A cDNA clone for a physiologically regulated Tetrahymena cysteine protease gene was sequenced. The nucleotide sequence predicts that the clone encodes a 336-amino acid protein composed of a 19-residue N-terminal signal sequence followed by a 107-residue propeptide and a 210-residue mature protein. Comparison of the deduced amino acid sequence of the protein with those of other cysteine proteases revealed a highly conserved interspersed amino acid motif in the propeptide region of the protein, the ERFNIN motif. The motif was present in all of the cysteine proteases in the data base with the exception of the cathepsin B-like proteins, which have shorter propeptides. Differences in the propeptides and in conserved amino acids of the mature proteins suggest that the ERFNIN proteases and the cathepsin B-like proteases constitute two distinct subfamilies within the cysteine proteases.
The cysteine proteases are a family of enzymes that play an important role in intracellular protein degradation. These proteases and their cDNA clones have been isolated from phylogenetically diverse organisms ranging from slime mold to mammals. The tertiary structures of two plant cysteine proteases, papain and actinidin, have been solved (1, 2). The enzymes have two protein domains that come together to form the active site. Amino acid sequence homologies suggest this double domain structure is conserved in the animal thiol proteases cathepsins B, H, and L (3).
The phylogenetic range of organisms for which the sequences of cysteine protease genes are known was extended by determination of the sequence of a cDNA clone for a gene from a ciliated protozoan, Tetrahymena thermophila. Comparison of the deduced amino acid sequence to those of known cysteine proteases revealed the presence of an amino acid motif in the propeptide region consisting of highly conserved amino acids interspersed with variable residues.
The motif was present in 15 of 20 cysteine proteases in the EMBL/GenBank data base (August 1992). The five proteases that lacked the motif were all cathepsin B-like enzymes. Recognition of the differences in the propeptide region prompted comparison of the mature proteins. Alignment of the amino acid sequences of the proteases as two separate groups allowed identification of amino acids that are highly conserved among the proteases with the propeptide motif or among the cathepsin B-like proteases but strikingly different between the two groups. We suggest that the proteins with the interspersed motif and the cathepsin B-like proteases represent two distinct classes of cysteine proteases that can be distinguished by both propeptide and mature protein structure.
MATERIALS AND METHODS
Tetrahymena clone pCyP (formerly BC11) is a cDNA clone of an RNA that is expressed in starved, but not growing, cells (4, 5). The clone was isolated from a cDNA library of RNA from starved cells cloned into the Pst I site of pUC9 (4). DNA fragments were subcloned into pBluescript for sequencing. The sequence was scanned for open reading frames by usi...
In the ciliated protozoan Tetrahymena thermophila, extensive DNA elimination is associated with differentiation of the somatic macronucleus from the germline micronucleus. This study describes the isolation and complete characterization of Tlr elements, a family of approximately 30 micronuclear DNA sequences that are efficiently eliminated from the developing macronucleus. The data indicate that Tlr elements consist of an approximately 22 kb internal region flanked by complex and variable termini. The Tlr internal region is highly conserved among family members and contains 15 open reading frames, some of which resemble genes encoded by transposons and viruses. The Tlr termini appear to be long inverted repeats consisting of (i) a variable region containing multiple direct repeats, which differ in number and sequence from element to element, and (ii) a conserved terminal 47 bp sequence. Taken together, these results suggest that Tlr elements comprise a novel family of mobile genetic elements that are confined to the Tetrahymena germline genome. Possible mechanisms of developmentally programmed Tlr elimination are discussed.
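The idea of terminal inverted repeats can be made concrete with a minimal sketch: the 5' end of an element should match the reverse complement of its 3' end. The function names and the toy sequence below are hypothetical. The sketch checks only perfect, gap-free matches, so it does not capture the variable, repeat-containing termini described for real Tlr elements, and it is not the method used in the study.

```python
# Complement map for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def terminal_inverted_repeat_length(seq):
    """Length of the perfect inverted repeat shared by the two ends of seq.

    Walks inward from both ends and counts positions where the base at the
    5' end pairs with the complement of the mirrored base at the 3' end.
    """
    seq = seq.upper()
    length = 0
    for i in range(len(seq) // 2):
        if seq[i] != COMPLEMENT[seq[-1 - i]]:
            break
        length += 1
    return length


# Hypothetical toy element: 7 bp terminus, 5 bp internal region, and the
# reverse complement of the terminus at the other end.
left_terminus = "GATTACA"
toy_element = left_terminus + "CCCCC" + reverse_complement(left_terminus)
tir_length = terminal_inverted_repeat_length(toy_element)
```

For real elements, an alignment-based comparison that tolerates mismatches and indels would be a more realistic choice than this exact-match walk.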