We present a method for predicting protein folding class based on global protein chain description and a voting process. Selection of the best descriptors was achieved by a computer-simulated neural network trained on a data base consisting of 83 folding classes. Protein-chain descriptors include overall composition, transition, and distribution of amino acid attributes, such as relative hydrophobicity, predicted secondary structure, and predicted solvent exposure. Cross-validation testing was performed on 15 of the largest classes. The test shows that proteins were assigned to the correct class (correct positive prediction) with an average accuracy of 71.7%, whereas the inverse prediction of proteins as not belonging to a particular class (correct negative prediction) was 90-95% accurate. When tested on 254 structures used in this study, the top two predictions contained the correct class in 91% of the cases.
Examination of three-dimensional (3D) structures of proteins determined by x-ray diffraction and NMR has shown that the variety of folding patterns of proteins is significantly restricted (1, 2). Since protein sequence information grows significantly faster than information on protein 3D structure, the need for predicting the folding pattern of a given protein sequence naturally arises. Since the first relatively full classification of folding patterns of globular proteins (3), researchers have developed various schemes for classification of protein 3D structures (4-6) that are essentially based on the same spatial motifs.
If the prediction is restricted to a small number of structural classes (less than five), a prediction performance >70% can be easily achieved by using various methods based on a simple representation of sequences as vectors of a small number of general parameters.
In the simplest classification, proteins are usually described in terms of the following "tertiary super classes": all α (proteins have only α-helix secondary structure), all β (mainly β-sheet secondary structure), α+β (α-helix and β-strand secondary structure segments that do not mix), α/β (mixed or alternating segments of α-helical and β-strand secondary structure), and irregular (7-9). Several statistical methods were developed to predict whether a protein belongs to one of these classes (10-17). In a recent study on predicting protein structural class (all α, all β, or composed of α and β elements) from amino acid composition and hydrophobic pattern frequency information using computer-simulated neural networks (NNs) and statistical clustering, Metfessel et al. (18) obtained a prediction accuracy of 80.2%. Consideration of specific features of folding classes in the form of so-called hidden Markov models or probabilistic grammars allows a >2-fold increase in the number of classes of recognition (9). This method accurately predicts 12 classes; however, the study gives test results only for 16 sequences.
It is obvious that difficulty of folding pattern prediction grows rapidly with the number...
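As a rough illustration of the composition, transition, and distribution descriptors named in the abstract above, the following Python sketch computes them for one attribute (relative hydrophobicity). It is not the authors' implementation: the three-way residue grouping, the function names, and the choice of first/25%/50%/75%/last positions for the distribution descriptor are all assumptions made for the example.

```python
from collections import Counter

# ASSUMPTION: an illustrative three-way hydrophobicity grouping,
# not necessarily the partition used in the paper.
GROUPS = {
    "P": "RKEDQN",   # polar
    "N": "GASTPHY",  # neutral
    "H": "CLVIMFW",  # hydrophobic
}
RESIDUE_TO_GROUP = {aa: g for g, aas in GROUPS.items() for aa in aas}


def encode(seq):
    """Replace each residue by its group letter (P, N, or H)."""
    return "".join(RESIDUE_TO_GROUP[aa] for aa in seq.upper())


def composition(enc):
    """Fraction of the chain that falls into each group."""
    counts = Counter(enc)
    return {g: counts[g] / len(enc) for g in GROUPS}


def transition(enc):
    """Frequency of adjacent positions that switch between two groups."""
    counts = {"HN": 0, "HP": 0, "NP": 0}
    for a, b in zip(enc, enc[1:]):
        if a != b:
            counts["".join(sorted((a, b)))] += 1
    n = len(enc) - 1
    return {k: c / n for k, c in counts.items()}


def distribution(enc):
    """Relative chain positions of the first, 25%, 50%, 75%, and last
    residue of each group (hypothetical convention for this sketch)."""
    out = {}
    for g in GROUPS:
        pos = [i + 1 for i, c in enumerate(enc) if c == g]
        if not pos:
            out[g] = [0.0] * 5
            continue
        marks = []
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            idx = max(int(frac * len(pos)) - 1, 0)
            marks.append(pos[idx] / len(enc))
        out[g] = marks
    return out


if __name__ == "__main__":
    enc = encode("RRLLGG")
    print(enc, composition(enc), transition(enc), distribution(enc))
```

Concatenating the three outputs gives a fixed-length vector for a chain of any length, which is the kind of global description that a classifier such as a neural network could take as input.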
The crystal structure of the RNA dodecamer duplex (r-GGACUUCGGUCC)2 has been determined. The dodecamers stack end-to-end in the crystal, simulating infinite A-form helices with only a break in the phosphodiester chain. These infinite helices are held together in the crystal by hydrogen bonding between ribose hydroxyl groups and a variety of donors and acceptors. The four noncomplementary nucleotides in the middle of the sequence did not form an internal loop, but rather a highly regular double-helix incorporating the non-Watson-Crick base pairs, G.U and U.C. This is the first direct observation of a U.C (or T.C) base pair in a crystal structure. The U.C pairs each form only a single base-base hydrogen bond, but are stabilized by a water molecule which bridges between the ring nitrogens and by four waters in the major groove which link the bases and phosphates. The lack of distortion introduced in the double helix by the U.C mismatch may explain its low efficiency of repair in DNA. The G.U wobble pair is also stabilized by a minor-groove water which bridges between the unpaired guanine amino and the ribose hydroxyl of the uracil. This structure emphasizes the importance of specific hydrogen bonding between not only the nucleotide bases, but also the ribose hydroxyls, phosphate oxygens and tightly bound waters in stabilization of the intramolecular and intermolecular structures of double helical RNA.
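The pairing pattern described above can be checked with a short Python sketch that aligns the dodecamer antiparallel against a copy of itself and labels each position as Watson-Crick or not. The function name and the tuple layout are hypothetical, and the sketch considers sequence complementarity only, not the hydrogen-bonding geometry reported for the crystal.

```python
# Canonical Watson-Crick pairs for RNA.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def duplex_pairs(strand):
    """Pair residue i of one strand with residue n+1-i of an identical,
    antiparallel strand (1-based numbering), as in a self-complementary duplex."""
    n = len(strand)
    pairs = []
    for i in range(n):
        a, b = strand[i], strand[n - 1 - i]
        kind = "WC" if (a, b) in WATSON_CRICK else "non-WC"
        pairs.append((i + 1, n - i, a, b, kind))
    return pairs


if __name__ == "__main__":
    for i, j, a, b, kind in duplex_pairs("GGACUUCGGUCC"):
        print(f"{a}{i}-{b}{j}: {kind}")
```

Under this alignment only positions 5 to 8 (UUCG) lack a Watson-Crick partner, pairing as U with G and U with C, which is consistent with the G.U and U.C pairs named in the abstract.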
The x-ray crystal structure of a 417-nt ribonuclease P RNA from Bacillus stearothermophilus was solved to 3.3-Å resolution. This RNA enzyme is constructed from a number of coaxially stacked helical domains joined together by local and long-range interactions. These helical domains are arranged to form a remarkably flat surface, which is implicated by a wealth of biochemical data in the binding and cleavage of the precursors of transfer RNA substrate. Previous photoaffinity crosslinking data are used to position the substrate on the crystal structure and to identify the chemically active site of the ribozyme. This site is located in a highly conserved core structure formed by intricately interlaced long-range interactions between interhelical sequences.
ribozyme | RNA crystallography | tRNA processing
RNase P catalyzes hydrolysis of a phosphodiester bond in precursors of transfer RNA (tRNA) to form the 5'-phosphorylated mature tRNA with the release of a 5'-precursor fragment (1, 2). RNase P homologs occur in all organisms, and the cellular RNase P always is a ribonucleoprotein that consists of one large RNA and one or more protein components. In bacteria, RNase P is typically comprised of a 350- to 400-nt RNA and one ≈120-aa basic protein. Although both RNA and protein components are necessary for cell viability, in vitro at high salt concentrations, bacterial RNase P RNA can act as a catalyst independently of protein (3). Bacterial RNase P is a ribozyme, an RNA-based enzyme.
Knowledge of the structure of RNase P RNA is essential for understanding its function, and structure has been the focus of numerous studies of the RNA. Phylogenetic comparative analyses of RNase P RNA sequences have established the secondary and some tertiary structure of the RNA in a broad diversity of organisms (4-8). Photochemical crosslinking studies provided structural information to orient the helical elements and identified nucleotides associated with the active site of the RNA (9, 10).
There are two major structural types of bacterial RNase P RNA, A (ancestral) and B (Bacillus), which differ in a number of structural elements attached to a homologous conserved structure. About two-thirds of any bacterial RNase P RNA is shown by sequence covariations to be involved in Watson-Crick base-pairing interactions, but the interactions that form the global structure have been speculative.
To gain a better understanding of bacterial RNase P, we crystallized and solved the structure of a 417-nt B-type RNase P RNA from Bacillus stearothermophilus, a moderately thermophilic, low G+C Gram-positive bacterium. Although the structure does not yet explain the chemical mechanism of catalysis, it is in agreement with a wealth of available biochemical and comparative data, and it provides a structural context for the chemically active site of this ribozyme.
Materials and Methods
RNA Purification, Crystallization, and Data Collection. As detailed in supporting information, which is published on the PNAS web site, RNA was transcribed in vitro with T7 phage RNA ...
Many carcinogenic as well as chemotherapeutic agents cause covalent linkages between complementary strands of DNA. If unrepaired, DNA crosslinks are blocks to DNA replication and transcription and therefore represent potentially lethal lesions to the cell. Genetic studies of Escherichia coli have demonstrated that the repair enzyme ABC excision nuclease, coded for by the three unlinked genes, uvrA, uvrB, and uvrC, plays a crucial role in DNA crosslink repair. To study the molecular events of ABC excision nuclease-mediated crosslink repair, we have engineered a DNA fragment with a psoralen-DNA interstrand crosslink at a defined position, digested this substrate with pure enzyme, and analyzed the reaction products on DNA sequencing gels. We find that the excision nuclease (i) cuts only one of the two strands involved in the crosslink, (ii) incises the crosslink by hydrolyzing the ninth phosphodiester bond 5' and the third phosphodiester bond 3' to the furan-side thymine of the crosslink, and (iii) does not produce double-strand breaks at any significant level. Based on these data, we present a model by which ABC excision nuclease initiates crosslink repair in vivo.
Several agents have been shown to produce DNA interstrand crosslinks (1) including mitomycin C (2), nitrous acid (3), nitrogen and sulfur mustards (4), formaldehyde (5), cisplatin (6), and psoralen plus light (7, 8). Because of their predictable and highly characterized reactivity with DNA (9, 10), psoralens have been used most extensively to study the repair of DNA crosslinks in several organisms (11-17).
These three-ring heterocyclic aromatic compounds (furocoumarins) contain two reactive double bonds that, upon absorption of near UV light (320-360 nm), photoreact with the 5,6 double bond in pyrimidines to form both monoadducts and interstrand crosslinked diadducts, primarily at 5' TpA 3' and to a lesser extent at 5' ApT 3' sequences (18).
Genetic and biochemical studies of Escherichia coli have identified several proteins necessary for the removal of psoralen crosslinks, including products of the recA, uvrA, uvrB, uvrC, uvrD, and polA genes (12, 19-26). Furthermore, it has been estimated that in the uvrA recA double mutant, one DNA crosslink per genome is lethal (24, 25). Based on these data, several models of psoralen crosslink repair have been proposed that incorporate components of both the RecA-dependent recombination and the nucleotide excision repair pathways (12, 23).
In E. coli, the incision and excision steps of nucleotide excision repair are mediated by a single enzyme, the ABC excision nuclease (see ref. 27). This enzyme is composed of three proteins, UvrA (Mr, 103,874), UvrB (Mr, 76,118), and UvrC (Mr, 66,038), which act in concert to cleave both the eighth phosphodiester bond 5' and the fourth or fifth phosphodiester bond 3' to UV-induced cyclobutane pyrimidine dimers and 6-4 pyrimidine-pyrimidone intrastrand diadducts (28). The goal of this study was to characterize the molecular events of ABC excision nuclease-m...
X-ray crystallographic studies indicate that there are at least four site-specifically bound hydrated Mg2+ ions, [Mg(H2O)n]2+, in yeast tRNAPhe. The size and the octahedral coordination geometry, rather than the charge, of [Mg(H2O)n]2+ appear to be the primary reasons for the specificity of magnesium ions in site-binding and in the stabilization of the tertiary structure of tRNA.
Abstract. RNAs are modular biomolecules, composed largely of conserved structural subunits, or motifs. These structural motifs comprise the secondary structure of RNA and are knit together via tertiary interactions into a compact, functional, three-dimensional structure and are to be distinguished from motifs defined by sequence or function. A relatively small number of structural motifs are found repeatedly in RNA hairpin and internal loops, and are observed to be composed of a limited number of common 'structural elements'. In addition to secondary and tertiary structure motifs, there are functional motifs specific for certain biological roles and binding motifs that serve to complex metals or other ligands. Research is continuing into the identification and classification of RNA structural motifs and is being initiated to predict motifs from sequence, to trace their phylogenetic relationships and to use them as building blocks in RNA engineering.