We describe a new classifier for protein secondary structure prediction that is formed by cascading together different types of classifiers using neural networks and linear discrimination. The new classifier achieves an accuracy of 76.7% (assessed by a rigorous full jack-knife procedure) on a new nonredundant dataset of 496 nonhomologous sequences (obtained from G.J. Barton and J.A. Cuff). This database was especially designed to train and test protein secondary structure prediction methods, and it uses a more stringent definition of homologous sequence than previous studies. We show that it is possible to design classifiers that discriminate well among the three classes (H, E, C), with an accuracy of up to 78% for β-strands, using only a local window and resampling techniques. This indicates that the importance of long-range interactions for the prediction of β-strands has probably been overestimated previously.
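As an illustration of the full jack-knife assessment mentioned in the abstract above, here is a minimal sketch of a leave-one-sequence-out loop. The function names, the toy dataset, and the majority-state "classifier" are all hypothetical stand-ins chosen so the example runs on its own; they are not the cascaded neural network and linear discrimination classifier the authors describe.

```python
from collections import Counter

def jackknife_accuracy(dataset, train, predict):
    """Leave-one-out accuracy over (sequence, structure) pairs.

    Each sequence is held out once, a model is trained on the remaining
    sequences, and per-residue matches are pooled over all held-out sequences.
    """
    correct = 0
    total = 0
    for i, (sequence, observed) in enumerate(dataset):
        training = dataset[:i] + dataset[i + 1:]  # everything except item i
        model = train(training)
        predicted = predict(model, sequence)
        correct += sum(p == o for p, o in zip(predicted, observed))
        total += len(observed)
    return correct / total

def train_majority(training):
    """Hypothetical baseline: remember the most frequent state (H, E or C)."""
    counts = Counter(state for _, structure in training for state in structure)
    return counts.most_common(1)[0][0]

def predict_majority(model, sequence):
    """Predict the remembered state at every residue."""
    return model * len(sequence)

# Toy data for illustration only (not from the 496-sequence dataset).
toy_dataset = [("AAAA", "HHHH"), ("GGG", "HHC"), ("VV", "EH")]
accuracy = jackknife_accuracy(toy_dataset, train_majority, predict_majority)
```

Any real predictor could be substituted by passing different `train` and `predict` callables; the held-out sequence is never seen during training, which is the property the jack-knife relies on.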
Styrylquinoline derivatives, known to be potent inhibitors of HIV-1 integrase, have been experimentally tested for their inhibitory effect on the disintegration reaction catalyzed by catalytic cores of HIV-1 and Rous sarcoma virus (RSV) integrases. A modified docking protocol, consisting of coupling a grid search method with full energy minimization, has been specially designed to study the interaction between the inhibitors and the integrases. The inhibitors consist of two moieties that have hydroxyl and/or carboxyl substituents: the first moiety is either benzene, phenol, catechol, resorcinol, or salicylic acid; the hydroxyl substituents on the second (quinoline) moiety may be in the keto or in the enol forms. Several tautomeric forms of the drugs have been docked to the crystallographic structure of the RSV catalytic core. The computed binding energy of the keto forms correlates best with the measured inhibitory effect. The docking procedure shows that the inhibitors bind closely to the crystallographic catalytic Mg(2+) dication. Additional quantum chemistry computations show that there is no direct correlation between the binding energy of the drugs with the Mg(2+) dication and their in vitro inhibitory effect. The method designed here offers a way toward identifying potent integrase inhibitors through in silico experiments.
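The idea of coupling a grid search with energy minimization can be sketched as follows. This is only a schematic under strong assumptions: the ligand pose is reduced to two coordinates, `toy_energy` is a hypothetical two-well surface rather than a molecular force field, and plain gradient descent stands in for whatever minimizer the authors used. None of these names or parameters come from the abstract.

```python
import itertools

def toy_energy(x, y):
    """Hypothetical two-well energy surface; not a molecular force field."""
    return (x * x - 1.0) ** 2 + (y - 0.5) ** 2 + 0.3 * x

def numerical_gradient(f, x, y, h=1e-5):
    """Central-difference estimate of the gradient of f at (x, y)."""
    dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dx, dy

def minimize(f, x, y, step=0.01, iterations=2000):
    """Fixed-step gradient descent from the starting point (x, y)."""
    for _ in range(iterations):
        dx, dy = numerical_gradient(f, x, y)
        x, y = x - step * dx, y - step * dy
    return x, y, f(x, y)

def grid_then_minimize(f, axis_values):
    """Start a local minimization from every grid point; keep the lowest energy."""
    best = None
    for x0, y0 in itertools.product(axis_values, repeat=2):
        candidate = minimize(f, x0, y0)
        if best is None or candidate[2] < best[2]:
            best = candidate
    return best

best_x, best_y, best_energy = grid_then_minimize(
    toy_energy, [-2.0, -1.0, 0.0, 1.0, 2.0]
)
```

The grid supplies diverse starting poses so that the local minimizer is not trapped in the shallower well; on this toy surface the lowest-energy result lies in the well at negative `x`.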
Recent experimental studies of the structure of triple helices show that their conformation in solution differs from the A-like structure derived from diffraction data on triple helix fibers by Arnott and co-workers. Here we show by means of molecular modeling that a family of triple helix structures may exist with similar conformational energies, but with a variety of sugar puckers. The characteristics of these putative triple helices are analyzed for three different base sequences: (T.AxT)n, (C.GxC+)n, and alternating (C.GxC+/T.AxT)n. In the case of (C.GxC+)n triple helix, infrared and Raman spectra have been obtained and clearly reveal the existence of both N- and S-type sugars in solution. The molecular mechanics calculations allow us to propose a stereochemically reasonable model for this triple helix, in good agreement with the vibrational spectroscopy results.
In this work we selected double-stranded DNA sequences capable of forming stable triplexes at 20 or 50 °C with corresponding 13mer purine oligonucleotides. This selection was obtained by a double aptamer approach where both the starting sequences of the oligonucleotides and the target DNA duplex were random. The results of selection were confirmed by a cold exchange method and the influence of the position of a 'mismatch' on the stability of the triplex was documented in several cases. The selected sequences obey two rules: (i) they have a high G content; (ii) for a given G content the stability of the resulting triplex is higher if the G residues lie in stretches. The computer simulation of the Mg2+, Na+ and Cl- environment around three triplexes by a density scaled Monte Carlo method provides an interpretation of the experimental observations. The Mg2+ cations are statistically close to the G N7 and relatively far from the A N7. The presence of an A repels the Mg2+ from adjacent G residues. Therefore, the triplexes are stabilized when the Mg2+ can form a continuous spine on G N7.
The structures of triple helices alpha dT6.beta dAn.beta dTn, alpha dT12.beta dAn.beta dTn, alpha dC12+.beta dGn.beta dCn, and alpha dC12+.beta rGn.beta rCn have been studied by Fourier transform infrared spectroscopy, Raman spectroscopy, and molecular mechanics calculations. The sugar conformations in these triplexes have been determined by vibrational spectroscopy. Our results show the existence of only S-type sugars in the alpha dT12.beta dAn.beta dTn triple helix. Both S- and N-type sugar infrared and Raman markers have been detected in the spectra of alpha dC12+.beta dGn.beta dCn. Molecular mechanics refinements taking into account vibrational spectroscopy data constraints allow us to propose third strand hydrogen-bonding schemes and third strand polarities in triple helix models. For alpha dT12.beta dA12.beta dT12 the third strand forms reverse Hoogsteen hydrogen bonds with the beta dA12 strand and therefore is parallel to the purine strand. In contrast, for alpha dC12+.beta dG12.beta dC12 calculations show that only a model in which the third strand is Hoogsteen base paired and antiparallel to the purine strand of the Watson-Crick duplex is compatible with spectroscopic data.
Intramolecular triple helices have been obtained by twice folding back oligonucleotides formed of decamers joined by non-nucleotide linkers: dA10-linker-dA10-linker-dT10 and dA10-linker-dT10-linker-dA10. We have thus prepared two triple helices with forced third strand orientation, respectively antiparallel (apA*A-T) and parallel (pA*A-T) with respect to the adenosine strand of the Watson-Crick duplex. The existence of the triple helices has been shown by FTIR, UV and fluorescence spectroscopies. Similar melting temperatures have been obtained in very different oligomer concentration conditions (micromolar solutions for thermal denaturation classically followed by UV spectroscopy, millimolar solutions in the case of melting monitored by FTIR spectroscopy), showing that the triple helices are intramolecular. The stability of the parallel triplex is found to be slightly lower than that of the antiparallel (ΔTm = 6 °C). The sugar conformations determined by FTIR differ between the two triplexes. Only South-type sugars are found in the antiparallel triplex whereas both South- and North-type sugars are detected in the parallel triplex. In the parallel triplex, thymidine sugars have a South-type geometry, and the adenosine strand of the Watson-Crick duplex has North-type sugars. For the antiparallel triplex the experimental results and molecular modeling data are consistent with a reverse-Hoogsteen-like third-strand base pairing and South-type sugar conformation. An energetically optimized model of the parallel A*A-T triple helix with a non-uniform distribution of sugar conformations is discussed.
We have compared the accuracy of the individual protein secondary structure prediction methods PHD, DSC, NNSSP and Predator against the accuracy obtained by combining the predictions of the methods. A range of ways of combining predictions were tested: voting, biased voting, linear discrimination, neural networks and decision trees. The combined methods that involve 'learning' (the non-voting methods) were trained using a set of 496 non-homologous domains; this dataset was biased as some of the secondary structure prediction methods had used them for training. We used two independent test sets to compare predictions: the first consisted of 17 non-homologous domains from CASP3 (Third Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction); the second set consisted of 405 domains that were selected in the same way as the training set, and were non-homologous to each other and the training set. On both test datasets the most accurate individual method was NNSSP, then PHD, then DSC, and the least accurate was Predator; however, it was not possible to conclusively show a significant difference between the individual methods. Comparing the accuracy of the single methods with that obtained by combining predictions, it was found that it was better to use a combination of predictions. On both test datasets it was possible to obtain an approximately 3% improvement in accuracy by combining predictions. In most cases the combined methods were statistically significantly better (at P = 0.05 on the CASP3 test set, and P = 0.01 on the EBI test set). On the CASP3 test dataset there was no significant difference in accuracy between any of the combined methods of prediction; on the EBI test dataset, linear discrimination and neural networks significantly outperformed voting techniques. We conclude that it is better to combine predictions.
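The simplest of the combination schemes named in the abstract above, voting, can be sketched as a per-residue majority vote over the three-state strings produced by each method. The function names, the example strings, and the rule of falling back to coil ("C") on a tie are assumptions made for illustration; the abstract does not say how ties were resolved or how "biased voting" weighted the methods.

```python
from collections import Counter

def majority_vote(predictions, tie_break="C"):
    """Combine equal-length secondary structure strings residue by residue."""
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    combined = []
    for states in zip(*predictions):
        counts = Counter(states).most_common()
        top_state, top_count = counts[0]
        # Assumed rule: a tie for first place falls back to tie_break.
        tied = len(counts) > 1 and counts[1][1] == top_count
        combined.append(tie_break if tied else top_state)
    return "".join(combined)

def per_residue_accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

# Made-up outputs standing in for three prediction methods.
consensus = majority_vote(["HHEC", "HEEC", "CHEC"])
```

The learned combiners in the abstract (linear discrimination, neural networks, decision trees) would replace `majority_vote` with a model trained on the individual methods' outputs.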