This work provides a systematic comparison of vibrational CD (VCD) and electronic CD (ECD) methods for spectral prediction of secondary structure. The VCD and ECD data are simplified to a small set of spectral parameters using the principal component method of factor analysis (PC/FA). Regression fits of these parameters are made to the X-ray-determined fractional components (FC) of secondary structure. Predictive capability is determined by computing structures for proteins sequentially left out of the regression. All possible combinations of PC/FA spectral parameters (coefficients) were used to form a full set of restricted multiple regressions with the FC values, both independently for each spectral data set and for the two VCD sets and all the data grouped together. The complete search over all possible combinations of spectral parameters for different types of spectral data is a new feature of this study, and the focus on prediction is the strength of this approach. The PC/FA method was found to be stable in detail to expansion of the training set. Coupling amide II to amide I' parameters reduced the standard deviations of the VCD regression relationships, and combining VCD and ECD data led to the best fits. Prediction results had a minimum error when dependent on relatively few spectral coefficients. Such a limited dependence on spectral variation is the key finding of this work, which has ramifications for previous studies and suggests future directions for spectral analysis of structure. The best ECD prediction for helix and sheet uses only one parameter, the coefficient of the first subspectrum. With VCD, the best predictions sample coefficients of both the amide I' and II bands, but error is minimized using only a few coefficients. In this respect, ECD is more accurate than VCD for α-helix, and the combined VCD (amide I'+II) predicts the β-sheet component better than does ECD.
Combining VCD and ECD data sets yields exceptionally good predictions by utilizing the strengths of each. However, the residual error, its distribution, and, most importantly, the lack of dependence of the method on many of the significant components derived from the spectra lead to the conclusion that the heterogeneity of protein structure is a fundamental limitation to the use of such spectral analysis methods. The underutilization of these data for prediction of secondary structure suggests that spectral data could predict a more detailed descriptor.
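The exhaustive search over restricted regressions with leave-one-out prediction, as described in this abstract, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit`, `loo_error`, `best_subset`), the ordinary least-squares fit with an intercept, and the toy coefficients and helix fractions are all hypothetical and chosen only so the example runs.

```python
from itertools import combinations
from math import sqrt

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit(rows, y):
    """Least-squares fit y ~ b0 + sum(b_k * x_k) through the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    ata = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    aty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(ata, aty)

def predict(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

def loo_error(coeffs, fc, subset):
    """RMS error when each protein is predicted from a fit that excludes it."""
    sq = 0.0
    for i in range(len(fc)):
        train_x = [[c[k] for k in subset] for j, c in enumerate(coeffs) if j != i]
        train_y = [t for j, t in enumerate(fc) if j != i]
        beta = fit(train_x, train_y)
        sq += (predict(beta, [coeffs[i][k] for k in subset]) - fc[i]) ** 2
    return sqrt(sq / len(fc))

def best_subset(coeffs, fc, max_size):
    """Score every combination of up to max_size coefficients by leave-one-out error."""
    results = {}
    for size in range(1, max_size + 1):
        for subset in combinations(range(len(coeffs[0])), size):
            results[subset] = loo_error(coeffs, fc, subset)
    return min(results, key=results.get), results

# Hypothetical toy data: 8 proteins, 3 PC/FA coefficients each (not from the study).
coeffs = [[0.1, 0.3, -0.5], [0.9, -0.2, 0.1], [0.4, 0.5, 0.2], [0.7, -0.4, 0.4],
          [0.2, 0.1, -0.2], [0.8, 0.2, -0.3], [0.5, -0.1, 0.3], [0.3, -0.3, 0.0]]
noise = [0.01, -0.01, 0.005, -0.005, 0.01, -0.01, 0.0, 0.005]
helix_fc = [0.1 + 0.6 * c[0] + e for c, e in zip(coeffs, noise)]  # depends on coefficient 0 only

best, results = best_subset(coeffs, helix_fc, max_size=3)
print(best, round(results[best], 4))
```

In this toy set the helix fraction is built from the first coefficient alone, so subsets that omit it predict poorly, which mirrors the abstract's point that prediction error is lowest with relatively few coefficients. The number of subsets grows as 2^n, so an exhaustive search of this kind is only practical for a small number of retained coefficients.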
Experimental and computational aspects of the quantitative analysis of vibrational circular dichroism (VCD) of proteins are discussed. Experimentally, the effects of spectral resolution, sample concentration, cell selection, and spectral normalization are considered. The influence of random intensity variations on the results of quantitative analysis of amide I' VCD is shown to be minor up to a 15% variation in spectral intensity. A computational algorithm, based on factor analysis of the spectra and multiple linear regression calculation of fractions of secondary structures (FC), was designed to analyse quantitatively the details of the VCD spectra-structure relationship. It also enabled the results of VCD measured independently for the amide I' and amide II regions to be combined. Our study is based primarily on the optimization of the calculation to predict FC values for proteins not included in the reference data set used for regression. The best prediction is obtained with the function using only part of the observable independent VCD spectral components. Inclusion of all components actually reduces the prediction accuracy of the analysis. Spectroscopic reasons for such behaviour and the consequences of the interdependence of the crystallographic FC values on the spectra-structure analysis are discussed. Finally, the possibility of utilizing VCD spectra to obtain quantitative structural information about the protein beyond the conventional secondary structure composition is explored. A matrix descriptor of supersecondary structure features for proteins is designed, and preliminary results for prediction of this descriptor from amide I' VCD spectra are presented. These latter calculations use a novel design of the back-propagation neural network.
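The factor-analysis step and the intensity-variation check mentioned in this abstract can be illustrated with a small sketch. It is written under assumptions that are not taken from the paper: the subspectra are obtained by power iteration with deflation on uncentered spectra, the function names (`leading_component`, `pc_fa`) are hypothetical, and the "spectra" are synthetic mixtures of two Gaussian bands rather than measured VCD data.

```python
import random
from math import exp, sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leading_component(rows, iters=200, seed=0):
    """Power iteration for the dominant unit-length subspectrum of `rows`."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    norm = sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        proj = [dot(r, v) for r in rows]                      # S v
        w = [sum(p * r[k] for p, r in zip(proj, rows))        # S^T (S v)
             for k in range(len(v))]
        norm = sqrt(dot(w, w))
        if norm < 1e-12:                                      # nothing left to extract
            break
        v = [x / norm for x in w]
    return v

def pc_fa(spectra, n_components):
    """Return (subspectra, coefficients); each spectrum ~ sum_k coeff_k * subspectrum_k."""
    residual = [row[:] for row in spectra]
    subspectra, coeffs = [], [[] for _ in spectra]
    for _ in range(n_components):
        v = leading_component(residual)
        loadings = [dot(r, v) for r in residual]
        # Deflate: remove the part of every spectrum explained by this subspectrum.
        residual = [[x - a * vk for x, vk in zip(r, v)] for r, a in zip(residual, loadings)]
        subspectra.append(v)
        for c, a in zip(coeffs, loadings):
            c.append(a)
    return subspectra, coeffs

# Synthetic stand-in data: 6 "proteins", each a mixture of two Gaussian bands.
grid = range(20)
band_a = [exp(-((k - 6) / 2.0) ** 2) for k in grid]
band_b = [exp(-((k - 13) / 2.0) ** 2) for k in grid]
weights = [(1.0, 0.1), (0.9, -0.2), (1.1, 0.3), (0.8, -0.1), (1.2, 0.2), (1.0, -0.3)]
spectra = [[a * x + b * y for x, y in zip(band_a, band_b)] for a, b in weights]

subspectra, coeffs = pc_fa(spectra, 2)
rebuilt = [[sum(c * s[k] for c, s in zip(cs, subspectra)) for k in grid] for cs in coeffs]
max_err = max(abs(x - y) for r, q in zip(spectra, rebuilt) for x, y in zip(r, q))

# Rescale each spectrum by a random factor within +/-15% and repeat the analysis.
rng = random.Random(1)
factors = [rng.uniform(0.85, 1.15) for _ in spectra]
scaled = [[x * f for x in row] for row, f in zip(spectra, factors)]
scaled_subspectra, _ = pc_fa(scaled, 2)
overlap = abs(dot(subspectra[0], scaled_subspectra[0]))
print(max_err, overlap)
```

Here `overlap` is the absolute cosine between the first subspectrum before and after rescaling; a value near 1 means the leading band shape is essentially unchanged. The coefficients returned by `pc_fa` are the quantities that would then enter the regression against the FC values.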
Electronic circular dichroism (ECD) and vibrational circular dichroism (VCD) are compared with respect to their interconvertibility for protein structural studies. ECD and amide I' VCD spectra of 28 proteins were used with a backpropagation projection neural network with one hidden layer to develop a mapping between the two spectral types. After the network converged, the number of neurons in the hidden layer was optimized by principal component analysis of the synaptic weights of the pilot network topology with redundant hidden neurons. Actual prediction of one spectrum from the other for individual proteins was tested by retraining these networks with 28 reduced training sets having one protein systematically left out. Comparison of network-predicted spectra with experimental ones is used to identify those spectral features which are unique in each method. Similarly, the VCD spectra of 23 proteins measured in both D2O and H2O in the amide I region were mapped onto each other with the use of the same type of neural network calculation. The results show that the effects of partial deuteration on the VCD spectra band shape are predictable from the H2O spectra. An analysis of the synaptic weights of the optimized networks was performed which allowed identification of the linear and nonlinear parts of the obtained mappings. Insight into the details of how the neural networks encode and process the spectroscopic information is derived from a spectral representation of these weight matrices.
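The spectrum-to-spectrum mapping with leave-one-out retraining can be sketched with a generic one-hidden-layer network trained by back-propagation. This is only an illustration under stated assumptions: it is a plain tanh/linear network rather than the authors' projection network, the layer sizes, learning rate, and epoch count are arbitrary, the function names are hypothetical, and the "ECD" inputs and "VCD" targets are synthetic numbers, not spectra from the study. The hidden-layer optimization by principal component analysis of the weights is not shown.

```python
import random
from math import tanh

def init_net(n_in, n_hidden, n_out, rng):
    """Weight rows carry the bias as their last element."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    return w1, w2

def forward(net, x):
    w1, w2 = net
    h = [tanh(row[-1] + sum(w * xi for w, xi in zip(row, x))) for row in w1]
    y = [row[-1] + sum(w * hi for w, hi in zip(row, h)) for row in w2]
    return h, y

def train(net, xs, ts, epochs=500, lr=0.05):
    """Per-sample gradient descent on squared error (back-propagation)."""
    w1, w2 = net
    for _ in range(epochs):
        for x, t in zip(xs, ts):
            h, y = forward(net, x)
            d_out = [yi - ti for yi, ti in zip(y, t)]
            # Propagate the output error back through the tanh hidden layer.
            d_hid = [(1 - hj * hj) * sum(d * w2[k][j] for k, d in enumerate(d_out))
                     for j, hj in enumerate(h)]
            for k, d in enumerate(d_out):
                for j, hj in enumerate(h):
                    w2[k][j] -= lr * d * hj
                w2[k][-1] -= lr * d
            for j, d in enumerate(d_hid):
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * d * xi
                w1[j][-1] -= lr * d
    return net

def mse(net, xs, ts):
    total, count = 0.0, 0
    for x, t in zip(xs, ts):
        _, y = forward(net, x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
        count += len(t)
    return total / count

# Synthetic stand-in data: 8 "proteins", 4-point inputs mapped to 3-point targets.
rng = random.Random(0)
ecd = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
vcd = [[tanh(x[0] - x[1]), 0.5 * x[2], x[3] - 0.5 * x[0]] for x in ecd]

net = init_net(4, 3, 3, random.Random(1))
mse_before = mse(net, ecd, vcd)
train(net, ecd, vcd)
mse_after = mse(net, ecd, vcd)

# Leave-one-out: retrain without protein i, then predict its target spectrum.
loo_errors = []
for i in range(len(ecd)):
    xs = ecd[:i] + ecd[i + 1:]
    ts = vcd[:i] + vcd[i + 1:]
    loo_net = train(init_net(4, 3, 3, random.Random(1)), xs, ts)
    loo_errors.append(mse(loo_net, [ecd[i]], [vcd[i]]))
print(mse_before, mse_after, loo_errors)
```

Comparing each leave-one-out error with the training error gives a rough sense of which samples carry features that the network cannot infer from the rest, which is the role the abstract assigns to the comparison of predicted and experimental spectra.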