An approach to the de novo structure prediction of proteins is described that relies on surface accessibility data from NMR paramagnetic relaxation enhancements by a soluble paramagnetic compound (sPRE). This method exploits the distance-to-surface information encoded in the sPRE data in the chemical shift-based CS-Rosetta de novo structure prediction framework to generate reliable structural models. For several proteins, it is demonstrated that surface accessibility data is an excellent measure of the correct protein fold in the early stages of the computational folding algorithm and significantly improves accuracy and convergence of the standard Rosetta structure prediction approach.
During the last few decades, NMR spectroscopy has become the method of choice for studying high-resolution protein structures in solution. In the standard NMR-based structure determination approach, structurally relevant data from different sources, such as pair-wise interatomic distances and orientation information, are collected and used as restraints for structure calculation. [1] Very recently, several groups have realized that the growing amount of structural data available in the Protein Data Bank [2] (PDB) provides a valuable source for NMR-based structure determination, in particular when combined with NMR chemical shifts. [3] In these de novo structure prediction approaches, only the amino acid sequence is needed, and structures are calculated by a conformational search algorithm that is often Monte Carlo-based. The benefits of NMR chemical shift data in fragment selection and evaluation of structural quality have been recognized [4] and impressively demonstrated. [3,5] However, this method is still limited to small proteins owing to computational bottlenecks [6] and requires extensive sets of NMR-based structural data, which are difficult to obtain for larger proteins as a result of the increasing complexity of NMR spectra and the line broadening of NMR signals caused by slower overall protein tumbling.
Herein we describe an approach in which we exploit NMR-based surface accessibility data, obtained by measuring paramagnetic relaxation enhancements induced by a soluble paramagnetic compound, for de novo structure prediction in the Rosetta framework. [6,7] The addition of soluble paramagnetic compounds leads to a concentration-dependent increase of relaxation rates, the so-called paramagnetic relaxation enhancement (here denoted as solvent PRE, sPRE; also known as co-solute PRE, Figure 1 a).
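The concentration dependence described above can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the text, that the relaxation rate of a nucleus increases linearly with the concentration of the paramagnetic compound, so that the sPRE can be read off as the slope of a least-squares line. The function name `fit_spre`, the units, and the titration values are hypothetical and serve only as illustration; they are not measured data and not the authors' processing protocol.

```python
from typing import Sequence, Tuple


def fit_spre(conc_mM: Sequence[float], rates_per_s: Sequence[float]) -> Tuple[float, float]:
    """Fit R(c) = R0 + sPRE * c by ordinary least squares.

    Assumption: the rate depends linearly on the concentration of the
    paramagnetic compound. Returns (sPRE, R0), i.e. slope and intercept.
    """
    n = len(conc_mM)
    if n != len(rates_per_s) or n < 2:
        raise ValueError("need at least two (concentration, rate) pairs")
    mean_c = sum(conc_mM) / n
    mean_r = sum(rates_per_s) / n
    # Sum of squared deviations in concentration; zero means no titration range.
    sxx = sum((c - mean_c) ** 2 for c in conc_mM)
    if sxx == 0:
        raise ValueError("all concentrations are identical")
    sxy = sum((c - mean_c) * (r - mean_r) for c, r in zip(conc_mM, rates_per_s))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_c
    return slope, intercept


# Synthetic titration (hypothetical numbers): one surface-exposed spin and
# one buried spin measured at the same four concentrations.
concentrations = [0.0, 1.0, 2.0, 3.0]   # mM of paramagnetic compound
exposed_rates = [1.0, 1.5, 2.0, 2.5]    # s^-1, strong concentration dependence
buried_rates = [1.0, 1.1, 1.2, 1.3]     # s^-1, weak concentration dependence

spre_exposed, r0_exposed = fit_spre(concentrations, exposed_rates)
spre_buried, r0_buried = fit_spre(concentrations, buried_rates)
print(f"exposed spin: sPRE = {spre_exposed:.2f} s^-1 mM^-1")
print(f"buried spin:  sPRE = {spre_buried:.2f} s^-1 mM^-1")
```

In this toy example the exposed spin yields the larger slope, which is the qualitative behaviour expected when the enhancement is strongest for spins close to the protein surface.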
This effect depends on the distance of the spin to the protein surface, with the spins on the surface being affected most, and has been shown to correlate well with protein structure. [8] sPREs have been exploited for structural studies of biomol-
Figure 1. Principle of sPRE-CS-Rosetta. a) NMR sPRE data provide quantitative and residue-specific information on solvent accessibility, as the effect of paramagnetic probes such as Gd(DTPA-BMA) is distance dependent. b) Back-calculation of sPRE data relies on placing the protein into e...