Tryptic digestion of a thermal unfolding intermediate of the phage P22 tailspike endorhamnosidase produces an N-terminally shortened protein fragment comprising amino-acid residues 108-666 [Chen, B.-L. & King, J. (1991) Biochemistry 30]. In the present work, the 60-kDa C-terminal fragment was purified to homogeneity from the tryptic digest by gel-filtration chromatography. As is the case for the whole tailspike protein (72 kDa), the purified fragment was found to remain stably folded as a highly soluble, SDS-resistant, enzymatically active trimer. However, its unfolding in the presence of guanidinium chloride was accelerated at least 10-fold compared to the complete, native tailspike protein. Shortened tailspike trimers reconstituted spontaneously and with high yield after diluting a solution containing acid-urea-unfolded fragment polypeptides with neutral buffer. Upon recombinant expression of the 60-kDa polypeptide in Escherichia coli, it also assembled efficiently and formed SDS-resistant trimers. The refolding and assembly pathway of the N-terminally shortened tailspike paralleled that of the complete protein, with slightly, but significantly, accelerated folding reactions at both the subunit and the trimer levels. As found for the complete tailspike protein, yields of refolding and assembly of the 60-kDa fragments into SDS-resistant trimers decreased with increasing temperature. The refolding yield of fragments derived from a temperature-sensitive mutant (Gly244 → Arg) tailspike protein was affected in a similar fashion to that shown for the whole protein. We conclude that the N-terminal domain (residues 1-107) is dispensable for folding and assembly of the P22 tailspike endorhamnosidase both in vitro and in vivo.

The native, functional structure of a folded protein is completely determined by the amino-acid sequence of its constituent polypeptide chains [1-3].
However, polypeptide sequence and three-dimensional structure are not connected by a simple, linear code, and attempts to predict the three-dimensional structure of a protein from the amino-acid sequence have so far met with very limited success. There is general agreement that a successful structure prediction requires the understanding of protein folding pathways, and of the interactions stabilizing protein folding intermediates [4-6]. A number of observations suggest that protein folding intermediates are stabilized by a subset of the interactions present in the native protein structure [4, 7-9]. In principle, however, parts of the amino-acid sequence may code for interactions only needed to guide the polypeptide through the correct folding pathway, but not used in the native, folded structure [10-13].

Observations on the effects of temperature-sensitive mutations in the gene coding for the tailspike protein of Salmonella typhimurium phage P22 have led to the proposal that the amino-acid sequence, indeed, contains information cru-