NC1, the C-terminal non-collagenous globular domain of collagen IV, represents one of the two end regions responsible for the assembly and cross-linking of the extracellular network of basement membrane collagen. Several cDNA clones for the NC1 domain of the mouse α1(IV) collagen chain have been isolated by using synthetic oligonucleotides as screening probes for mouse libraries. The oligonucleotides were synthesized according to known stretches of the corresponding protein sequence. Sequencing of the overlapping cDNA clones allowed the complete amino acid sequence of the NC1 domain to be deduced, as well as the C-terminal 165 amino acid residues of the triple helix. The NC1 domain consists of 229 amino acid residues, which comprise two homologous regions with a high content of cysteine. These DNA and protein sequences are compared to the corresponding sequences of other collagens and discussed with respect to their structural and biological significance.
The main collagenous component of basement membranes is collagen IV. The triple-helical molecule is 400 nm long and carries a globular domain at its carboxy-terminal end [1-3]. The molecule contains two α1(IV) chains and one α2(IV) chain, each consisting of approximately 1700 amino acid residues [4, 5]. In contrast to interstitial collagen chains, the triple-helical part of these chains is frequently interrupted by non-helical regions [6, 7]. In basement membranes, collagen IV forms a macromolecular network in which the 30-nm-long N-terminal regions (7S domain) of four molecules are linked together. Each of these molecules is connected at the opposite end, via the C-terminal globule (NC1 domain), to the NC1 domain of yet another molecule. The network is stabilized by the formation of intermolecular cross-links at both ends [8, 9].
This organization differs from the assembly of the interstitial collagens I, II and III, whose 300-nm-long molecules are aligned in parallel but in a D-staggered array (D = 67 nm) [10].
While the amino acid sequence of large parts of the triple-helical region of the α1(IV) chain from human [7] and mouse [6] is known, the C-terminal globular domain has proved difficult to characterize, probably because of its hydrophobic character. Since the NC1 domain is important for the assembly of the molecules, we decided to determine its amino acid sequence via the corresponding cDNA. Suitable sources of poly(A)+ RNA for such investigations are tissues and cell lines known to produce basement membranes and type IV collagen, in particular the EHS mouse tumor [11], the PYS-2 cell line [1, 12] and [...] both sources contain very little translatable mRNA for the chains of type IV collagen, the amount being estimated at less than 0.1% (M. Laurent, I. Oberbaumer, unpublished). However, induced F9 cells have been reported to contain higher levels of these mRNAs [15].
Here we present our data on three cDNA clones, two from an EHS library and one from an F9 library, which cover the C-terminal area of the triple helix, the globular domain of the α1(IV) chai...