The nucleotide sequence of the human procollagen α1(II) collagen gene extending from within the first intron through exon 15 and part of the 15th intron has been determined. This sequence analysis (7056 bases) identifies the intron/exon organization of the region of this gene encoding the N-propeptide and part of the triple-helical domain. Structural comparison of this gene with the genes of other human fibrillar collagens shows considerable diversity in the size and number of the introns and exons that encode the N-propeptide domain. Although the genomic structure of the human procollagen α1(II) gene is quite different from that of the rat procollagen α1(II) gene, the nucleotide coding sequences are 89% identical.

Type II collagen is the major structural component of cartilage. It represents a genetically distinct member of the fibrillar collagen family and is composed of three identical α1(II) chains [1-3]. Fibrillar procollagens are synthesized in precursor forms containing propeptides at both the amino and carboxyl ends of the molecules. Propeptides are thought to be involved in alignment of the three procollagen chains during triple-helix formation and in their secretion [1-3]. Propeptides are cleaved after secretion of the native type II collagen molecule [1-3]. This process and the coordinated expression of collagen genes are vitally important during vertebrate development [4]. Recently, genetic lesions in the procollagen α1(II) gene have been identified as the basis of several human diseases [5-8].

The human procollagen α1(II) gene has been isolated and characterized using a cosmid clone and a series of overlapping genomic clones [8, 9]. The entire gene spans six large EcoRI fragments with sizes of 4.8, 7.3, 5.2, 9.2, 3.7 and 4.3 kb, in order from the 5' end to the 3' end of the gene. The nucleotide sequence of the 3' region of the gene encoding the C-propeptide has been determined [9].
Recently, overlapping cDNA clones encoding the complete helical region of the human procollagen α1(II) gene were isolated and identified [5]. Analysis of the cDNAs revealed the nucleotide sequence of the mRNA and the deduced amino acid sequence. Exons in the triple-helical and the C-propeptide regions of human type II collagen are identical in size to exons of the chicken gene […]. The intronic structure in the triple-helical and the N-propeptide regions has not been fully identified [12, 13]. In this study, we present the genomic nucleotide sequence of 6885 residues of the human procollagen α1(II) gene, extending from the 3' end of the first intron continuously through the 5' end of the 15th intron. This area covers the complete N-propeptide (except exon 1) and residues 1-272 of the triple-helical domain of this gene. This information identifies, for the first time, the genomic organization of this region of the human type II collagen gene. A structural comparison of this gene with other human fibril-forming collagen genes and the rat type II collagen gene is presented.
MATERIALS AND METHODS
Plasmid. The 7.3-kb and 9.2-kb EcoRI fragments containing portions of the human ...