MUC5B, which maps to chromosome 11p15.5 in a cluster with MUC6, MUC2, and MUC5AC, is a human mucin gene whose genomic organization is being elucidated. We have recently published the sequence and the peptide organization of its huge central exon, 10,713 base pairs (bp) in length. We present here the genomic organization of its 3′ region, which encompasses 10,690 bp. The genomic sequence has been completely determined. The 3′ region of MUC5B is composed of 18 exons ranging in size from 32 to 781 bp, in contrast with the very large central exon. The sizes of the 18 introns range from 114 to 1118 bp. Repetitive sequences were identified in four introns. The peptide deduced from the sequence of the 18 exons is 808 amino acids long. This carboxyl-terminal region exhibits extensive sequence similarity to MUC2, MUC5AC, and von Willebrand factor, particularly in the number and positions of the cysteine residues, suggesting that this domain may be derived from a common ancestral gene. The presence in these components of a cystine knot, also found in growth factors such as transforming growth factor-β, is of particular interest. Moreover, one part of this peptide is identical to the 196-amino acid sequence deduced from the cDNA clone pSM2-1, which codes for a part of the high molecular weight mucin MG1 isolated from human sublingual gland. Considering the expression pattern of MUC5B and the origin of MG1, we can thus conclude that MUC5B encodes MG1.

Mucus is the layer that covers, protects, and lubricates the luminal surfaces of the epithelial respiratory, gastrointestinal, and reproductive tracts. These basic properties are due to the viscous and viscoelastic properties of mucins, the major glycoprotein components of mucus. Mucins constitute a family of high molecular mass glycoproteins synthesized by the goblet cells of the epithelia and in some cases by submucosal glands (for more complete reviews, see Refs.
1-3).

Alterations of the biosynthesis of mucins affecting the protein core and/or the carbohydrate content linked to the peptide have been observed in numerous pathological situations such as various adenomas and carcinomas, inflammatory diseases such as cystic fibrosis, asthma, chronic bronchitis, or inflammatory bowel diseases (4-7). Moreover, the hypersecretion of mucins and the presence of alternating hydrophobic and hydrophilic domains in mucins have been shown to play a central role in the pathogenesis of cholesterol gallstones (8, 9).

All apomucins contain tandemly repeated sequences rich in threonine and/or serine. Due to the high carbohydrate content, the peptide moiety of mucins has been difficult to characterize. cDNA cloning has enabled researchers to approach the study of the mucins over the past decade. Today, the membrane-associated mucin MUC1 and the secreted MUC7 are the only mucins for which the full-length cDNA and the genomic organization have been reported (10-13). Both were revealed to be, in fact, small mucins. A complete cDNA of the large secreted mucin MUC2 (14-17) has been...