The nucleotide sequences of two developmentally early chorion cDNA clones from Bombyx mori define two distinct proline-rich chorion protein families, which we name CA and CB to indicate their homologies to the previously defined chorion protein families A and B, as well as the developmentally late and cysteine-rich HcA and HcB chorion families. Thus, the chorion gene superfamily has two symmetrical branches, each consisting of three families: the α branch (A, CA, HcA families) and the β branch (B, CB, HcB families). The evolution of the superfamily is discussed.
The chorion or eggshell of silk moths is a complex extracellular structure: in the wild moth Antheraea polyphemus, as many as 186 polypeptides have been resolved by two-dimensional gel electrophoresis (1); partial protein sequence analysis as well as extensive characterization of genomic and cDNA clones indicate that many of these components are encoded by distinct genes (reviewed in refs. 2 and 3).
Despite this present-day complexity, there is an underlying evolutionary simplicity: a high proportion of the chorion genes are related by descent, constituting homologous gene families. This was first shown for the two predominant families known as A and B (2). Superficially distinct chorion proteins with very high cysteine content (Hc proteins) were shown by DNA sequencing to correspond to offshoots of the A and B families and were correspondingly named HcA and HcB (4-6). In each of these four families, the central part of the sequence (central domain) is most highly conserved, apparently for reasons of protein secondary structure (7). Extensive homologies are evident between A and HcA or B and HcB central domains. Furthermore, A and B central domains show distant similarities, suggesting that the chorion genes constitute a superfamily derived from a single ancestral gene (8, 9). The amino- and carboxyl-terminal sequences (left and right arms) are considerably more variable, but even they show similarities both within and among the chorion gene families.
Little is known about the class of chorion proteins named C. That class is quite complex (42 of 186 electrophoretically resolved components in A. polyphemus; ref. 1), although it accounts for only ca. 10% of the chorion mass. It appears to play an important morphogenetic role: it constitutes the bulk of the early proteins that are responsible for formation of the initial chorion framework (10) and that are required for organization of the quantitatively dominant components, which are secreted later (11). Only one C component has been sequenced to date, in A. polyphemus, and has proved to be related to the B family (12).
In the course of characterizing a complex set of genes expressed in early choriogenesis of Bombyx mori, we have sequenced two C-like cDNA clones. These clones show that the C class in fact includes two distinct families, related to the A and B families and accordingly named CA and CB. We discuss those two new families in the context of the evolutionary history of the entire superfamily.
M...