Cocoa (Theobroma cacao L.) seeds are the source of chocolate flavor. The flavor develops during post-harvest fermentation, when seed proteins are degraded. From 100 days after pollination (DAP) to maturity (160-180 DAP), three major protein bands (44, 26 and 21 kDa) are present in seed extracts subjected to denaturing polyacrylamide gel electrophoresis. The 44 and 26 kDa proteins, which make up 30-50% of total mature seed protein, behave as classical storage proteins [1]. The 21 kDa protein, in contrast, increases during development but is not degraded to the same extent upon germination.

Eleven percent of 20000 clones from a 130 DAP cocoa seed λgt10 library were positive when probed with synthetic oligonucleotides derived from a portion (residues 4-14) of the 21 kDa protein's N-terminal amino acid sequence (Ala-Asn-Ser-Pro-Val-Leu-Asp-Thr-Asp-Gly-Asp-Glu-Leu-Gln-Thr-His-Val-Gln-Tyr-Tyr).

The nucleotide and deduced amino acid sequences of an essentially full-length cDNA are shown in Figure 1. The transcript includes a 78-nucleotide 5' sequence encoding a 26-amino acid signal peptide, which is not present at the N-terminus of the mature protein, and a 54-nucleotide 3' poly(A)+ tract preceded by two 3' AAUAAA elements. The calculated molecular weight of the mature protein (21331 Da) is similar to the sizes of protease inhibitors of the soybean trypsin inhibitor (Kunitz) class.

The deduced amino acid sequence of the cocoa seed protein shows 38% identity to a barley α-amylase/subtilisin inhibitor (BASI [5], Fig. 2). The areas of greatest homology between the two proteins reflect areas of homology between them and two other Kunitz-type inhibitors (trypsin inhibitors of soybean [4] and winged bean [6], Fig. 2). Approximately 74% (25 out of 34) of the residues conserved in all three of the protease inhibitors shown in Fig. 2 are also common to the cocoa protein.
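Identity figures of this kind are ratios of matching positions to compared positions. The following is a minimal Python sketch of that arithmetic, not the procedure used for Fig. 2. It assumes that gap columns are excluded from the denominator, which is only one of several conventions. The function names `percent_identity` and `percent_of` and the two toy aligned fragments are hypothetical and are not taken from the alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared columns in which both sequences carry the same residue.

    Assumption: columns where either sequence has a gap are skipped, so they
    count neither as matches nor toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap column: not compared
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def percent_of(count: int, total: int) -> float:
    """Plain ratio expressed as a percentage."""
    return 100.0 * count / total


# Toy, made-up aligned fragments (hypothetical; not from Fig. 2).
toy_a = "ANSPV-LDT"
toy_b = "ANTPVQLET"
toy_identity = percent_identity(toy_a, toy_b)

# The count quoted in the text: 25 of the 34 residues conserved in all three
# protease inhibitors are shared by the cocoa protein.
shared_conserved = percent_of(25, 34)
```

Rounding `shared_conserved` to the nearest integer reproduces the approximate 74% quoted above.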
Considerably more identity is found among the sequences of the four proteins in the first 65 residues (35 of 65 residues of the cocoa protein matching any of the other three proteins) than in the middle or C-terminal thirds of the proteins. In addition, the highly conserved region between residues 4 and 24 (12 out of 24) is also highly conserved among at least four other Kunitz-type protease inhibitors from seeds of leguminous plants [3]. Four cysteine residues that are strictly conserved among the amino acid sequences of the protease inhibitors, and believed to be involved in disulfide bonding in BASI, are also conserved in the cocoa seed protein. The protein also shows considerable similarity (34% identity) to sporamin B [2] of sweet potato tubers (Fig. 2). Many areas of high homology between the two

The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession number X54509.

Fig. 1. Nucleotide and deduced amino acid sequence of a cDNA clone encoding the 21 kDa cocoa seed protein. The arrow indicates the probable cleavage site of the putative 26-amino acid signal polypeptide from the mature protein. The u...