Recent genome programmes have explored an increasing number of new genes with unknown function. The estimated 35 000 human genes encode more than 10⁵ expressed proteins as the result of various mechanisms, such as alternative promotion of transcription, alternative splicing of the transcripts and alternative translational initiation. Chromosome rearrangement can also serve as a source for evolutionary heterogeneity.
Summary
Chromosomal rearrangements apparently account for the presence of a primate-specific gene (protease serine 3) in chromosome 9. This gene encodes, as the result of alternative splicing, both mesotrypsinogen and trypsinogen 4. Whereas mesotrypsinogen is known to be a pancreatic protease, neither the chemical nature nor biological function of trypsinogen 4 has been explored previously. The trypsinogen 4 sequence contains two predicted translation initiation sites: an AUG site that codes for a 72-residue leader peptide on Isoform A, and a CUG site that codes for a 28-residue leader peptide on Isoform B. We report studies that provide evidence for the N-terminal amino acid sequence of trypsinogen 4 and the possible mechanism of expression of this protein in human brain and transiently transfected cells. We raised mAbs against a 28-amino acid synthetic peptide representing the leader sequence of Isoform B and against recombinant trypsin 4. By using these antibodies, we isolated and chemically identified trypsinogen 4 from extracts of both post mortem human brain and transiently transfected HeLa cells. Our results show that Isoform B, with a leucine N terminus, is the predominant (if not exclusive) form of the enzyme in post mortem human brain, but that both isoforms are expressed in transiently transfected cells. On the basis of our studies on the expression of a series of trypsinogen 4 constructs in two different cell lines, we propose that unconventional translation initiation at a CUG with a leucine, rather than a methionine, N terminus may serve as a means to regulate protein expression.
Abbreviations: GFP, green fluorescent protein; PRSS3, protease serine 3.