Several novel UDP glycosyltransferase (UGT) genes, mainly UDP glucuronosyltransferases, have been identified in the human, mouse and rat genomes and in other mammalian species. This review provides an update of the UGT nomenclature to include these new genes and prevent the confusion that arises when the same gene is given different names. The new genes are named following previously established recommendations, taking into consideration evolutionary relatedness and the names already in general usage in the literature. The mammalian UGT gene superfamily currently has 117 members that can be divided into four families, UGT1, UGT2, UGT3 and UGT8. The 5-exon genes of the UGT1 family each contain a unique first exon, plus four exons that are shared between the genes; the exons 1 appear to have evolved by a process of duplication, leading to the synthesis of proteins with identical carboxyl-terminal and variable amino-terminal domains. Exon-sharing is also seen with the 6-exon UGT2A1 and UGT2A2 genes. However, UGT2A3 and the genes of the UGT2B (six exons), UGT3 (seven exons) and UGT8 (five or six exons) families do not share exons and were most likely derived by duplication of all exons in the gene. Most UGT1 and UGT8 enzymes have been characterized in detail; however, the catalytic functions of the UGT3A enzymes and several UGT2 enzymes remain to be characterized.
The novel UGT1 complex locus, previously shown to encode six different UDP-glucuronosyltransferase (transferase) genes, has been extended and shown to specify a total of 13 isoforms. The genes are designated UGT1A1 through UGT1A13p, and four of them are pseudogenes: UGT1A2p and UGT1A11p through UGT1A13p have either nucleotide deletions or flawed TATA boxes. In the 5' region of the locus, the 13 unique exons 1 are arranged in a tandem array, each with its own proximal TATA box element, and are linked to four common exons, allowing independent transcriptional initiation to generate overlapping primary transcripts. In each of the nine viable primary transcripts, only the lead exon is predicted to be spliced to the four common exons, generating mRNAs with identical 3' ends and transferase isozymes with an identical carboxyl terminus. The unique amino terminus specifies acceptor-substrate selection, and the common carboxyl terminus apparently specifies the interaction with the common donor substrate, UDP-glucuronic acid. In the extended region, the viable TATA boxes are either A(A)TgA(AA)T or AT14AT; in the original locus the element is A(TA)7A for UGT1A1 and TAATT/CAA(A) for all of the other genes. UGT1A1 specifies the critically important bilirubin transferase isoform. The exons 1 are related to each other as follows: UGT1A2p through UGT1A5 comprise cluster A, whose members are 87-92% identical, and UGT1A7 through UGT1A13p comprise cluster B, whose members are 67-91% identical. Of the two exons not included in a cluster, UGT1A1 is most closely related to cluster A (60-63% identical), whereas UGT1A6 is 48-56% identical to all other unique exons. The locus was expanded from 95 kb to 218 kb. Extensive probing of clones beyond 218 kb with coding nucleotides for a highly conserved amino acid sequence present in all transferases did not detect additional exons 1.
The mRNAs are differentially expressed in hepatic and extrahepatic tissues. The locus is indeed novel in that it uses a minimal number of exon sequences to specify different transferase isozymes with an expansive substrate range.
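The exon-sharing arrangement described in this abstract can be illustrated with a minimal Python sketch. It models the 13 first exons, marks the four pseudogenes named in the text, and joins each viable first exon to the four common exons. The gene names and counts come from the abstract; the exon labels (`exon2` to `exon5`), the `FirstExon` class, and the function names are hypothetical and serve only as illustration, not as a description of any annotation pipeline.

```python
from __future__ import annotations

from dataclasses import dataclass

# Positions of the pseudogenes in the tandem array, per the abstract:
# UGT1A2p and UGT1A11p through UGT1A13p.
PSEUDO_INDICES = {2, 11, 12, 13}

# Hypothetical labels for the four exons shared by all transcripts.
COMMON_EXONS = ("exon2", "exon3", "exon4", "exon5")


@dataclass(frozen=True)
class FirstExon:
    """One unique exon 1 with its own proximal TATA box (illustrative model)."""
    gene: str
    viable: bool


def build_locus() -> list[FirstExon]:
    """Return the 13 first exons in tandem order, UGT1A1 through UGT1A13p."""
    locus = []
    for i in range(1, 14):
        is_pseudo = i in PSEUDO_INDICES
        name = f"UGT1A{i}" + ("p" if is_pseudo else "")
        locus.append(FirstExon(gene=name, viable=not is_pseudo))
    return locus


def mature_mrnas(locus: list[FirstExon]) -> dict[str, tuple[str, ...]]:
    """Join each viable lead exon to the four common exons.

    Pseudogene first exons are skipped, so every mRNA has a unique
    5' exon and an identical 3' end.
    """
    return {
        exon.gene: (f"{exon.gene}:exon1",) + COMMON_EXONS
        for exon in locus
        if exon.viable
    }


if __name__ == "__main__":
    for gene, exons in mature_mrnas(build_locus()).items():
        print(gene, "-".join(exons))
```

Run as a script, this lists nine transcripts, one per viable gene, each ending in the same four exons. That mirrors the abstract's point that the variable amino terminus comes from the first exon while the carboxyl terminus is common to all isozymes.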