The NAC domain was originally characterized from consensus sequences from petunia NAM and from Arabidopsis ATAF1, ATAF2, and CUC2. Genes containing the NAC domain (NAC family genes) are plant-specific transcriptional regulators and are expressed in various developmental stages and tissues. We performed a comprehensive analysis of NAC family genes in Oryza sativa (a monocot) and Arabidopsis thaliana (a dicot). We found 75 predicted NAC proteins in full-length cDNA data sets of O. sativa (28,469 clones) and 105 in putative genes (28,581 sequences) from the A. thaliana genome. NAC domains from both predicted and known NAC family proteins were classified into two groups and 18 subgroups by sequence similarity. There were a few differences in amino acid sequences in the NAC domains between O. sativa and A. thaliana. In addition, we found 13 common sequence motifs from transcriptional activation regions in the C-terminal regions of predicted NAC proteins. The divergence of these motifs probably correlates with NAC domain structure. We discuss the relationship between the structure and function of the NAC family proteins in light of our results and the published data. Our results will aid further functional analysis of NAC family genes.
We collected and completely sequenced 28,469 full-length complementary DNA clones from Oryza sativa L. ssp. japonica cv. Nipponbare. Through homology searches of publicly available sequence data, we assigned tentative protein functions to 21,596 clones (75.86%). Mapping of the cDNA clones to genomic DNA revealed that there are 19,000 to 20,500 transcription units in the rice genome. Protein informatics analysis against the InterPro database revealed the existence of proteins present in rice but not in Arabidopsis. Sixty-four percent of our cDNAs are homologous to Arabidopsis proteins.
Rice (Oryza sativa L.) is a model organism for the functional genomics of monocotyledonous plants because its genome is considerably smaller than those of other monocotyledonous plants. Although highly accurate genome sequences of indica and japonica rice are available, additional resources such as full-length complementary DNA (FL-cDNA) sequences are also indispensable for comprehensive analyses of gene structure and function. We cross-referenced 28.5K individual loci in the rice genome defined by mapping of 578K FL-cDNA clones with the 56K loci predicted in the TIGR genome assembly. Based on the annotation status and the presence of corresponding cDNA clones, genes were classified into 23K annotated expressed (AE) genes, 33K annotated non-expressed (ANE) genes, and 5.5K non-annotated expressed (NAE) genes. We developed a 60mer oligo-array for analysis of gene expression from each locus. Analysis of gene structures and expression levels revealed that the general features of gene structure and expression of NAE and ANE genes were considerably different from those of AE genes. The results also suggested that the cloning efficiency of rice FL-cDNA is associated with the transcription activity of the corresponding genetic locus, although other factors may also have an effect. Comparison of the coverage of FL-cDNA among gene families suggested that FL-cDNA from genes encoding rice- or eukaryote-specific domains, and from genes involved in regulatory functions, were difficult to produce in bacterial cells. Collectively, these results indicate that rice genes can be divided into distinct groups based on transcription activity and gene structure, and that a coverage bias in FL-cDNA clones exists owing to the incompatibility of certain eukaryotic genes in bacteria.
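The AE/ANE/NAE classification described above can be read as a partition of loci by two properties: whether a locus is annotated in the genome assembly and whether it has a corresponding cDNA clone. The following is a minimal sketch of that reading using set operations. It is not the authors' pipeline; the function name `classify_loci` and the locus identifiers are hypothetical and chosen only for illustration.

```python
def classify_loci(annotated, expressed):
    """Split loci into AE, ANE and NAE groups.

    annotated: loci predicted in the genome annotation (assumed input).
    expressed: loci supported by at least one mapped cDNA clone (assumed input).
    """
    annotated, expressed = set(annotated), set(expressed)
    return {
        "AE": annotated & expressed,   # annotated and expressed
        "ANE": annotated - expressed,  # annotated, no cDNA support
        "NAE": expressed - annotated,  # cDNA support, not annotated
    }


# Toy data: hypothetical identifiers, not real rice loci.
annotated_loci = {"locus_a", "locus_b", "locus_c"}
expressed_loci = {"locus_b", "locus_c", "locus_d"}
groups = classify_loci(annotated_loci, expressed_loci)
```

Under this reading, the rounded counts in the abstract are mutually consistent: 23K AE + 33K ANE gives the 56K annotated loci, and 23K AE + 5.5K NAE gives the 28.5K loci defined by cDNA mapping.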