The RIKEN Mouse Gene Encyclopaedia Project, a systematic approach to determining the full coding potential of the mouse genome, involves collection and sequencing of full-length complementary DNAs and physical mapping of the corresponding genes to the mouse genome. We organized an international functional annotation meeting (FANTOM) to annotate the first 21,076 cDNAs to be analysed in this project. Here we describe the first RIKEN clone collection, which is one of the largest described for any organism. Analysis of these cDNAs extends known gene families and identifies new ones.
Over the past 5 years, microarrays have greatly facilitated large-scale analysis of gene expression levels. Although these arrays were not specifically geared to represent tissues and pathways known to be affected by diabetes, they have been used in both type 1 and type 2 diabetes research. To prepare a tool that is particularly useful in the study of type 1 diabetes, we have assembled a nonredundant set of 3,400 clones representing genes expressed in the mouse pancreas or pathways known to be affected by diabetes. We have demonstrated the usefulness of this clone set by preparing a cDNA glass microarray, the PancChip, and using it to analyze pancreatic gene expression from embryonic day 14.5 through adulthood in mice. The clone set and corresponding array are useful resources for diabetes research.
A heuristic algorithm for associating Gene Ontology (GO) defined molecular functions to protein domains as listed in the ProDom and CDD databases is described. The algorithm generates rules for function-domain associations based on the intersection of functions assigned to gene products by the GO consortium that contain ProDom and/or CDD domains at varying levels of sequence similarity. The hierarchical nature of GO molecular functions is incorporated into rule generation. Manual review of a subset of the rules generated indicates an accuracy rate of 87% for ProDom rules and 84% for CDD rules. The utility of these associations is that novel sequences can be assigned a putative function if sufficient similarity exists to a ProDom or CDD domain with which one or more GO functions have been associated. Although functional assignments are increasingly being made for gene products from model organisms, it is likely that the needs of investigators will continue to outpace the efforts of curators, particularly for nonmodel organisms. A comparison with other methods in terms of coverage and agreement was performed, indicating the utility of the approach. The domain-function associations and function assignments are available from our website http://www.cbil.upenn.edu/GO.
An important early step in the postgenomic era is the characterization of the biochemical functions of gene products. Accurate computational predictions are a useful resource for both the community at large and the curators who eventually assign function to gene products. The Gene Ontology (GO) (Gene Ontology Consortium 2001) is an ontology, that is, a database of agreed-to terms for molecular functions, biological processes and cellular components. GO also includes relationships between terms such as specialization or part-whole relations. GO was developed to facilitate effective use of this information.
We present an automatic method for leveraging curated GO function annotation of proteins to associate GO terms with protein domains that can then be applied to proteins that contain any of the domains.
We make the assumption that the functions that a protein is capable of performing are determined by the protein domains that it contains. We use the simplest possible model of this kind; each domain contributes a function independently of any other domain in the protein. The basis of the algorithm, illustrated in Figure 1, is to identify, using an intersection procedure, the GO functions common to a set of proteins that each contain a domain. We determine whether or not a protein has a domain using sequence similarity. As part of the process of associating GO functions with a domain, we determine a p-value threshold for each domain that indicates the level of similarity needed to confer function. As demonstrated by Hegyi and Gerstein (2001), this is an important consideration. We contrast this approach with the comparison of entire proteins where similarities may be found between regions that are not responsible for the transferred annotation (Myers et al. ...
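The intersection procedure described in this excerpt can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy GO hierarchy, the protein and domain identifiers, the function names, and the way the p-value threshold is supplied as a fixed parameter are all assumptions made for illustration. The sketch expands each protein's annotated functions to include their ancestor terms (one simple way to take the hierarchy into account), intersects those sets across annotated proteins whose similarity to the domain passes the threshold, and reports only the most specific shared terms.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy hierarchy: GO function term -> its direct parent terms.
GO_PARENTS: Dict[str, Set[str]] = {
    "catalytic activity": set(),
    "kinase activity": {"catalytic activity"},
    "protein kinase activity": {"kinase activity"},
    "lipid kinase activity": {"kinase activity"},
    "DNA binding": set(),
}

# Hypothetical similarity hits: protein -> {domain: p-value}.
HITS: Dict[str, Dict[str, float]] = {
    "P1": {"PD0001": 1e-30},
    "P2": {"PD0001": 1e-12},
    "P3": {"PD0001": 1e-3},
}

# Hypothetical curated annotations: protein -> GO molecular functions.
ANNOTATIONS: Dict[str, Set[str]] = {
    "P1": {"protein kinase activity"},
    "P2": {"lipid kinase activity"},
    "P3": {"DNA binding"},
}


def with_ancestors(terms: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the given terms together with all of their ancestor terms."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(parents.get(term, ()))
    return closed


def most_specific(terms: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Drop every term that is an ancestor of another term in the set."""
    ancestors: Set[str] = set()
    for term in terms:
        ancestors |= with_ancestors(parents.get(term, ()), parents)
    return terms - ancestors


def functions_for_domain(domain, hits, annotations, parents, p_threshold):
    """Intersect GO functions over annotated proteins that contain `domain`.

    A protein is taken to contain the domain when its p-value is at or
    below `p_threshold` (an assumption of this sketch).
    """
    carriers = [
        protein
        for protein, domains in hits.items()
        if protein in annotations and domains.get(domain, 1.0) <= p_threshold
    ]
    if not carriers:
        return set()
    common = None
    for protein in carriers:
        closed = with_ancestors(annotations[protein], parents)
        common = closed if common is None else common & closed
    return most_specific(common, parents)


if __name__ == "__main__":
    print(functions_for_domain("PD0001", HITS, ANNOTATIONS, GO_PARENTS, 1e-10))
    print(functions_for_domain("PD0001", HITS, ANNOTATIONS, GO_PARENTS, 1e-2))
```

With the strict threshold only P1 and P2 count as containing the domain, and their shared function generalizes to the common parent term; with the loose threshold P3 is included and no function is shared, which shows why the choice of similarity threshold matters for the resulting rule.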
The Endocrine Pancreas Consortium was formed in late 1999 to derive and sequence cDNA libraries enriched for rare transcripts expressed in the mammalian endocrine pancreas. Over the past 3 years, the Consortium has generated 20 cDNA libraries from mouse and human pancreatic tissues and deposited >150,000 sequences into the public expressed sequence tag databases. A special effort was made to enrich for cDNAs from the endocrine pancreas by constructing libraries from isolated islets. In addition, we constructed a library in which fetal pancreas from Neurogenin 3 null mice, which consists of only exocrine and duct cells, was subtracted from fetal wild-type pancreas to enrich for the transcripts from the endocrine compartment. Sequence analysis showed that these clones cluster into 9,464 assembly groups (approximating unique transcripts) for the mouse and 13,910 for the human sequences. Of these, >4,300 were unique to Consortium libraries. We have assembled a core clone set containing one cDNA for each assembly group for the mouse and have constructed the corresponding microarray, termed "PancChip 4.0," which contains >9,000 nonredundant elements. We show that this PancChip is highly enriched for genes expressed in the endocrine pancreas. The mouse and human clone sets and corresponding arrays will be important resources for diabetes research. Diabetes 52:1604-1610, 2003
Despite recent progress in β-cell biology and diabetes research, tools for the treatment of diabetes have not changed fundamentally. Although it is now clear that islet transplantation is a valuable therapeutic approach, this solution is severely limited by the shortage of islet tissue. Over the past decade, significant advances have been made toward identifying the hierarchy of transcription factors that govern pancreatic development (1).
In addition, it has been shown that embryonic stem cells can be differentiated in vitro toward insulin-producing cells, although the issue remains controversial (2-4). Despite these discoveries, major obstacles to the isolation, expansion, and differentiation of pancreatic endocrine stem and/or progenitor cells exist, including a lack of appropriate cell surface antibodies for sorting of progenitor cell populations and an only rudimentary understanding of the lineage of β-cells during development and regeneration of the pancreas.
To accelerate the progress toward the identification of endocrine precursor cells and factors that regulate the development and differentiation of β-cells, the National Institute of Diabetes and Digestive and Kidney Diseases sponsored a program entitled "Functional Genomics of the Developing Endocrine Pancreas" in 1999. The Endocrine Pancreas Consortium was created in response to this program to construct and sequence cDNA libraries derived from multiple stages of pancreatic development. Its purpose was to provide the public expressed sequence tag (EST) databases with sequences from mouse and human endocrine pancreas to discover novel transcripts that could be incorporated into custom m...