Microarray technology has become one of the most important functional genomics technologies. A proliferation of microarray databases has resulted. It can be difficult for researchers exploring this technology to know which bioinformatics systems best meet their requirements. In order to obtain a better understanding of the available systems, a survey and comparative analysis of microarray databases was undertaken. The survey included databases that are currently available, as well as databases that should become available in early 2001. Databases fall into three categories: (i) those that can be installed locally, (ii) those available for public data submission and (iii) those available for public query. Developers of microarray gene-expression databases were asked questions regarding the scope and availability of their database, its system requirements, its future compliance with MGED (Microarray Gene Expression Database) standards, and its associated analytical tools. Participants included AMAD (Stanford/Berkeley/UCSF), ArrayExpress (EBI), ChipDB (MIT/Whitehead), GeneX (NCGR), GeNet (Silicon Genetics), GeneDirector (BioDiscovery), GEO (NCBI), GXD (Jackson Laboratory), mAdb (NCI), maxdSQL (University of Manchester), NOMAD (UCSF), RAD (University of Pennsylvania) and SMD (Stanford University). Other database developers were contacted but data was not available at the time of manuscript preparation. Each database fulfils a different role, reflecting the widely varying needs of microarray users.
The rules that govern the dynamics of protein characterisation by peptide-mass fingerprinting (PMF) were investigated through multiple interrogations of a nonredundant protein database. This was achieved by analysing the efficiency of identifying each entry in the entire database via perfect in silico digestion with a series of 20 pseudo-endoproteinases cutting at the carboxy-terminal side of each amino acid residue, and the multiple cutters: trypsin, chymotrypsin and Glu-C. The distribution of peptide fragment masses generated by endoproteinase digestion was examined with a view to designing better approaches to protein characterisation by PMF. On average, and for both common and rare cutters, the combination of approximately two fragments was sufficient to identify most database entries. However, the rare cutters left more entries unidentified in the database. Total coverage of the entire database could not be achieved with one enzymatic cutter alone, nor when all 23 cutters were used together. Peptide fragments of > 5000 Da had little effect on the outcome of PMF in correctly characterising database entries, while those of low mass (near 350 Da in the case of trypsin) were found to be of most utility. The most frequently occurring fragments were also found in this lower mass region. The maximum size of uncut database entries (those not containing a specific amino acid residue) ranged from 52,908 Da to 258,314 Da, while the failure rate for a single cutter in identifying database entries varied from 10,865 (8.4%) to 23,290 (18.1%). PMF is likely to be a mainstay of any high-throughput protein screening strategy for large-scale proteome analysis. A better understanding of the merits and limitations of this technique will allow researchers to optimise their protein characterisation procedures.
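The "perfect in silico digestion" described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: it performs an idealised tryptic digest (cut after every K or R, no missed cleavages, ignoring the proline exception of real trypsin, consistent with a "perfect" cutter) and computes monoisotopic peptide masses. The function names `tryptic_digest` and `peptide_mass` are my own.

```python
# Monoisotopic residue masses (Da); a peptide's mass is the sum of its
# residue masses plus one water (gained on hydrolysis of the backbone).
RESIDUE_MASS = {
    'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
    'V': 99.06841, 'T': 101.04768, 'C': 103.00919, 'L': 113.08406,
    'I': 113.08406, 'N': 114.04293, 'D': 115.02694, 'Q': 128.05858,
    'K': 128.09496, 'E': 129.04259, 'M': 131.04049, 'H': 137.05891,
    'F': 147.06841, 'R': 156.10111, 'Y': 163.06333, 'W': 186.07931,
}
WATER = 18.01056

def tryptic_digest(sequence):
    """Perfect digest: cut after every K or R, with no missed cleavages."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in ('K', 'R'):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):          # trailing peptide after the last cut site
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Monoisotopic mass of a peptide, in Da."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER
```

A pseudo-endoproteinase cutting after any other single residue, as used for the 20-cutter series in the study, is obtained by replacing the `('K', 'R')` test with the residue of interest.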
DNA microarray technology has arguably caught the attention of the worldwide life science community and is now systematically supporting major discoveries in many fields of study. The majority of the initial technical challenges of conducting experiments are being resolved, only to be replaced with new informatics hurdles, including statistical analysis, data visualization, interpretation and storage. Two systems of databases, one containing expression data and one containing annotation data, are quickly becoming essential knowledge repositories for the research community. This paper surveys several databases that are considered pillars of research and important nodes in this network. It focuses on a generalized workflow scheme typical of microarray experiments, using two examples related to cancer research. The workflow is used to reference appropriate databases and tools for each step in the process of array experimentation. Additionally, benefits and drawbacks of current array databases are addressed, and suggestions are made for their improvement.
This appendix discusses a few of the file formats frequently encountered in bioinformatics. Specifically, it reviews the rules for generating FASTA files and provides guidance for interpreting NCBI descriptor lines, commonly found in FASTA files. In addition, it reviews the construction of GenBank, Phylip, MSF and Nexus files.
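The FASTA rules the appendix reviews (a `>` descriptor line followed by one or more sequence lines, repeated per record) can be illustrated with a minimal parser. This is a sketch under those stated rules, not code from the appendix; the function name `parse_fasta` is my own, and NCBI descriptor-line fields are simply split on `|` without validation.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (descriptor, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines between records
        if line.startswith('>'):
            if header is not None:        # flush the previous record
                records.append((header, ''.join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)           # sequence may span multiple lines
    if header is not None:
        records.append((header, ''.join(chunks)))
    return records

def split_ncbi_fields(descriptor):
    """Split a pipe-delimited NCBI-style descriptor into its fields."""
    return descriptor.split('|')
```

For example, `parse_fasta(">sp|P01013|OVAX_CHICK\nMKTAY\nRGG\n")` yields one record whose sequence is the concatenation `"MKTAYRGG"`, and `split_ncbi_fields` breaks its descriptor into `["sp", "P01013", "OVAX_CHICK"]`.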