The identification of potential regulatory motifs in new sequence data is increasingly important for experimental design. Those motifs are commonly located by matches to IUPAC strings derived from consensus sequences. Although this method is simple and widely used, a major drawback of IUPAC strings is that they necessarily remove much of the information originally present in the set of sequences. Nucleotide distribution matrices retain most of the information and are thus better suited to evaluate new potential sites. However, sufficiently large libraries of pre-compiled matrices are a prerequisite for practical application of any matrix-based approach and are just beginning to emerge. Here we present a set of tools for molecular biologists that allows generation of new matrices and detection of potential sequence matches by automatic searches with a library of pre-compiled matrices. We also supply a large library (> 200) of transcription factor binding site matrices that has been compiled on the basis of published matrices as well as entries from the TRANSFAC database, with emphasis on sequences with experimentally verified binding capacity. Our search method includes position weighting of the matrices based on the information content of individual positions and calculates a relative matrix similarity. We show several examples suggesting that this matrix similarity is useful in estimating the functional potential of matrix matches and thus provides a valuable basis for designing appropriate experiments.
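The position weighting and relative matrix similarity described above can be illustrated with a minimal sketch. The abstract does not give the formulas, so this version rests on stated assumptions: each position is weighted by a conservation value derived from its Shannon entropy (1 for a fully conserved position, 0 for a uniform one), and the similarity is the weighted score of a candidate site divided by the best weighted score the matrix can reach. The names `conservation_weight`, `matrix_similarity`, and `TOY_MATRIX` are hypothetical, and the counts are toy values, not a matrix from the library.

```python
import math

BASES = "ACGT"

# Toy nucleotide distribution matrix: one dict of base counts per position.
# Illustrative values only, not taken from any published matrix.
TOY_MATRIX = [
    {"A": 10, "C": 0, "G": 0, "T": 0},   # fully conserved A
    {"A": 0, "C": 5, "G": 5, "T": 0},    # C or G equally likely
    {"A": 0, "C": 0, "G": 0, "T": 10},   # fully conserved T
]


def conservation_weight(column):
    """Return a weight in [0, 1] for one position (assumed entropy-based)."""
    total = sum(column.get(base, 0) for base in BASES)
    if total == 0:
        return 0.0
    entropy = 0.0
    for base in BASES:
        p = column.get(base, 0) / total
        if p > 0:
            entropy -= p * math.log2(p)
    # Two bits is the maximum entropy for four equally likely bases.
    return 1.0 - entropy / 2.0


def matrix_similarity(matrix, site):
    """Return the weighted score of `site` relative to the best possible score."""
    if len(site) != len(matrix):
        raise ValueError("site and matrix must have the same length")
    achieved = 0.0
    best = 0.0
    for column, base in zip(matrix, site.upper()):
        weight = conservation_weight(column)
        achieved += weight * column.get(base, 0)
        best += weight * max(column.get(b, 0) for b in BASES)
    return achieved / best if best > 0 else 0.0
```

Under these assumptions, `matrix_similarity(TOY_MATRIX, "ACT")` returns 1.0, and a mismatch at a conserved position ("CCT") lowers the similarity more than a mismatch at the variable middle position ("AAT"). An IUPAC string such as `AST` would accept or reject each site outright and could not express that difference.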
We present a new version of the program MatInspector that identifies transcription factor binding sites (TFBS) in nucleotide sequences using a large library of weight matrices. By introducing a matrix family concept, optimized thresholds, and comparative analysis, the enhanced program produces concise results that avoid redundant and false-positive matches. We also describe a number of programs based on MatInspector that allow in-depth promoter analysis (DiAlignTF, FrameWorker) and targeted design of regulatory sequences (SequenceShaper).
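One way a matrix family concept could reduce redundant output is sketched below. The abstract does not describe the actual procedure, so this is an assumption: matrices for similar binding sites are grouped under a family label, and when several matrices of one family match at the same position and strand, only the best-scoring match is reported. The function name `best_per_family`, the record fields, and the family and matrix labels are all hypothetical.

```python
def best_per_family(matches):
    """Keep one match per (family, position, strand): the one with the highest similarity."""
    best = {}
    for match in matches:
        key = (match["family"], match["position"], match["strand"])
        current = best.get(key)
        if current is None or match["similarity"] > current["similarity"]:
            best[key] = match
    # Report the surviving matches in sequence order.
    return sorted(best.values(), key=lambda m: (m["position"], m["family"]))


# Hypothetical raw matches: two matrices of the same family hit the same site.
RAW_MATCHES = [
    {"family": "FAM_A", "matrix": "FAM_A.01", "position": 120, "strand": "+", "similarity": 0.91},
    {"family": "FAM_A", "matrix": "FAM_A.02", "position": 120, "strand": "+", "similarity": 0.97},
    {"family": "FAM_B", "matrix": "FAM_B.01", "position": 45, "strand": "-", "similarity": 0.88},
]
```

With these toy records, `best_per_family(RAW_MATCHES)` keeps the `FAM_B.01` match and only the higher-scoring `FAM_A.02` match, so the site at position 120 is reported once instead of twice.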
Peptidoglycans from bacterial cell walls trigger immune responses in insects and mammals. A peptidoglycan recognition protein, PGRP, has been cloned from moths as well as vertebrates and has been shown to participate in peptidoglycan-mediated activation of prophenoloxidase in the silk moth. Here we report that Drosophila expresses 12 PGRP genes, distributed in 8 chromosomal loci on the 3 major chromosomes. By analyzing cDNA clones and genomic databases, we grouped them into two classes: PGRP-SA, SB1, SB2, SC1A, SC1B, SC2, and SD, with short transcripts and short 5′-untranslated regions; and PGRP-LA, LB, LC, LD, and LE, with long transcripts and long 5′-untranslated regions. The predicted structures indicate that the first group encodes extracellular proteins and the second group, intracellular and membrane-spanning proteins. Most PGRP genes are expressed in all postembryonic stages. Peptidoglycan injections strongly induce five of the genes. Transcripts from the different PGRP genes were found in immune competent organs such as fat body, gut, and hemocytes. We demonstrate that at least PGRP-SA and SC1B can bind peptidoglycan, and a function in immunity is likely for this family.
SUMMARY
Sexually dimorphic traits play key roles in animal evolution and behavior. Little is known, however, about the mechanisms governing their development and evolution. One recently evolved dimorphic trait is the male-specific abdominal pigmentation of Drosophila melanogaster, which is repressed in females by the Bric-à-brac (Bab) proteins. To understand the regulation and origin of this trait, we have identified and traced the evolution of the genetic switch controlling dimorphic bab expression. We show that the HOX protein Abdominal-B (ABD-B) and the sex-specific isoforms of Doublesex (DSX) directly regulate a bab cis-regulatory element (CRE). In females, ABD-B and DSXF activate bab expression whereas in males DSXM directly represses bab, which allows for pigmentation. A new domain of dimorphic bab expression evolved through multiple fine-scale changes within this CRE, whose ancestral role was to regulate other dimorphic features. These findings reveal how new dimorphic characters can emerge from genetic networks regulating pre-existing dimorphic traits.