In Escherichia coli (E. coli), most DNA damage-inducible (din) genes belong to the LexA regulon, whose products are involved in functions such as DNA repair and induced mutagenesis. E. coli K-12 cells have about 30 operons that are known to be members of the LexA regulon. LexA acts as a transcriptional repressor of these unlinked genes by binding to specific DNA sequences located within their promoter regions. We developed a genetic screening method to isolate LexA-dependent promoters. By applying a whole-genome shotgun method with a lac-operon system, we isolated promoter candidates of din genes from the E. coli O157:H7 genome. We found that transcriptional repression from most of these promoters was dependent on lexA, and that purified LexA protein bound directly to the DNA fragments carrying them. Finally, we identified 16 promoters that regulated expression of previously known LexA-dependent genes and 5 that regulated expression of novel ones. In addition, we identified 2 antisense promoters that were considered to regulate expression of antisense RNAs for the mRNAs of the ecs1779 and ecs2988 genes. All newly identified promoter regions contained DNA sequences similar to the consensus LexA binding sequence.
Higher-order compression is a scheme for compressing data in the form of functional programs that generate the data. This compression scheme can be viewed as a generalization of grammar-based compression, and it retains the advantage that compressed data can be manipulated without decompression. Furthermore, higher-order compression can achieve a high compression ratio and can discover patterns that cannot be found by traditional grammar-based compression. In this paper, we propose an efficient algorithm and a bit-coding scheme for higher-order compression and evaluate their effectiveness through experiments.
Efficient Compression Algorithm. As in the previous compression algorithm [KMS12], the input data is first represented as a λ-term, and the λ-term is compressed by repeatedly extracting common (higher-order) contexts from it. Unlike the previous algorithm, which extracts every context that occurs more than once, our new algorithm avoids the combinatorial explosion by considering only contexts up to a certain size and extracting only the most frequent one among them. This idea is borrowed from Re-Pair [LM99], one of the most successful grammar-based compression algorithms. Our approach extends this line of work and extracts a larger class of patterns, to enable higher-order compression. For instance, for aaabbbaaae, Re-Pair would replace "aa" with a new symbol. Our compression method, on the other hand, can extract a pattern of the form xxx and replace the whole string with p a (p b (p a e)), where p = λx.λy.x(x(x(y))).
Bit-coding Scheme. The previous work did not discuss how to encode the functional programs into bit strings. We propose an efficient bit-coding scheme for simply typed λ-terms, which uses type information to suppress the tag bits that would otherwise be needed to represent the tree structure of λ-terms. For example, the term a (a c c) (a c c) can be represented as just the sequence a a c c a c c (where the symbols a and c are actually represented as numbers); the former can be recovered from the latter using the types of a and c. Thus, a λ-term can be expressed as a triple consisting of type information, a sequence of numbers, and a symbol table that maps each number to a symbol. The three parts are further compressed separately, using an adaptive range coder and bzip2. Experimental results confirmed that the overall compression scheme often outperformed grammar-based compression on XML data.
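The two small examples in this abstract can be checked mechanically. The Python sketch below is illustrative only and is not the paper's implementation. It assumes that each symbol's simple type can be reduced to an arity (a takes two arguments, c takes none) and that every symbol is fully applied; the names `ARITY`, `encode`, and `decode` are hypothetical. It covers neither the numbering of symbols nor the range coding and bzip2 stages.

```python
from typing import Dict, List, Tuple

# A term is (head_symbol, [argument terms]); every symbol is fully applied.
Term = Tuple[str, list]

# Assumed arities standing in for the simple types of the symbols:
# a : o -> o -> o (two arguments), c : o (no arguments).
ARITY: Dict[str, int] = {"a": 2, "c": 0}


def encode(term: Term) -> List[str]:
    """Flatten a term into its prefix-order symbol sequence, with no tag bits."""
    head, args = term
    out = [head]
    for arg in args:
        out.extend(encode(arg))
    return out


def decode(symbols: List[str], arity: Dict[str, int]) -> Term:
    """Rebuild the tree from the flat sequence using only the arities."""

    def parse(pos: int) -> Tuple[Term, int]:
        head = symbols[pos]
        pos += 1
        args = []
        # The arity says how many sub-terms follow, so no brackets are needed.
        for _ in range(arity[head]):
            arg, pos = parse(pos)
            args.append(arg)
        return (head, args), pos

    term, end = parse(0)
    if end != len(symbols):
        raise ValueError("trailing symbols after a complete term")
    return term


# The term a (a c c) (a c c) from the text.
c = ("c", [])
acc = ("a", [c, c])
term = ("a", [acc, acc])
flat = encode(term)
restored = decode(flat, ARITY)

# The pattern p = λx.λy.x(x(x(y))), with a and b modelled as functions
# that prepend one character to a string (a modelling assumption).
p = lambda x: lambda y: x(x(x(y)))
a = lambda s: "a" + s
b = lambda s: "b" + s
expanded = p(a)(p(b)(p(a)("e")))
```

Here `flat` is the sequence a a c c a c c and `restored` equals the original tree, which shows why the arities alone suffice to recover the structure. `expanded` evaluates to the string aaabbbaaae, matching the Re-Pair comparison.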