Background: Gene set analysis is a well-established approach for interpretation of data from high-throughput gene expression studies. Achieving reproducible results is an essential requirement in such studies. One factor of a gene expression experiment that can affect reproducibility is the choice of sample size. However, choosing an appropriate sample size can be difficult, especially because the choice may be method-dependent. Further, sample size choice can have unexpected effects on specificity. Results: In this paper, we report on a systematic, quantitative approach to study the effect of sample size on the reproducibility of the results from 13 gene set analysis methods. We also investigate the impact of sample size on the specificity of these methods. Rather than relying on synthetic data, the proposed approach uses real expression datasets to offer an accurate and reliable evaluation. Conclusion: Our findings show that, as a general pattern, the results of gene set analysis become more reproducible as sample size increases. However, the extent of reproducibility and the rate at which it increases vary from method to method. In addition, even in the absence of differential expression, some gene set analysis methods report a large number of false positives, and increasing sample size does not reduce these false positives. The results of this research can be used when selecting a gene set analysis method from those available. Electronic supplementary material: The online version of this article (10.1186/s40246-019-0226-2) contains supplementary material, which is available to authorized users.
It is well known that the "store language" of every pushdown automaton, the set of store configurations (state and stack contents) that can appear as an intermediate step in accepting computations, is a regular language. Here many models of language acceptors with various store structures are examined, along with a study of their store languages. For each model, an attempt is made to find the simplest model that accepts their store languages. Some connections between store languages of one-way and two-way machines are demonstrated, as with connections between nondeterministic and deterministic machines. A nice application of these store language results is also presented, showing a general technique for proving families accepted by many deterministic models are closed under right quotient with regular languages, resolving some open questions (and significantly simplifying proofs for others that are known) in the literature. Lower bounds on the space complexity of Turing machines for having non-regular store languages are obtained.
Hence, the following is immediate:
Corollary 41. There is a middle log log n space-bounded 2DTM M such that S(M) is not regular.
Turning now to one-way machines:
Proposition 42. If M is a middle s(n) space-bounded 1NTM and s(n) grows slower than log n, then S(M) is regular.
Proof. The proof is the same as the proof of Proposition 38 using Proposition 37, part 2, and noting that the M′ constructed in that proof would also be one-way if M is one-way.
Corollary 43. If M is a strongly s(n) space-bounded 1NTM and s(n) grows slower than log n, then S_s(M) is regular.
The family of stichotrichous ciliates has received a great deal of study due to the presence of scrambled genes in their genomes. The mechanism by which these genes are descrambled is of interest both as a biological process and as a model of natural computation. Several formal models of this process have been proposed, the most recent of which involves the recombination of DNA strands based on template guides. We generalize this template-guided DNA recombination model proposed by Prescott, Ehrenfeucht and Rozenberg to an operation on strings and languages. We then proceed to investigate the properties of this operation with the intention of viewing ciliate gene descrambling as a computational process.
Similarities and differences in the associations of biological entities among species can provide us with a better understanding of evolutionary relationships. Often the evolution of new phenotypes results from changes to interactions in pre-existing biological networks and comparing networks across species can identify evidence of conservation or adaptation. Gene co-expression networks (GCNs), constructed from high-throughput gene expression data, can be used to understand evolution and the rise of new phenotypes. The increasing abundance of gene expression data makes GCNs a valuable tool for the study of evolution in non-model organisms. In this paper, we cover motivations for why comparing these networks across species can be valuable for the study of evolution. We also review techniques for comparing GCNs in the context of evolution, including local and global methods of graph alignment. While some protein-protein interaction (PPI) bioinformatic methods can be used to compare co-expression networks, they often disregard highly relevant properties, including the existence of continuous and negative values for edge weights. Also, the lack of comparative datasets in non-model organisms has hindered the study of evolution using PPI networks. We also discuss limitations and challenges associated with cross-species comparison using GCNs, and provide suggestions for utilizing co-expression network alignments as an indispensable tool for evolutionary studies going forward.
Many different deletion operations are investigated as applied to languages accepted by one-way and two-way deterministic reversal-bounded multicounter machines, deterministic pushdown automata, and finite automata. Operations studied include the prefix, suffix, infix and outfix operations, as well as left and right quotient with languages from different families. It is often expected that language families defined from deterministic machines will not be closed under deletion operations. However, here, it is shown that one-way deterministic reversal-bounded multicounter languages are closed under right quotient with languages from many different language families, even those defined by nondeterministic machines such as the context-free languages. Also, it is shown that when starting with one-way deterministic machines with one counter that makes only one reversal, taking the left quotient with languages from many different language families (again including those defined by nondeterministic machines such as the context-free languages) yields only one-way deterministic reversal-bounded multicounter languages (by increasing the number of counters). These results are surprising given the nondeterministic nature of the deletion operation. However, if there are two more reversals on the counter, or a second 1-reversal-bounded counter, taking the left quotient (or even just the suffix operation) yields languages that can neither be accepted by deterministic reversal-bounded multicounter machines, nor by 2-way nondeterministic machines with one reversal-bounded counter.
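To make the deletion operations named in the abstract above concrete, the following is a minimal sketch that applies them to small finite languages represented as Python sets of strings. It assumes the standard textbook definitions (for example, the outfix of L is taken to be all words xz such that xyz is in L for some y), which the abstract itself does not spell out. The function names are illustrative and not taken from the paper, and the sketch says nothing about the machine models or closure results the paper studies.

```python
# Illustrative only: deletion operations on FINITE languages (sets of strings).
# Definitions are the standard ones and are an assumption, not quoted from the paper.

def prefixes(lang):
    """All x such that xy is in lang for some y."""
    return {w[:i] for w in lang for i in range(len(w) + 1)}

def suffixes(lang):
    """All y such that xy is in lang for some x."""
    return {w[i:] for w in lang for i in range(len(w) + 1)}

def infixes(lang):
    """All y such that xyz is in lang for some x, z."""
    return {w[i:j] for w in lang
            for i in range(len(w) + 1)
            for j in range(i, len(w) + 1)}

def outfixes(lang):
    """All xz such that xyz is in lang for some y."""
    return {w[:i] + w[j:] for w in lang
            for i in range(len(w) + 1)
            for j in range(i, len(w) + 1)}

def right_quotient(l1, l2):
    """All x such that xy is in l1 for some y in l2."""
    return {w[:len(w) - len(y)] for w in l1 for y in l2 if w.endswith(y)}

def left_quotient(l1, l2):
    """All y such that xy is in l1 for some x in l2."""
    return {w[len(x):] for w in l1 for x in l2 if w.startswith(x)}

if __name__ == "__main__":
    L = {"abc"}
    print(sorted(prefixes(L)))
    print(sorted(outfixes(L)))
    print(sorted(right_quotient({"abc", "ab"}, {"c", "bc"})))
    print(sorted(left_quotient({"abc"}, {"a", "ab"})))
```

Under these definitions, the prefix and suffix operations are the special cases of right and left quotient by the set of all words, which is why the abstract can mention the suffix operation alongside left quotient.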