Comparing the overlap between sets of differentially expressed genes (DEGs) within or between transcriptome studies is regularly used to infer similarities between biological processes. Whether the overlap between two sets of DEGs is significant is usually determined by a simple test: the number of genes that actually occur in both lists is compared to the number of potentially overlapping genes, treating every gene as equal. However, gene expression is controlled by transcription factors that bind to a variable number of transcription factor binding sites, so genes differ in how variable their expression is in general. Neglecting this variability could therefore inflate estimates of significant overlap between DEG lists. Using computer simulations, we demonstrate that such biases arise from variation in the control of gene expression. Significant overlap commonly arises between two randomly generated lists of DEGs when the control of gene expression is variable among genes but consistent between the corresponding experiments. More overlap is observed when transcription factors are specific to their binding sites and when the number of genes is considerably higher than the number of different transcription factors. In contrast, overlap between two DEG lists is always lower than expected when the genetic architecture of expression is independent between the two experiments. Thus, current methods for determining significant overlap between DEGs may confound biologically meaningful overlap with overlap that arises from variability in the control of expression among genes, and more sophisticated approaches are needed.
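The abstract does not name the "simple test" it refers to. One common choice that treats every gene as equally likely to be differentially expressed is a one-sided hypergeometric test, and the sketch below assumes that choice. The function names and parameters (`n_genes`, `size_a`, `size_b`, `n_shared`) are hypothetical and chosen only for illustration; this is not the authors' code.

```python
from math import comb


def expected_overlap(n_genes: int, size_a: int, size_b: int) -> float:
    """Mean overlap if both DEG lists were drawn uniformly at random."""
    return size_a * size_b / n_genes


def overlap_p_value(n_genes: int, size_a: int, size_b: int, n_shared: int) -> float:
    """One-sided hypergeometric tail: P(overlap >= n_shared).

    Assumes list B is a uniform random draw of size_b genes from a pool
    of n_genes, of which size_a belong to list A. This equal-probability
    assumption is exactly the one the abstract calls into question.
    """
    if not (0 <= size_a <= n_genes and 0 <= size_b <= n_genes):
        raise ValueError("list sizes must lie between 0 and n_genes")

    largest = min(size_a, size_b)   # overlap cannot exceed the smaller list
    start = max(n_shared, 0)        # negative overlaps are not possible
    total = comb(n_genes, size_b)   # number of ways to draw list B

    # Sum the probability of every overlap size from n_shared upward.
    # math.comb returns 0 for impossible combinations, so no extra
    # lower-bound handling is needed.
    tail = sum(
        comb(size_a, k) * comb(n_genes - size_a, size_b - k)
        for k in range(start, largest + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(expected_overlap(20000, 500, 400))
    print(overlap_p_value(20000, 500, 400, 25))
```

Because the test assigns every gene the same chance of appearing in a list, any shared tendency of particular genes to vary in expression would push the observed overlap above this null expectation, which is the bias the abstract describes.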
Bacteria are important natural components of virtually every environment, including water systems. While many are beneficial to the ecosystem in which they are found, some can be indicators of pathogens that can endanger human health. Fecal coliform bacteria such as Escherichia coli are indicator organisms that can originate from many of the same sources as pathogenic bacteria and serve as a sign that pathogens may be present. Their counts can be influenced by many well-studied environmental factors, including pH, temperature, and nutrient availability. In addition to these factors, mammalian and waterfowl presence can influence coliform abundance. Although this question has been examined before, studies have reached conflicting conclusions as to whether waterfowl abundance correlates positively with coliform abundance. Levels of E. coli and of Enterococcus, a genus of non-coliform bacteria that is also found in high concentrations in feces, were measured by membrane filtration of water samples collected from six freshwater lakes around Lakeland, FL. Both were also isolated from fresh fecal samples collected at the same time from waterfowl species present at the lakes. Results suggest a correlation between the abundance of E. coli and the presence of waterfowl.