Deep sequencing has enabled the investigation of a wide range of environmental microbial ecosystems, but the high memory requirements for de novo assembly of short-read shotgun sequencing data from these complex populations are an increasingly large practical barrier. Here we introduce a memory-efficient graph representation with which we can analyze the k-mer connectivity of metagenomic samples. The graph representation is based on a probabilistic data structure, a Bloom filter, that allows us to efficiently store assembly graphs in as little as 4 bits per k-mer, albeit inexactly. We show that this data structure accurately represents DNA assembly graphs in low memory. We apply this data structure to the problem of partitioning assembly graphs into components as a prelude to assembly, and show that this reduces the overall memory requirements for de novo assembly of metagenomes. On one soil metagenome assembly, this approach achieves a nearly 40-fold decrease in the maximum memory requirements for assembly. This probabilistic graph representation is a significant theoretical advance in storing assembly graphs and also yields immediate leverage on metagenomic assembly.
metagenomics | compression
De novo assembly of shotgun sequencing reads into longer contiguous sequences plays an important role in virtually all genomic research (1). However, current computational methods for sequence assembly do not scale well to the volume of sequencing data now readily available from next-generation sequencing machines (1, 2). In particular, the deep sequencing required to sample complex microbial environments easily results in datasets that surpass the working memory of available computers (3, 4).
Deep sequencing and assembly of short reads is particularly important for the sequencing and analysis of complex microbial ecosystems, which can contain millions of different microbial species (5, 6).
These ecosystems mediate important biogeochemical processes but are still poorly understood at a molecular level, in large part because they consist of many microbes that cannot be cultured or studied individually in the lab (5, 7). Ensemble sequencing ("metagenomics") of these complex environments is one of the few ways to render them accessible, and has resulted in substantial early progress in understanding the microbial composition and function of the ocean, human gut, cow rumen, and permafrost soil (3, 4, 8, 9). However, as sequencing capacity grows, the assembly of sequences from these complex samples has become increasingly computationally challenging. Current methods for short-read assembly rely on inexact data reduction in which reads from low-abundance organisms are discarded, biasing analyses towards high-abundance organisms (3, 4, 9).
The predominant assembly formalism applied to short-read sequencing datasets is a de Bruijn graph (10-12). In a de Bruijn graph approach, sequencing reads are decomposed into fixed-length words, or k-mers, and used to build a connectivity graph. This graph is then traversed to determine contiguous...
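The idea of storing k-mer connectivity in a Bloom filter can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `KmerBloomFilter`, the helpers `kmers` and `neighbors`, the SHA-256 based hashing, and the bit-array size are all assumptions chosen for readability, and reverse complements are ignored. The sketch only shows the general technique: k-mers are inserted into a bit array, and graph edges are never stored explicitly but are recovered by querying the eight possible overlapping k-mers. Membership tests can return false positives but never false negatives, which is the sense in which the graph is stored inexactly.

```python
import hashlib


class KmerBloomFilter:
    """Hypothetical Bloom filter over k-mers (illustrative, not the paper's code)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive num_hashes bit positions by salting one hash function (assumption).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        # True for every inserted k-mer; occasionally True for absent ones.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(kmer)
        )


def kmers(read, k):
    """Decompose a read into its overlapping fixed-length words."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def neighbors(bloom, kmer):
    """Return k-mers overlapping `kmer` by k-1 bases that test as present."""
    found = []
    for base in "ACGT":
        for candidate in (kmer[1:] + base, base + kmer[:-1]):
            if candidate != kmer and candidate in bloom:
                found.append(candidate)
    return found


if __name__ == "__main__":
    bloom = KmerBloomFilter(num_bits=4096, num_hashes=4)
    for word in kmers("ACGTACGGA", 4):
        bloom.add(word)
    print(neighbors(bloom, "ACGT"))
```

Because edges are implicit, memory use depends only on the size of the bit array, not on the number of edges; the trade-off is that a smaller array raises the false-positive rate and so adds spurious connections to the graph.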
Natural selection favors the evolution of brains that can capture fitness-relevant features of the environment's causal structure. We investigated the evolution of small, adaptive logic-gate networks (“animats”) in task environments where falling blocks of different sizes have to be caught or avoided in a ‘Tetris-like’ game. Solving these tasks requires the integration of sensor inputs and memory. Evolved networks were evaluated using measures of information integration, including the number of evolved concepts and the total amount of integrated conceptual information. The results show that, over the course of the animats' adaptation, i) the number of concepts grows; ii) integrated conceptual information increases; iii) this increase depends on the complexity of the environment, especially on the requirement for sequential memory. These results suggest that the need to capture the causal structure of a rich environment, given limited sensors and internal mechanisms, is an important driving force for organisms to develop highly integrated networks (“brains”) with many concepts, leading to an increase in their internal complexity.
One of the hallmarks of biological organisms is their ability to integrate disparate information sources to optimize their behavior in complex environments. How this capability can be quantified and related to the functional complexity of an organism remains a challenging problem, in particular since organismal functional complexity is not well-defined. We present here several candidate measures that quantify information and integration, and study their dependence on fitness as an artificial agent (“animat”) evolves over thousands of generations to solve a navigation task in a simple, simulated environment. We compare the ability of these measures to predict high fitness with more conventional information-theoretic processing measures. As the animat adapts by increasing its “fit” to the world, information integration and processing increase commensurately along the evolutionary line of descent. We suggest that the correlation of fitness with information integration and with processing measures implies that high fitness requires both information processing and integration, but that information integration may be a better measure when the task requires memory. A correlation of measures of information integration (but also information processing) and fitness strongly suggests that these measures reflect the functional complexity of the animat, and that such measures can be used to quantify functional complexity even in the absence of fitness data.
Zero-determinant strategies are a new class of probabilistic and conditional strategies that are able to unilaterally set the expected payoff of an opponent in iterated plays of the Prisoner’s Dilemma irrespective of the opponent’s strategy (coercive strategies), or else to set the ratio between the player’s and their opponent’s expected payoff (extortionate strategies). Here we show that zero-determinant strategies are at most weakly dominant, are not evolutionarily stable, and will instead evolve into less coercive strategies. We show that zero-determinant strategies with an informational advantage over other players that allows them to recognize each other can be evolutionarily stable (and able to exploit other players). However, such an advantage is bound to be short-lived as opposing strategies evolve to counteract the recognition.
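The claim that a zero-determinant strategy can unilaterally set an opponent's expected payoff can be checked numerically. The sketch below makes several assumptions that are not stated in the abstract: the conventional payoffs R=3, S=0, T=5, P=1, memory-one strategies written as cooperation probabilities after the outcomes (CC, CD, DC, DD) from each player's own perspective, and the particular strategy p = (2/3, 0, 2/3, 1/3), which under these payoffs belongs to the payoff-setting family. The function names are hypothetical. Long-run payoffs are taken from the stationary distribution of the four-state Markov chain that the two strategies induce, approximated here by power iteration.

```python
# Conventional Prisoner's Dilemma payoffs (assumed, not given in the abstract).
R, S, T, P = 3.0, 0.0, 5.0, 1.0


def stationary_distribution(p, q, iterations=5000):
    """Approximate the stationary distribution over states CC, CD, DC, DD.

    States list player X's move first. Each strategy gives the probability of
    cooperating after (CC, CD, DC, DD) as seen by its own player, so player Y's
    entries for CD and DC are swapped when read in X's state order.
    """
    q_in_x_order = (q[0], q[2], q[1], q[3])
    rows = [
        (px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy))
        for px, qy in zip(p, q_in_x_order)
    ]
    v = [0.25] * 4
    for _ in range(iterations):
        v = [sum(v[i] * rows[i][j] for i in range(4)) for j in range(4)]
    return v


def expected_payoffs(p, q):
    """Return the long-run payoffs (to X, to Y) for strategies p and q."""
    v = stationary_distribution(p, q)
    payoff_x = sum(vi * s for vi, s in zip(v, (R, S, T, P)))
    payoff_y = sum(vi * s for vi, s in zip(v, (R, T, S, P)))
    return payoff_x, payoff_y


if __name__ == "__main__":
    payoff_setter = (2 / 3, 0.0, 2 / 3, 1 / 3)
    opponents = [
        (0.9, 0.1, 0.5, 0.3),
        (0.5, 0.5, 0.5, 0.5),
        (0.2, 0.8, 0.6, 0.1),
    ]
    for q in opponents:
        print(q, expected_payoffs(payoff_setter, q))
```

Under these assumptions the opponent's payoff comes out the same for every opponent strategy tried, while the payoff-setting player's own payoff varies with the opponent. This pairwise calculation says nothing about evolutionary stability, which depends on how strategies fare in a population.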
Swarming behaviours in animals have been extensively studied owing to their implications for the evolution of cooperation, social cognition and predator-prey dynamics. An important goal of these studies is discerning which evolutionary pressures favour the formation of swarms. One hypothesis is that swarms arise because the presence of multiple moving prey in swarms causes confusion for attacking predators, but it remains unclear how important this selective force is. Using an evolutionary model of a predator-prey system, we show that predator confusion provides a sufficient selection pressure to evolve swarming behaviour in prey. Furthermore, we demonstrate that the evolutionary effect of predator confusion on prey could in turn exert pressure on the structure of the predator's visual field, favouring the frontally oriented, high-resolution visual systems commonly observed in predators that feed on swarming animals. Finally, we provide evidence that when prey evolve swarming in response to predator confusion, there is a change in the shape of the functional response curve describing the predator's consumption rate as prey density increases. Thus, we show that a relatively simple perceptual constraint (predator confusion) could have pervasive evolutionary effects on prey behaviour, predator sensory mechanisms and the ecological interactions between predators and prey.
Representations are internal models of the environment that can provide guidance to a behaving agent, even in the absence of sensory information. It is not clear how representations are developed and whether or not they are necessary or even essential for intelligent behavior. We argue here that the ability to represent relevant features of the environment is the expected consequence of an adaptive process, give a formal definition of representation based on information theory, and quantify it with a measure R. To measure how R changes over time, we evolve two types of networks (an artificial neural network and a network of hidden Markov gates) to solve a categorization task using a genetic algorithm. We find that the capacity to represent increases during evolutionary adaptation, and that agents form representations of their environment during their lifetime. This ability allows the agents to act on sensorial inputs in the context of their acquired representations and enables complex and context-dependent behavior. We examine which concepts (features of the environment) our networks are representing, how the representations are logically encoded in the networks, and how they form as an agent behaves to solve a task. We conclude that R should be able to quantify the representations within any cognitive system, and should be predictive of an agent's long-term adaptive success. (arXiv:1206.5771v2 [q-bio.NC], 6 Aug 2013)
Evolutionary game theory is a successful mathematical framework geared towards understanding the selective pressures that affect the evolution of the strategies of agents engaged in interactions with potential conflicts. While a mathematical treatment of the costs and benefits of decisions can predict the optimal strategy in simple settings, more realistic settings such as finite populations, non-vanishing mutation rates, stochastic decisions, communication between agents, and spatial interactions, require agent-based methods where each agent is modeled as an individual, carries its own genes that determine its decisions, and where the evolutionary outcome can only be ascertained by evolving the population of agents forward in time. While highlighting standard mathematical results, we compare those to agent-based methods that can go beyond the limitations of equations and simulate the complexity of heterogeneous populations and an ever-changing set of interactors. We conclude that agent-based methods can predict evolutionary outcomes where purely mathematical treatments cannot tread (for example in the weak selection-strong mutation limit), but that mathematics is crucial to validate the computational simulations.
Evolutionary adaptation is often likened to climbing a hill or peak. While this process is simple for fitness landscapes where mutations are independent, the interaction between mutations (epistasis) as well as mutations at loci that affect more than one trait (pleiotropy) are crucial in complex and realistic fitness landscapes. We investigate the impact of epistasis and pleiotropy on adaptive evolution by studying the evolution of a population of asexual haploid organisms (haplotypes) in a model of N interacting loci, where each locus interacts with K other loci. We use a quantitative measure of the magnitude of epistatic interactions between substitutions, and find that it is an increasing function of K. When haplotypes adapt at high mutation rates, more epistatic pairs of substitutions are observed on the line of descent than expected. The highest fitness is attained in landscapes with an intermediate amount of ruggedness that balance the higher fitness potential of interacting genes with their concomitant decreased evolvability. Our findings imply that the synergism between loci that interact epistatically is crucial for evolving genetic modules with high fitness, while too much ruggedness stalls the adaptive process.
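A model of N loci that each interact with K others can be sketched as a generic NK-style fitness function. The details below are assumptions rather than the authors' setup: binary alleles, each locus depending on itself and its next K loci on a circular genome, uniformly random contribution tables, and overall fitness taken as the arithmetic mean of the per-locus contributions. The function names `make_nk_landscape` and `flip_effect` are hypothetical.

```python
import itertools
import random


def make_nk_landscape(n, k, seed=0):
    """Build a fitness function for an NK-style landscape (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: locus i interacts with the next k loci on a circular genome.
    neighborhoods = [[(i + j) % n for j in range(k + 1)] for i in range(n)]
    # One random contribution per locus for every state of its neighborhood.
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            key = tuple(genotype[j] for j in neighborhoods[i])
            total += tables[i][key]
        return total / n  # Assumption: arithmetic mean of contributions.

    return fitness


def flip_effect(fitness, genotype, locus):
    """Fitness change caused by flipping one locus in the given background."""
    mutant = list(genotype)
    mutant[locus] = 1 - mutant[locus]
    return fitness(mutant) - fitness(genotype)


if __name__ == "__main__":
    background_a = [0, 0, 0, 0, 0, 0]
    background_b = [0, 1, 0, 0, 0, 0]
    for k in (0, 2):
        fitness = make_nk_landscape(n=6, k=k, seed=1)
        print(k, flip_effect(fitness, background_a, 0),
              flip_effect(fitness, background_b, 0))
```

With K=0 the effect of a substitution is the same in every genetic background, so there is no epistasis; with K>0 the effect of flipping a locus generally depends on the alleles at its interacting loci, which is what makes the landscape rugged.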