UniFrac is a metric frequently used in microbiome research, but it does not scale to today's large datasets. We propose a new algorithm, Striped UniFrac, which produces identical results to prior algorithms, but with dramatically reduced memory and compute requirements. We highlight its utility by computing UniFrac on 113,721 samples in 48 hours using 256 CPUs. A BSD-licensed implementation is available that produces a C shared library linkable by any programming language (Supplementary Software and https://github.com/biocore/unifrac).

UniFrac [1] is a phylogenetic distance metric used to compare pairs of microbiome profiles. Microbiome studies now span tens of thousands of samples, such as the 27,751-sample Earth Microbiome Project (EMP) [2] or the 15,096-sample American Gut Project [3]. Existing algorithms for computing UniFrac cannot scale in time or space to these study designs. For example, using Fast UniFrac with the EMP was projected to take months. Striped UniFrac produces identical results to existing algorithms, exhibits a greater than 30-fold improvement in single-threaded performance, shows near-linear parallel scaling (Supplementary Fig. 1A,B), and can process the EMP dataset on a laptop in under 24 hours. It enables new biological insights, as shown by a meta-analysis [3] of the American Gut and Earth
Maximizing the recovery factor achieved through water flooding depends on acquiring a detailed understanding of the vertical and areal sweep efficiency. DNA diagnostics can monitor changes in oil contributions from multiple zones and from injectors, becoming a leading indicator for the potential of water breakthrough, loss of injectivity, and the overall advancement of the water front when combined with subsurface information. This allows for proactive management of injection rates and timing to maximize recovery rates for greenfields and brownfields alike. DNA diagnostics use DNA markers acquired from microbes. DNA markers of produced fluids are compared to the DNA markers of injected fluids to establish relationships and shared fluid flow. This paper will cover the end-to-end workflow for long-term waterflood monitoring: (1) establishing end members, even for a mature field, with the use of new samples from offset wells, properly stored samples from existing wells, and the analysis of commingled samples in combination with the subsurface model; (2) establishing the level of similarity between injectors and producers as an indication of the progression of the waterflood front, using methods including Principal Coordinate Analysis (PCoA) of DNA marker profiles; and (3) performing time series analysis and establishing sampling periodicity for effective waterflood monitoring. A pilot project, consisting of 12 producers and 3 injectors in a conventional California reservoir, was conducted to prove the concepts and further develop the required analysis for waterflood monitoring. Fluid samples were taken weekly from each well over 3 weeks to establish the difference in DNA markers between the fluids. The DNA markers were used to determine the probability that injection fluid was being produced from the surrounding wells. These results were overlaid on temporal changes in the Total Fluid Logs.
Taken together, the results correlated with and confirmed previous water breakthrough information and provided insights into areal and vertical conformance changes. Additionally, the project provided new insights into the strength of producer-injector connections based on geological features, thereby informing future infill drilling decisions. Waterflood monitoring is a powerful application for DNA diagnostics that is deployable on new and existing waterfloods. The spatial and temporal monitoring limitations of modeling or tracer studies can be improved upon through this non-invasive diagnostic. Initial results demonstrate the insights that can be provided not just for monitoring the waterflood but also for further field development decisions.
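The comparison of injector and producer DNA marker profiles described above can be illustrated with a small sketch. The abstract does not state which dissimilarity measure was used, so Bray-Curtis dissimilarity is assumed here purely for illustration; the marker names, counts, and function names are hypothetical and do not come from the pilot project.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two marker-count profiles.

    Profiles are dicts mapping marker name -> count. The result is 0.0 for
    identical profiles and 1.0 for profiles that share no markers.
    (Assumed measure; the source does not name one.)
    """
    markers = set(a) | set(b)
    num = sum(abs(a.get(m, 0) - b.get(m, 0)) for m in markers)
    den = sum(a.get(m, 0) + b.get(m, 0) for m in markers)
    if den == 0:
        raise ValueError("both profiles are empty")
    return num / den


def rank_producers(injector, producers):
    """Return (producer, dissimilarity) pairs, most injector-like first."""
    scored = [(name, bray_curtis(injector, prof)) for name, prof in producers.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical marker counts for one injector and two producers.
injector = {"marker_A": 40, "marker_B": 10, "marker_C": 0}
producers = {
    "P1": {"marker_A": 35, "marker_B": 15, "marker_C": 0},
    "P2": {"marker_A": 0, "marker_B": 5, "marker_C": 45},
}

if __name__ == "__main__":
    for name, d in rank_producers(injector, producers):
        print(f"{name}: dissimilarity to injector = {d:.2f}")
```

In this toy data, a producer whose profile is close to the injector's (a low dissimilarity) would be a candidate for shared fluid flow; a full pairwise matrix of such values is the kind of input a PCoA would take.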