Introduction: RNA sequencing (RNA-seq) data from space biology experiments promise to yield invaluable insights into the effects of spaceflight on terrestrial biology. However, sample numbers in each study are low because crew availability, hardware, and space are limited. To increase statistical power, spaceflight RNA-seq datasets from different missions are often aggregated. This aggregation can introduce technical variation, or “batch effects”, often due to differences in sample handling, sample processing, and sequencing platforms. Several computational methods have been developed to correct for technical batch effects, thereby reducing their impact on true biological signals.
Methods: In this study, we combined 7 mouse liver RNA-seq datasets from NASA GeneLab (part of the NASA Open Science Data Repository) to evaluate several common batch effect correction methods (ComBat and ComBat-seq from the sva R package, and Median Polish, Empirical Bayes, and ANOVA from the MBatch R package). Principal component analysis (PCA) identified library preparation method and mission as the primary sources of batch effect among the technical variables in the combined dataset. We then quantitatively evaluated the ability of each method to correct for each identified technical batch variable using the following criteria: BatchQC, PCA, dispersion separability criterion, log fold change correlation, and differential gene expression analysis. Each batch variable/correction method combination was then assessed using a custom scoring approach to identify the optimal correction method for the combined dataset. This approach geometrically probes the space of all allowable scoring functions to yield an aggregate volume-based scoring measure.
Results and Discussion: Using the method described for the combined dataset in this study, the library preparation variable/ComBat correction method pair outranked the other candidate pairs, suggesting that this combined dataset should be corrected for library preparation using the ComBat correction method prior to downstream analysis. We describe the GeneLab multi-study analysis and visualization portal, which will allow users to access the publicly available space biology omics data, select multiple studies to combine for analysis, and examine the presence or absence of batch effects using multiple metrics. If the user chooses to perform batch effect correction, the scoring approach described here can be implemented to identify the optimal correction method for their specific combined dataset prior to analysis.
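To make the idea of a batch effect concrete, the sketch below removes a per-batch mean shift from one gene's log-expression values. It is a deliberately simplified stand-in and not the ComBat method evaluated in the study, which additionally applies empirical Bayes shrinkage to batch location and scale parameters. The function name `center_batches`, the batch labels, and the expression values are all hypothetical.

```python
from statistics import mean


def center_batches(values, batches):
    """Remove per-batch mean shifts from one gene's log-expression values.

    Each value has its batch mean subtracted and the grand mean added back,
    so every batch ends up with the same mean. Illustrative only: this is
    plain mean-centering, not ComBat.
    """
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")
    grand_mean = mean(values)
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] + grand_mean for v, b in zip(values, batches)]


# Hypothetical log2 expression of one gene across two library-prep batches.
log_expr = [5.0, 5.2, 4.8, 7.0, 7.2, 6.8]
library_prep = ["A", "A", "A", "B", "B", "B"]
corrected = center_batches(log_expr, library_prep)
```

After centering, both hypothetical batches share the same mean while the spread within each batch is unchanged. If a biological condition were confounded with batch, this kind of centering would remove that signal as well, which is one reason to evaluate correction methods against several criteria.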
A new geochemical logging tool has been designed and developed for the precise determination of formation chemistry, mineralogy, and lithology, as well as the identification of total organic carbon (TOC). The primary elements identified by the system include aluminum, calcium, carbon, chlorine, hydrogen, iron, magnesium, oxygen, potassium, silicon, sulfur, thorium, titanium, and uranium. These elements are used to identify the minerals present in both conventional and unconventional formations. Tool operation begins with a pulsed neutron generator (PNG) emitting high-energy 14 MeV neutrons into the formation, and the resulting gamma rays are intercepted by a high-resolution, state-of-the-art LaBr3(Ce) detector. To exclude background gamma rays and provide a clean capture spectrum, a boron coating has been placed on the housing. The 3.25-inch tool diameter makes the system easier to operate in small boreholes as well as in horizontal wells. The extensive set of detected elements is made possible by the PNG, in which high-speed electronics are incorporated to accumulate both capture and inelastic energy spectra. A Levenberg-Marquardt matrix inversion algorithm is employed to separate the spectra into their fundamental elemental components. The system has been characterized through numerous measurements in more than 30 formations at a newly constructed Rock Formation Laboratory in Fort Worth, Texas, as well as at the Callisto Facility in the United Kingdom. A significant number of core samples were obtained from these formations and analyzed for elemental and mineralogical composition. MCNP modeling was used extensively for the design and characterization of the system. The final lithological and mineralogical interpretation is guided by the elemental concentrations, as well as by the computation of intrinsic sigma. Magnesium is used to differentiate between calcite and dolomite in carbonate formations. Aluminum, iron, and potassium, in addition to silicon, provide the information required to distinguish the various clays in sand/shale formations. Sulfur is vital for the identification of both pyrite and anhydrite. Ternary plots are generated to aid in the final interpretation. To demonstrate the effectiveness of this work, log examples from the field are provided.
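The role of magnesium in separating calcite from dolomite can be illustrated with a short stoichiometric sketch. It assumes, purely for illustration, a rock matrix made only of ideal calcite (CaCO3) and ideal dolomite (CaMg(CO3)2) and takes calcium and magnesium weight fractions as inputs. The function names and the two-mineral model are assumptions and do not represent the tool's actual interpretation workflow.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"Ca": 40.078, "Mg": 24.305, "C": 12.011, "O": 15.999}

# Ideal mineral formulas as element counts.
CALCITE = {"Ca": 1, "C": 1, "O": 3}             # CaCO3
DOLOMITE = {"Ca": 1, "Mg": 1, "C": 2, "O": 6}   # CaMg(CO3)2


def molar_mass(formula):
    """Molar mass in g/mol of a formula given as {element: count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def carbonate_fractions(ca_wt, mg_wt):
    """Estimate (calcite, dolomite) weight fractions from Ca and Mg.

    Assumes all magnesium resides in dolomite and that any calcium not
    accounted for by dolomite resides in calcite.
    """
    mg_in_dolomite = ATOMIC_MASS["Mg"] / molar_mass(DOLOMITE)
    ca_in_dolomite = ATOMIC_MASS["Ca"] / molar_mass(DOLOMITE)
    ca_in_calcite = ATOMIC_MASS["Ca"] / molar_mass(CALCITE)
    dolomite = mg_wt / mg_in_dolomite
    # Clamp at zero so measurement noise cannot yield a negative fraction.
    calcite = max(0.0, (ca_wt - dolomite * ca_in_dolomite) / ca_in_calcite)
    return calcite, dolomite


# Hypothetical elemental weight fractions for a mixed carbonate.
calcite_frac, dolomite_frac = carbonate_fractions(ca_wt=0.30, mg_wt=0.06)
```

Under these assumptions, a formation with no measurable magnesium resolves to calcite only, while increasing magnesium shifts the estimate toward dolomite.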