Motivation: Multiple biological clocks govern a healthy pregnancy. These biological mechanisms produce immunologic, metabolomic, proteomic, genomic and microbiomic adaptations during the course of pregnancy. Modeling the chronology of these adaptations during full-term pregnancy provides a framework for future studies examining deviations implicated in pregnancy-related pathologies, including preterm birth and preeclampsia. Results: We performed a multiomics analysis of 51 samples from 17 pregnant women who delivered at term. The datasets included measurements from the immunome, transcriptome, microbiome, proteome and metabolome of samples obtained simultaneously from the same patients. Multivariate predictive modeling using the Elastic Net (EN) algorithm was used to measure the ability of each dataset to predict gestational age. Using stacked generalization, these datasets were then combined into a single model. The combined model not only significantly increased predictive power relative to the individual datasets, but also revealed novel interactions between different biological modalities. Future work includes expansion of the cohort to preterm-enriched populations and in vivo analysis of immune-modulating interventions based on the mechanisms identified. Availability and implementation: Datasets and scripts for reproduction of results are available through: https://nalab.stanford.edu/multiomics-pregnancy/.
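The stacking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: each modality is reduced to one hypothetical scalar feature, ordinary least squares stands in for the Elastic Net, and leave-one-out is an assumed cross-validation scheme. All function names and toy numbers are invented for illustration. The point it shows is that the meta-level weights are fitted on out-of-fold predictions, so no base model is scored on a sample it was trained on.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x (a stand-in for an Elastic Net fit)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx else 0.0
    return my - b * mx, b

def out_of_fold_predictions(x, y):
    """Leave-one-out predictions: each sample is predicted by a model that never saw it."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    return preds

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        sol[r] = (M[r][n] - sum(M[r][k] * sol[k] for k in range(r + 1, n))) / M[r][r]
    return sol

def fit_stack(modalities, y):
    """Fit meta-weights on out-of-fold predictions, then refit base models on all data."""
    oof = [out_of_fold_predictions(x, y) for x in modalities]
    k = len(oof)
    # Normal equations of a no-intercept least-squares fit of y on the OOF predictions.
    gram = [[sum(p * q for p, q in zip(oof[i], oof[j])) for j in range(k)] for i in range(k)]
    rhs = [sum(p * t for p, t in zip(oof[i], y)) for i in range(k)]
    weights = solve(gram, rhs)
    base = [fit_line(x, y) for x in modalities]
    return base, weights

def predict_stack(model, sample):
    """Weighted sum of the base-model predictions for one sample (one value per modality)."""
    base, weights = model
    return sum(w * (a + b * xi) for w, (a, b), xi in zip(weights, base, sample))

# Toy data (invented): gestational age in weeks and two noisy single-feature "modalities".
weeks = [8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0]
modality_a = [1.1, 1.4, 2.2, 2.4, 3.1, 3.3, 4.2, 4.4]
modality_b = [40.0, 37.5, 37.0, 33.0, 32.5, 29.0, 28.5, 25.0]
model = fit_stack([modality_a, modality_b], weeks)
estimate = predict_stack(model, [2.8, 33.0])
```

With real multiomics data each base learner would take a high-dimensional feature matrix, and the folds would normally be split by patient so that samples from the same woman never appear on both sides of a split.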
Testing zero variance components is one of the most challenging problems in the context of linear mixed-effects (LME) models. The usual asymptotic chi-square distribution of the likelihood ratio and score statistics under this null hypothesis is incorrect because the null is on the boundary of the parameter space. During the last two decades many tests have been proposed to overcome this difficulty, but these tests cannot be easily applied for testing multiple variance components, especially for testing a subset of them. We instead introduce a simple test statistic based on the variance least squares estimator of variance components, together with a permutation procedure to approximate its finite-sample distribution. The proposed test covers testing multiple variance components and any subset of them in LME models. Interestingly, our method does not depend on the distribution of the random effects and errors except for their mean and variance. We show, via simulations, that the proposed test has good operating characteristics with respect to Type I error and power. We conclude with an application of our procedure to real data from a study of the association of hyperglycemia and relative hyperinsulinemia.
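The permutation idea can be sketched for the simplest case, a single random intercept. This is only an illustration under stated assumptions: the statistic below is a size-weighted variance of group means, used as a stand-in for the variance least squares statistic, which the abstract does not spell out. The function names and the toy data are hypothetical. Under the null of zero between-group variance the observations are exchangeable across groups, so shuffling them against fixed group labels approximates the null distribution.

```python
import random
from statistics import mean

def between_group_statistic(values, groups):
    """Size-weighted variance of group means around the grand mean."""
    grand = mean(values)
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    return sum(len(vs) * (mean(vs) - grand) ** 2 for vs in by_group.values()) / len(values)

def permutation_pvalue(values, groups, n_perm=2000, seed=0):
    """Approximate p-value for H0: the between-group variance component is zero."""
    rng = random.Random(seed)
    observed = between_group_statistic(values, groups)
    shuffled = list(values)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between observations and groups
        if between_group_statistic(shuffled, groups) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)

# Toy data (invented): three well-separated groups of four observations each.
groups = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
values = [10.0, 10.1, 9.9, 10.2, 0.0, 0.1, -0.1, 0.2, 5.0, 5.1, 4.9, 5.2]
p_value = permutation_pvalue(values, groups)
```

A model with fixed-effect covariates or several variance components would need a more careful permutation scheme than shuffling raw responses, which is the setting the abstract addresses.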
We evaluated the application of gas chromatography-mass spectrometry metabolic fingerprinting to classify forward genetic mutants with similar phenotypes. Mutations affecting distinct metabolic or signaling pathways can result in common phenotypic traits that are used to identify mutants in genetic screens. Measurement of a broad range of metabolites provides information about the underlying processes affected in such mutants. Metabolite profiles of Arabidopsis (Arabidopsis thaliana) mutants defective in starch metabolism and uncharacterized mutants displaying a starch-excess phenotype were compared. Each genotype displayed a unique fingerprint. Statistical methods grouped the mutants robustly into distinct classes. Determining the genes mutated in three uncharacterized mutants confirmed that those clustering with known mutants were genuinely defective in starch metabolism. A mutant that clustered away from the known mutants was defective in the circadian clock and had a pleiotropic starch-excess phenotype. These results indicate that metabolic fingerprinting is a powerful tool that can rapidly classify forward genetic mutants and streamline the process of gene discovery.
The deployment of deep neural networks (DNNs) on edge devices has been difficult because they are resource hungry. Binary neural networks (BNNs), in which both activations and weights are limited to 1 bit, help alleviate the prohibitive resource requirements of DNNs. There is, however, a significant performance gap between BNNs and floating-point DNNs. To reduce this gap, we propose an improved binary training method that introduces a new regularization function encouraging training weights around binary values. In addition, we add trainable scaling factors to our regularization functions. We also introduce an improved approximation of the derivative of the sign activation function in the backward computation. These modifications are based on linear operations that are easily implementable in the binary training framework. On CIFAR-10, we obtain an accuracy of 87.4% with AlexNet and 83.9% with the DoReFa network. On ImageNet with AlexNet, our method also outperforms the traditional BNN method and XNOR-net, by margins of 4% and 2% top-1 accuracy, respectively. In other words, we significantly reduce the gap between BNNs and floating-point DNNs.
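The two training ingredients named here can be sketched as follows. The abstract gives no formulas, so everything in the sketch is an assumption: the penalty `(alpha - |w|)^2` is one plausible regularizer that vanishes at the binary values `+alpha` and `-alpha`, `alpha` plays the role of a scaling factor, and the derivative of `tanh(beta * x)` is a generic smooth surrogate for the derivative of the sign function, not necessarily the approximation the authors use. Function names and the default `beta` are hypothetical.

```python
import math

def binary_regularizer(weights, alpha=1.0):
    """Penalty sum((alpha - |w|)^2); it is zero only when every weight is +alpha or -alpha."""
    return sum((alpha - abs(w)) ** 2 for w in weights)

def regularizer_gradients(weights, alpha=1.0):
    """Return (gradient per weight, gradient for the scaling factor alpha)."""
    # d/dw (alpha - |w|)^2 = -2 * (alpha - |w|) * sign(w)
    grad_w = [-2.0 * (alpha - abs(w)) * math.copysign(1.0, w) for w in weights]
    # d/dalpha (alpha - |w|)^2 = 2 * (alpha - |w|), summed over all weights
    grad_alpha = sum(2.0 * (alpha - abs(w)) for w in weights)
    return grad_w, grad_alpha

def sign_forward(x):
    """Forward pass of the binary activation."""
    return 1.0 if x >= 0 else -1.0

def sign_backward_surrogate(x, beta=5.0):
    """Derivative of tanh(beta * x), a smooth stand-in for the derivative of sign(x)."""
    return beta * (1.0 - math.tanh(beta * x) ** 2)

def pull_toward_binary(weights, alpha=1.0, lr=0.1, steps=50):
    """Gradient descent on the penalty alone, with alpha held fixed for simplicity."""
    w = list(weights)
    for _ in range(steps):
        grad_w = regularizer_gradients(w, alpha)[0]
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    return w

pulled = pull_toward_binary([0.3, -0.7, 1.4])
```

In a real training loop the penalty would be added to the task loss with a weighting coefficient, and `alpha` would be updated with its own gradient rather than held fixed.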
According to Lim et al., based on World Health Organization (WHO) data, hazardous chemicals in the workplace are responsible for over 370,000 premature deaths annually. Despite these high figures, life cycle impact assessment (LCIA) does not yet include a fully operational method to consider occupational impacts in its scope over the entire supply chain. This paper describes a novel approach to account for occupational exposure to chemicals by inhalation in LCA. It combines labor statistics and measured occupational concentrations of chemicals from the OSHA database to calculate operational LCIA characterization factors (i.e., intakes per hour worked and impact intensities for 19,069 organic chemical/sector combinations with confidence intervals across the entire U.S. manufacturing industry). For the seven chemicals that most contribute to the global impact, measured workplace concentrations range between 5 × 10^-4 and 3 × 10^3 mg/m^3. Carcinogenic impacts range over 4 orders of magnitude, from 1.3 × 10^-8 to 3.4 × 10^-4 DALY per blue-collar worker labor hour. The innovative approach set out in this paper assesses health impacts from occupational exposure to chemicals with population exposure to outdoor emissions, making it possible to integrate occupational exposure within LCIA. It broadens the LCIA scope to analyze hotspots and avoid impact shifting.
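The "orders of magnitude" statement can be checked directly from the two ranges quoted in the abstract. The short sketch below uses only those reported endpoints; the function name is hypothetical.

```python
import math

def orders_of_magnitude(low, high):
    """Number of powers of ten separating two positive values."""
    return math.log10(high / low)

# Endpoints as reported in the abstract.
carcinogenic_span = orders_of_magnitude(1.3e-8, 3.4e-4)   # DALY per labor hour
concentration_span = orders_of_magnitude(5e-4, 3e3)       # mg/m^3
```

The carcinogenic range works out to between 4 and 5 orders of magnitude, consistent with "over 4", while the measured concentrations span between 6 and 7.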
The R package bclust is useful for clustering high-dimensional continuous data. The package uses a parametric spike-and-slab Bayesian model to downweight the effect of noise variables and to quantify the importance of each variable in agglomerative clustering. We take advantage of the existence of closed-form marginal distributions to estimate the model hyper-parameters using empirical Bayes, thereby yielding a fully automatic method. We discuss computational problems arising in implementation of the procedure and illustrate the usefulness of the package through examples.