Gene Regulatory Networks (GRNs) control many cellular processes, including cell differentiation, maintenance of cell-type-specific states, signal transduction, and response to stress. Since GRNs provide information that is essential for understanding cell function, the inference of these networks is one of the key challenges in systems biology. Leading algorithms for reconstructing GRNs utilize, in addition to gene expression data, prior knowledge such as Transcription Factor (TF) DNA binding motifs or results of DNA binding experiments. However, such prior knowledge is typically incomplete, so many true regulatory interactions are missing from it. Therefore, integrating such incomplete prior knowledge with gene expression to elucidate the underlying GRNs remains difficult.

To address this challenge we introduce NetREX-CF (Regulatory Network Reconstruction using EXpression and Collaborative Filtering), a GRN reconstruction approach that brings together a modern machine learning strategy (a Collaborative Filtering model) and a biologically justified model of gene expression (a sparse Network Component Analysis based model). The Collaborative Filtering (CF) model is able to overcome the incompleteness of the prior knowledge and make edge recommendations for building the GRN. Complementing CF, the sparse Network Component Analysis (NCA) model can use gene expression data to validate the recommended edges. Here we combine these two approaches using a novel data integration method and show that the new approach outperforms the currently leading GRN reconstruction methods.

Furthermore, our mathematical formalization of the model has led to a complex optimization problem of a type that has not been attempted before. Specifically, the formulation contains an ℓ0 norm term that cannot be separated from the other variables. To fill this gap, we introduce a new method, Generalized PALM (GPALM), that allows us to solve a broad class of non-convex optimization problems, and we prove its convergence.
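For context on why the non-separable ℓ0 term matters, the following minimal sketch shows the easy case that PALM-type proximal methods usually rely on: when the ℓ0 penalty is separable, its proximal operator has a closed form (entrywise hard thresholding). This is not GPALM and not the NetREX-CF formulation; the function names `prox_l0` and `l0_objective`, the parameters `lam` and `step`, and the toy vector are assumptions made purely for illustration.

```python
import math


def prox_l0(v, lam, step=1.0):
    """Proximal operator of step * lam * ||x||_0 evaluated at v.

    Solves argmin_x 0.5 * ||x - v||^2 + step * lam * ||x||_0.
    Because the penalty is separable, each entry is handled on its own:
    keep v_i if |v_i| > sqrt(2 * step * lam), otherwise set it to zero.
    """
    threshold = math.sqrt(2.0 * step * lam)
    return [vi if abs(vi) > threshold else 0.0 for vi in v]


def l0_objective(x, v, lam, step=1.0):
    """Value of 0.5 * ||x - v||^2 + step * lam * ||x||_0 (for checking)."""
    nnz = sum(1 for xi in x if xi != 0.0)
    fit = 0.5 * sum((xi - vi) ** 2 for xi, vi in zip(x, v))
    return fit + step * lam * nnz


# Hypothetical toy input: small entries are zeroed, large ones are kept.
v = [0.05, -1.2, 0.4, 2.0]
x = prox_l0(v, lam=0.5)  # threshold is sqrt(2 * 1.0 * 0.5) = 1.0
print(x)  # [0.0, -1.2, 0.0, 2.0]
```

When the ℓ0 term is coupled with other variables, as the text states is the case here, this entrywise closed form is no longer directly available, which is the gap GPALM is introduced to address.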
NetREX-CF - Method Overview

The NetREX-CF model is a novel data integration framework for reconstructing GRNs by jointly using both gene expression E and a set of prior networks P = {P_1, ..., P_d}. The main idea behind the NetREX-CF model is the integration of two complementary optimization strategies: (i) a machine learning component based on Collaborative Filtering that identifies hidden features in the currently observed prior networks P and uses these features to recommend an improved GRN, and (ii) a sparse NCA-based network remodelling component that refines the topology of a GRN based on the given gene expression E. These two computational components operate in alternation. The CF component recommends new edges to add to the current GRN, and the sparse NCA-based network remodelling component screens the recommended edges and keeps those that are essential to explain the given gene expression. Once the sparse NCA-based network remodelling component confirms some of the recommended edges, the CF component further utilizes those retained recomm...