Data, software and supplementary material are available from http://www.bioss.sari.ac.uk/staff/adriano/research.html
There have been various attempts to reconstruct gene regulatory networks from microarray expression data in the past. However, owing to the limited number of independent experimental conditions and the noise inherent in the measurements, the results have been rather modest so far. For this reason it seems advisable to include biological prior knowledge, related, for instance, to transcription factor binding locations in promoter regions or partially known signalling pathways from the literature. In the present paper, we consider a Bayesian approach to systematically integrate expression data with multiple sources of prior knowledge. Each source is encoded via a separate energy function, from which a prior distribution over network structures in the form of a Gibbs distribution is constructed. The hyperparameters associated with the different sources of prior knowledge, which measure the influence of the respective prior relative to the data, are sampled from the posterior distribution with MCMC. We have evaluated the proposed scheme on the yeast cell cycle and the Raf signalling pathway. Our findings quantify to what extent the inclusion of independent prior knowledge improves the network reconstruction accuracy, and the values of the hyperparameters inferred with the proposed scheme were found to be close to optimal with respect to minimizing the reconstruction error.
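The construction of a Gibbs prior from an energy function can be illustrated with a minimal sketch. The abstract does not state the form of the energy, so the code below assumes one common choice, E(G) = Σ |B_ij − G_ij|, where B is a matrix of prior edge beliefs in [0, 1]. The matrix values, the three-gene network, and the names `energy`, `all_graphs` and `gibbs_prior` are hypothetical. For simplicity the enumeration does not enforce acyclicity, which a Bayesian network would require.

```python
import itertools
import math

# Hypothetical prior-knowledge matrix for a 3-gene network (made-up values).
# B[i][j] in [0, 1] is the prior belief in an edge i -> j; 0.5 means "no information".
B = [
    [0.0, 0.9, 0.5],
    [0.1, 0.0, 0.8],
    [0.5, 0.2, 0.0],
]
N = len(B)
EDGES = [(i, j) for i in range(N) for j in range(N) if i != j]


def energy(graph):
    """Assumed energy: total absolute deviation between the graph and the prior beliefs."""
    return sum(abs(B[i][j] - graph[(i, j)]) for (i, j) in EDGES)


def all_graphs():
    """Enumerate every directed graph without self-loops (acyclicity is NOT enforced)."""
    for bits in itertools.product((0, 1), repeat=len(EDGES)):
        yield dict(zip(EDGES, bits))


def gibbs_prior(beta):
    """Return (graph, probability) pairs for P(G | beta) = exp(-beta * E(G)) / Z(beta)."""
    graphs = list(all_graphs())
    weights = [math.exp(-beta * energy(g)) for g in graphs]
    z = sum(weights)  # partition function, computed by brute force
    return [(g, w / z) for g, w in zip(graphs, weights)]
```

With `beta = 0` the prior is uniform over all graphs, so the prior knowledge is ignored. As `beta` grows, the probability mass concentrates on graphs that agree with B, which is the sense in which the hyperparameter weights the prior relative to the data. Brute-force enumeration is only feasible for very small networks.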
There have been various attempts to improve the reconstruction of gene regulatory networks from microarray data by the systematic integration of biological prior knowledge. Our approach is based on pioneering work by Imoto et al., in which the prior knowledge is expressed in terms of energy functions, from which a prior distribution over network structures is obtained in the form of a Gibbs distribution. The hyperparameters of this distribution represent the weights associated with the prior knowledge relative to the data. We have derived and tested a Markov chain Monte Carlo (MCMC) scheme for sampling networks and hyperparameters simultaneously from the posterior distribution, thereby automatically learning how to trade off information from the prior knowledge and the data. We have extended this approach to a Bayesian coupling scheme for learning gene regulatory networks from a combination of related data sets, which were obtained under different experimental conditions and are therefore potentially associated with different active subpathways. The proposed coupling scheme is a compromise between (1) learning networks from the different subsets separately, whereby no information between the different experiments is shared; and (2) learning networks from a monolithic fusion of the individual data sets, which does not provide any mechanism for uncovering differences between the network structures associated with the different experimental conditions. We have assessed the viability of all proposed methods on data related to the Raf signaling pathway, generated both synthetically and in cytometry experiments.
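The hyperparameter part of such an MCMC scheme might look roughly like the following sketch, which runs Metropolis-Hastings on a single hyperparameter β while the network G is held fixed. It is not the authors' implementation. It assumes the energy E(G) = Σ |B_ij − G_ij|, a uniform hyperprior on [0, `BETA_MAX`], and a partition function that ignores the acyclicity constraint so that it factorizes over edges. The matrices, `BETA_MAX`, the step size and the function names are all hypothetical. In a full sampler this move would alternate with moves that change the network structure.

```python
import math
import random

# Hypothetical prior-knowledge matrix (made-up values) and a fixed current graph G.
B = [
    [0.0, 0.9, 0.5],
    [0.1, 0.0, 0.8],
    [0.5, 0.2, 0.0],
]
G = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
EDGES = [(i, j) for i in range(3) for j in range(3) if i != j]
BETA_MAX = 30.0  # assumed upper bound of a uniform hyperprior on beta


def log_prior_graph(beta):
    """log P(G | beta) for the assumed energy E(G) = sum |B_ij - G_ij|.

    Acyclicity is ignored, so the partition function factorizes over edges.
    """
    energy = sum(abs(B[i][j] - G[i][j]) for i, j in EDGES)
    log_z = sum(
        math.log(math.exp(-beta * B[i][j]) + math.exp(-beta * (1.0 - B[i][j])))
        for i, j in EDGES
    )
    return -beta * energy - log_z


def sample_beta(n_steps, step=2.0, seed=0):
    """Metropolis-Hastings over beta with a symmetric uniform proposal."""
    rng = random.Random(seed)
    beta = rng.uniform(0.0, BETA_MAX)
    samples = []
    for _ in range(n_steps):
        proposal = beta + rng.uniform(-step, step)
        # Proposals outside the support of the uniform hyperprior are rejected.
        if 0.0 <= proposal <= BETA_MAX:
            log_ratio = log_prior_graph(proposal) - log_prior_graph(beta)
            if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
                beta = proposal
        samples.append(beta)
    return samples
```

Because the data likelihood depends on β only through G, the acceptance ratio for a β move reduces to the ratio of the Gibbs priors at the proposed and current values. When G agrees with B, as in this toy setup, the sampled values tend to drift toward larger β; when G contradicts B, they tend toward zero, so the prior is effectively switched off.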
β-glucosidases are enzymes that catalyze the hydrolysis of oligosaccharides and disaccharides, such as cellobiose. These enzymes play a key role in cellulose degradation, for example by alleviating product inhibition of cellulases. Consequently, they have been considered essential for the biofuel industry. However, most characterized β-glucosidases are inhibited by glucose. Hence, glucose-tolerant β-glucosidases have been targeted to improve the production of second-generation biofuels. In this paper, we conducted a systematic literature review (SLR), collected protein structures and constructed a database of glucose-tolerant β-glucosidases, called betagdb. The SLR was performed on the PubMed, ScienceDirect and Scopus Library databases, following the PRISMA framework. It was conducted in five steps: i) analysis of duplications, ii) title reading, iii) abstract reading, iv) diagonal reading, and v) full-text reading. The second, third, fourth, and fifth steps were performed independently by two researchers. In addition, we performed bioinformatics analyses on the collected data, such as structural and multiple alignments to detect the most conserved residues in the catalytic pocket, and molecular docking to characterize residues essential for substrate recognition, glucose tolerance, and β-glucosidase activity. We selected 27 papers, 23 sequences, and 5 PDB files of glucose-tolerant β-glucosidases. We characterized 11 highly conserved residues: H121, W122, N166, E167, N297, Y299, E355, W402, E409, W410, and F418. The presence of these residues may be essential for β-glucosidases. We also discussed the importance of residues W169, C170, L174, H181, and T226. Furthermore, we proposed that the number of contacts for each residue in the catalytic pocket might be a metric that could be used to suggest mutations.
We believe that the propositions presented here, together with the sequence and structural data collection, might be helpful for effective engineering of β-glucosidases for biofuel production and may help to shed some light on the degradation of cellulosic biomass.
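Detecting conserved residues from a multiple alignment can be sketched as follows. This is an illustration only, not the pipeline used in the study. The toy alignment, the treatment of gaps, the conservation threshold and the function names are all assumptions, and the positions are alignment columns rather than the residue numbering quoted above.

```python
from collections import Counter

# Toy aligned sequences, made up for illustration; '-' marks a gap.
ALIGNMENT = [
    "HWNE-Y",
    "HWNEAY",
    "HWDE-Y",
    "HWNEGF",
]


def column_conservation(alignment):
    """For each column, return (most common residue, fraction of sequences carrying it).

    Gaps count toward the denominator but are never reported as the consensus.
    """
    n_seqs = len(alignment)
    result = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        if not counts:
            # Column consists of gaps only.
            result.append(("-", 0.0))
            continue
        residue, count = counts.most_common(1)[0]
        result.append((residue, count / n_seqs))
    return result


def highly_conserved(alignment, threshold=1.0):
    """Return (1-based column, residue) pairs whose consensus fraction meets the threshold."""
    return [
        (pos, res)
        for pos, (res, frac) in enumerate(column_conservation(alignment), start=1)
        if frac >= threshold
    ]


print(highly_conserved(ALIGNMENT))        # columns conserved in every sequence
print(highly_conserved(ALIGNMENT, 0.75))  # columns conserved in at least 75% of sequences
```

A threshold of 1.0 keeps only fully invariant columns. Lowering it admits columns with occasional substitutions, which may be more appropriate when the alignment includes distantly related sequences.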