Eukaryotic initiation factor (eIF)4E is over-expressed in many types of cancer, such as breast, head and neck, and lung. A consequence of increased levels of eIF4E is the preferential translation of pro-tumorigenic proteins (e.g. c-Myc and vascular endothelial growth factor), and as a result eIF4E is regarded as a potential therapeutic target. In this work a novel phage display peptide has been isolated against eIF4E. From the phage sequence two amino acids were delineated which improved binding when substituted into the eIF4G1 sequence. Neither of these substitutions was involved in direct interactions with eIF4E; they acted either via optimization of the helical capping motif or by restricting the conformational flexibility of the peptide. In contrast, substitutions of the remaining phage-derived amino acids into the eIF4G1 sequence disrupted binding of the peptide to eIF4E. Interestingly, when some of these disruptive substitutions were combined with key mutations from the phage peptide, they led to improved affinities. Atomistic computer simulations revealed that the phage and the eIF4G1 derivative peptide sequences differ subtly in their interaction sites on eIF4E. This raises the issue, especially in the context of planar interaction sites such as those exhibited by eIF4E, that given the intricate plasticity of protein surfaces, the construction of structure-activity relationships should account for the possibility of significant movement in the spatial positioning of the peptide binding interface, including significant librational motions of the peptide.
Background Analyzing the human transcriptome is crucial in advancing precision medicine, and the plethora of over half a million human microarray samples in the Gene Expression Omnibus (GEO) has enabled us to better characterize biological processes at the molecular level. However, transcriptomic analysis is challenging because the data is inherently noisy and high-dimensional. Gene set analysis is currently widely used to alleviate the issue of high dimensionality, but the user-defined choice of gene sets can introduce bias into the results. In this paper, we advocate the use of a fixed set of transcriptomic modules for such analysis. We apply independent component analysis to the large collection of microarray data in GEO in order to discover reproducible transcriptomic modules that can be used as features for machine learning. We evaluate the usability of these modules across six studies, and demonstrate (1) their usage as features for sample classification, and also their robustness in dealing with small training sets, (2) their regularization of data when clustering samples and (3) the biological relevance of differentially expressed features. Results We identified 139 reproducible transcriptomic modules, which we term fundamental components (FCs). In studies with fewer than 50 samples, FC-space classification models outperformed their gene-space counterparts, with higher sensitivity (p < 0.01). The models also had higher accuracy and negative predictive value (p < 0.01) for small data sets (fewer than 30 samples). Additionally, we observed a reduction in batch effects when data is clustered in the FC-space.
Finally, we found that differentially expressed FCs mapped to GO terms that were also identified via traditional gene-based approaches. Conclusions The 139 FCs provide a biologically relevant summarization of transcriptomic data, and their performance in low-sample settings suggests that they should be employed in such studies in order to harness the data efficiently. Electronic supplementary material The online version of this article (10.1186/s12859-018-2338-4) contains supplementary material, which is available to authorized users.
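The classification results above are reported as sensitivity, accuracy, and negative predictive value. As a reminder of how these three quantities are defined, here is a minimal sketch; the function name `classification_metrics` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's own evaluation pipeline is not reproduced here.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, NPV, and accuracy for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),        # share of actual positives recovered
        "npv": ratio(tn, tn + fn),                # share of negative calls that are correct
        "accuracy": ratio(tp + tn, len(pairs)),   # share of all calls that are correct
    }


# Hypothetical labels for eight samples (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With small sample counts, each of these ratios moves in coarse steps, which is one reason comparisons in low-sample settings are usually made across several studies rather than within one.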
Metal-binding proteins are ubiquitous in biological systems ranging from enzymes to cell surface receptors. Among the various biologically active metal ions, calcium plays a large role in regulating cellular and physiological changes. With the increasing number of high-quality crystal structures of proteins associated with their metal ion ligands, many groups have built models to identify Ca2+ sites in proteins, using information such as structure, geometry, or homology for inference. We present a FEATURE-based approach to building such a model and show that our model is able to discriminate between nonsites and calcium-binding sites with a precision of more than 98%. We demonstrate the high specificity of our model by applying it to test sets constructed from other ions. We also introduce an algorithm to convert high-scoring regions into specific site predictions and demonstrate its use by scanning a test set of 91 calcium-binding protein structures (190 calcium sites). The algorithm has a recall of more than 93% on the test set, with predictions found within 3 Å of the actual sites.
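The recall figure above counts an actual calcium site as found when a prediction lies within 3 Å of it. A minimal sketch of that kind of distance-based scoring follows; the function name `site_recall`, the coordinates, and the exact matching rule (any prediction within the cutoff counts) are assumptions for illustration, not the authors' evaluation code.

```python
import math


def site_recall(true_sites, predicted_sites, cutoff=3.0):
    """Fraction of true sites with at least one prediction within `cutoff` angstroms.

    Sites are (x, y, z) tuples in angstroms.
    """
    if not true_sites:
        raise ValueError("true_sites must not be empty")
    hits = sum(
        1
        for site in true_sites
        if any(math.dist(site, pred) <= cutoff for pred in predicted_sites)
    )
    return hits / len(true_sites)


# Hypothetical coordinates (illustrative only).
true_sites = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
predicted = [(1.0, 1.0, 1.0), (10.0, 0.0, 2.5), (20.0, 20.0, 20.0)]
print(site_recall(true_sites, predicted))
```

Here two of the three true sites have a prediction within 3 Å, so the recall is 2/3; the third prediction is far from every site and would count against precision instead.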
We report a novel modification of silicone elastomer, polydimethylsiloxane (PDMS) with a polymer graft that allows interfacial bonding between elastomer and glass substrate to be performed without exposure of said substrate to harsh treatment conditions like oxygen plasma. Organic molecules can thus be patterned within microfluidic channels and still remain functional post-bonding. In addition, after polymer grafting the PDMS can be stored in a desiccator for at least 40 days, and activated upon exposure to acidic buffer for bonding. The bonded devices remain fully bonded in excess of 80 psi driving pressure, with no signs of compromise to the bond integrity. Finally, we demonstrate the compatibility of our method with biological molecules using a proof-of-concept DNA sensing device, in which fluorescently-labelled DNA targets are successfully captured by a patterned probe in a device sealed using our method, while the pattern on a plasma-treated device was completely destroyed. Therefore, this method provides a much-needed alternative bonding process for incorporation of biological molecules in microfluidic devices.
Microarray measurements of gene expression constitute a large fraction of publicly shared biological data, and are available in the Gene Expression Omnibus (GEO). Many studies use GEO data to shape hypotheses and improve statistical power. Within GEO, the Affymetrix HG-U133A and HG-U133 Plus 2.0 are the two most commonly used microarray platforms for human samples; the HG-U133 Plus 2.0 platform contains 54 220 probes and the HG-U133A array contains a proper subset (21 722 probes). When different platforms are involved, the subset of common genes is most easily compared. This approach results in the exclusion of substantial measured data and can limit downstream analysis. To predict the expression values for the genes unique to the HG-U133 Plus 2.0 platform, we constructed a series of gene expression inference models based on genes common to both platforms. Our model predicts gene expression values that are within the variability observed in controlled replicate studies and are highly correlated with measured data. Using six previously published studies, we also demonstrate the improved performance of the enlarged feature space generated by our model in downstream analysis. Availability and Implementation The gene inference model described in this paper is available as an R package (affyImpute), which can be downloaded at http://simtk.org/home/affyimpute. Supplementary information Supplementary data are available at Bioinformatics online.
This paper builds upon the need for a more descriptive and accurate understanding of the landscape of intermolecular interactions, particularly those involving macromolecules such as proteins. For this, we need methods that move away from the single conformation description of binding events, toward a descriptive free energy landscape where different macrostates can coexist. Molecular dynamics simulations and molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) methods provide an excellent approach for such a dynamic description of the binding events. An alternative to the standard method of the statistical reporting of such results is proposed.
Background Consumer-grade wearable devices enable detailed recordings of heart rate and step counts in free-living conditions. Recent studies have shown that summary statistics from these wearable recordings have potential uses for longitudinal monitoring of health and disease states. However, the relationship between higher resolution physiological dynamics from wearables and known markers of health and disease remains largely uncharacterized. Objective We aimed to derive high-resolution digital phenotypes from observational wearable recordings and to examine their associations with modifiable and inherent markers of cardiometabolic disease risk. Methods We introduced a principled framework to extract interpretable high-resolution phenotypes from wearable data recorded in free-living conditions. The proposed framework standardizes the handling of data irregularities; encodes contextual information regarding the underlying physiological state at any given time; and generates a set of 66 minimally redundant features across active, sedentary, and sleep states. We applied our approach to a multimodal data set, from the SingHEART study (NCT02791152), which comprises heart rate and step count time series from wearables, clinical screening profiles, and whole genome sequences from 692 healthy volunteers. We used machine learning to model nonlinear relationships between the high-resolution phenotypes on the one hand and clinical or genomic risk markers for blood pressure, lipid, weight and sugar abnormalities on the other. For each risk type, we performed model comparisons based on Brier scores to assess the predictive value of high-resolution features over and beyond typical baselines. We also qualitatively characterized the wearable phenotypes for participants who had actualized clinical events. 
Results We found that the high-resolution features have higher predictive value than typical baselines for clinical markers of cardiometabolic disease risk: the best models based on high-resolution features had 17.9% and 7.36% improvement in Brier score over baselines based on age and gender and resting heart rate, respectively (P<.001 in each case). Furthermore, heart rate dynamics from different activity states contain distinct information (maximum absolute correlation coefficient of 0.15). Heart rate dynamics in sedentary states are most predictive of lipid abnormalities and obesity, whereas patterns in active states are most predictive of blood pressure abnormalities (P<.001). Moreover, in comparison with standard measures, higher resolution patterns in wearable heart rate recordings are better able to represent subtle physiological dynamics related to genomic risk for cardiometabolic disease (improvement of 11.9%-22.0% in Brier scores; P<.001). Finally, illustrative case studies reveal connections between these high-resolution phenotypes and actualized clinical events, even for borderline profiles lacking apparent cardiometabolic risk markers. Conclusions High-resolution digital phenotypes recorded by consumer wearables in free-living states have the potential to enhance the prediction of cardiometabolic disease risk and could enable more proactive and personalized health management.
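The model comparisons above are expressed as improvements in Brier score. For reference, a minimal sketch of the Brier score for binary outcomes and of a relative improvement over a baseline is given below. The function names and toy data are illustrative assumptions, and the abstract does not state exactly how its percentage improvement was computed, so the relative-reduction formula here is one plausible reading rather than the study's definition.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Lower is better; 0 is a perfect forecast.
    """
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def relative_improvement(baseline_score, model_score):
    """Percentage reduction in Brier score relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_score - model_score) / baseline_score


# Hypothetical outcomes and forecasts (illustrative only).
outcomes = [1, 0, 1, 0]
baseline_probs = [0.5, 0.5, 0.5, 0.5]   # uninformative baseline
model_probs = [0.8, 0.2, 0.7, 0.4]      # a more informative model

baseline = brier_score(baseline_probs, outcomes)
model = brier_score(model_probs, outcomes)
print(baseline, model, relative_improvement(baseline, model))
```

Because the Brier score penalizes both miscalibration and lack of sharpness, it suits comparisons where the models output risk probabilities rather than hard class labels.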
Background Consumer-grade wearable devices enable detailed recordings of heart rate and step counts in free-living conditions. Recent studies have shown that summary statistics from these wearable recordings have potential uses for longitudinal monitoring of health and disease states. However, the relationship between higher resolution physiological dynamics from wearables and known markers of health and disease remains largely uncharacterized. Objective We aimed to (i) derive high-resolution digital phenotypes from observational wearable recordings, (ii) characterize their ability to predict modifiable markers of cardiometabolic disease, and (iii) study their connections with genetic predispositions for cardiometabolic disease and with lifestyle factors. Methods We introduce a principled framework to extract interpretable high-resolution phenotypes from wearable data recorded in free-living conditions. The proposed framework standardizes the handling of data irregularities, encodes contextual information about the underlying physiological state at any given time, and generates a set of 66 minimally redundant features across active, sedentary and sleep states. We applied our approach to a multimodal dataset from the SingHEART study (NCT02791152) that comprises heart rate and step count time series from wearables, clinical screening profiles, whole genome sequences and lifestyle survey responses from 692 healthy volunteers. We employed machine learning to model non-linear relationships between the high-resolution phenotypes and clinical risk markers for blood pressure, lipid and weight abnormalities. For each risk type, we performed model comparisons based on Brier Skill Scores (BSS) to assess the predictive value of the high-resolution features over and beyond typical baselines.
We then examined associations between the wearable-derived features, polygenic risk for cardiometabolic disease, and lifestyle habits and health perceptions. Results Compared with typical summary statistic measures like resting heart rate, we find that the high-resolution features collectively have greater predictive value for modifiable clinical markers associated with cardiometabolic disease risk (average improvement in Brier Skill Score=52.3%, P<.001). Further, we show that heart rate dynamics from different activity states contain distinct information about the type of cardiometabolic risk, with dynamics in sedentary states being most predictive of lipid abnormalities and patterns in active states being most predictive of blood pressure abnormalities (P<.001). Finally, our results reveal that subtle heart rate dynamics in wearable recordings serve as physiological correlates of genetic predisposition for cardiometabolic disease, lifestyle habits and health perceptions. Conclusions High-resolution digital phenotypes recorded by consumer wearables in free-living states have the potential to enhance prediction of cardiometabolic disease risk, and could enable more proactive and personalized health management. Trial Registration ClinicalTrials.gov NCT02791152; https://clinicaltrials.gov/ct2/show/NCT02791152