Computational models for predicting the activity of small molecules against targets are now routinely developed and used in academia and industry, partly thanks to public bioactivity databases. While the trend is toward models built on ever larger datasets, recent studies such as chemogenomic active learning have shown that in many cases only a fraction of the data is needed for effective models. In this article, the chemogenomic active learning method is discussed and applied in a new analysis of public databases containing nuclear hormone receptor and cytochrome P450 enzyme family bioactivity. Complementing existing results on kinases and G-protein coupled receptors, the results here demonstrate the active learning methodology's effectiveness at extracting informative ligand-target pairs in sparse data scenarios. Experiments to assess the domain of applicability demonstrate the influence of the ligand profiles of similar targets within the family.
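To make the idea of learning from only a fraction of the data concrete, the following is a minimal sketch of a pool-based active learning loop. It is not the published method: the abstract does not specify the model or the picking strategy, so the bootstrapped nearest-centroid committee, the vote-variance picking rule, the two-dimensional toy features standing in for ligand-target descriptors, and all function names and parameter values here are assumptions made purely for illustration.

```python
import random


def centroid(rows):
    """Mean vector of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train(labeled):
    """Nearest-centroid model: one centroid per class present in the data."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_class.items()}


def predict(model, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def disagreement(committee, x):
    """Vote variance p * (1 - p) for binary labels; 0 means full agreement."""
    p = sum(predict(m, x) for m in committee) / len(committee)
    return p * (1 - p)


def active_learning(pool, oracle, n_seed=4, budget=20, n_models=15, seed=0):
    """Label `budget` pool items, always picking the most disputed one next.

    `oracle(i)` plays the role of looking up the measured activity (0 or 1)
    of pool item i. All defaults are arbitrary illustration values.
    """
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    labeled_idx = [unlabeled.pop() for _ in range(n_seed)]
    while len(labeled_idx) < budget and unlabeled:
        labeled = [(pool[i], oracle(i)) for i in labeled_idx]
        # Committee of models, each trained on a bootstrap resample.
        committee = [
            train([rng.choice(labeled) for _ in labeled]) for _ in range(n_models)
        ]
        pick = max(unlabeled, key=lambda i: disagreement(committee, pool[i]))
        unlabeled.remove(pick)
        labeled_idx.append(pick)
    return train([(pool[i], oracle(i)) for i in labeled_idx]), labeled_idx


def make_toy_pairs(n=200, seed=1):
    """Hypothetical stand-in for ligand-target pairs: two Gaussian clusters."""
    rng = random.Random(seed)
    pool, labels = [], []
    for i in range(n):
        y = i % 2
        center = 2.5 if y else -2.5
        pool.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        labels.append(y)
    return pool, labels


if __name__ == "__main__":
    pool, labels = make_toy_pairs()
    model, picked = active_learning(pool, lambda i: labels[i], budget=40)
    hits = sum(predict(model, x) == y for x, y in zip(pool, labels))
    print(f"labeled {len(picked)} of {len(pool)} pairs, accuracy {hits / len(pool):.2f}")
```

The loop spends its labeling budget on the pairs the committee disputes most, which is one common way to build a usable model from a small subset of a larger pool. A real chemogenomic setting would replace the toy vectors with ligand and target descriptors and the nearest-centroid model with a stronger learner.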
Efficient identification of chemical probes for the manipulation and understanding of biological systems demands specificity for target proteins. Computational means to optimize candidate compound selection for experimental selectivity evaluation are being sought. The active learning virtual screening method has demonstrated the ability to efficiently converge on predictive models with reduced datasets, though its applicability domain for probe identification has yet to be determined. In this article, we challenge active learning’s ability to predict inhibitory bioactivity profiles of selective compounds when learning from chemogenomic features found in non-selective ligand-target pairs. Comparison of controls against multiple molecule representations deconvolutes the factors contributing to predictive capability. Experiments using the matrix metalloproteinase family demonstrate that maximum probe bioactivity prediction is achieved from only approximately 20% of the non-probe bioactivity data; this data volume is consistent with prior chemogenomic active learning studies despite the increased difficulty of the chemical biology experimental settings used here. Feature weight analyses are combined with a custom visualization to detail unambiguously how active learning arrives at classification decisions, yielding clarified expectations for chemogenomic modeling. The results inform tactical decisions for computational probe design and discovery.
Inflammatory cytokines are key signaling molecules that can promote an immune response; thus, their RNA turnover must be tightly controlled during infection. Most studies have investigated RNA decay pathways in the cytosol or nucleoplasm, but none has focused on the nucleolus. Although this organelle has well-studied roles in ribosome biogenesis and cellular stress sensing, the mechanism of RNA decay within the nucleolus is not completely understood. Here, we report that the nucleolus is an essential site of inflammatory pre-mRNA instability during infection. RNA-sequencing analysis reveals that inflammatory genes not only have higher intronic read densities than non-inflammatory genes, but that their pre-mRNAs are also highly enriched in nucleoli during infection. Notably, nucleolin (NCL) acts as a guide factor that recruits cytosine or uracil (C/U)-rich sequence-containing inflammatory pre-mRNAs and the Rrp6-exosome complex to the nucleolus through a physical interaction, thereby enabling targeted RNA delivery to Rrp6-exosomes and subsequent degradation. Consequently, Ncl depletion causes aberrant hyperinflammation, resulting in severe lethality in response to LPS. Importantly, the dynamics of NCL post-translational modifications determine its functional activity across the phases of the LPS response. This process represents a nucleolus-dependent pathway for maintaining inflammatory gene expression integrity and immunological homeostasis during infection.
The male-specific region of the human Y chromosome (MSY) comprises 95% of the chromosome's length and is functionally active. This region is inherited en bloc from father to male offspring. Most of the genes in the MSY are involved in male-specific functions, such as sex determination and spermatogenesis; the region also contains genes that are probably involved in other cellular functions. However, a detailed characterization of numerous MSY-encoded proteins remains to be done. In this study, 12 uncharacterized MSY proteins were analyzed with bioinformatics tools for structural and functional characterization. Within these 12 proteins, a total of 55 domains were found, with the DnaJ domain signature being the most frequent (11%), followed by the FAD-dependent pyridine nucleotide reductase signature and the fumarate lyase superfamily signature (9% each). The 3D structures of the selected proteins were built using homology modeling and protein threading approaches. Detailed stereochemical evaluation of the predicted structures indicated models of reasonably good quality. Furthermore, the predicted functions of these proteins and their interaction partners helped establish their biological roles and their mechanisms of action at the molecular level. The results of these structural and functional annotations provide a comprehensive view of the proteins encoded by the MSY and shed light on their biological functions and molecular mechanisms. The data presented in this study may assist in the future prognosis of several human diseases, such as Turner syndrome, gonadal sex reversal, spermatogenic failure, and gonadoblastoma.