Assigning functions to the vast array of proteins present in eukaryotic cells remains challenging. To identify relationships between proteins, and thereby enable functional annotations of proteins, we determined changes of abundance of 10,323 human proteins in response to 294 biological perturbations using isotope-labelling mass spectrometry. We applied the machine learning algorithm treeClust to reveal functional associations between co-regulated human proteins from ProteomeHD, a compilation of our own data and datasets from the Proteomics Identifications (PRIDE) database. This produced a co-regulation map of the human proteome. Co-regulation was able to capture relationships between proteins that do not physically interact or co-localize. For example, co-regulation of the peroxisomal membrane protein PEX11β with mitochondrial respiration factors led us to discover an organelle interface between peroxisomes and mitochondria.
Genes are not randomly distributed in the genome. In humans, 10% of protein‐coding genes are transcribed from bidirectional promoters and many more are organised in larger clusters. Intriguingly, neighbouring genes are frequently coexpressed but rarely functionally related. Here we show that coexpression of bidirectional gene pairs, and close‐by genes in general, is buffered at the protein level. Taking into account the 3D architecture of the genome, we find that co‐regulation of spatially close, functionally unrelated genes is pervasive at the transcriptome level, but does not extend to the proteome. We present evidence that non‐functional mRNA coexpression in human cells arises from stochastic chromatin fluctuations and direct regulatory interference between spatially close genes. Protein‐level buffering likely reflects a lack of coordination of post‐transcriptional regulation of functionally unrelated genes. Grouping human genes together along the genome sequence, or through long‐range chromosome folding, is associated with reduced expression noise. Our results support the hypothesis that the selection for noise reduction is a major driver of the evolution of genome organisation.
Data-independent acquisition proteomics was used to study proteome changes of naive human neutrophils in rare monogenic diseases affecting their functions. Neutrophils of patients with mutations in the neutrophil elastase gene ELANE demonstrated global proteome dysregulation, whereas chronic granulomatous disease and leukocyte adhesion deficiency had modest effects on the respective neutrophil proteomes. Proteomics then guided targeted genetic assays to resolve two clinical cases with undetermined genetic causes, highlighting the usefulness of mass spectrometry-based clinical diagnostics.
Animal pharmacokinetic (PK) data as well as human and animal in vitro systems are utilized in drug discovery to define the rate and route of drug elimination. Accurate prediction and mechanistic understanding of drug clearance and disposition in animals provide a degree of confidence for extrapolation to humans. In addition, prediction of in vivo properties can be used to improve design during drug discovery, help select compounds with better properties, and reduce the number of in vivo experiments. In this study, we generated machine learning models able to predict rat in vivo PK parameters and concentration–time PK profiles based on the molecular chemical structure and either measured or predicted in vitro parameters. The models were trained on internal in vivo rat PK data for over 3000 diverse compounds from multiple projects and therapeutic areas, and the predicted endpoints include clearance and oral bioavailability. We compared the performance of various traditional machine learning algorithms and deep learning approaches, including graph convolutional neural networks. The best models for PK parameters achieved R² = 0.63 [root mean squared error (RMSE) = 0.26] for clearance and R² = 0.55 (RMSE = 0.46) for bioavailability. The models provide a fast and cost-efficient way to guide the design of molecules with optimal PK profiles, to enable the prediction of virtual compounds at the point of design, and to drive prioritization of compounds for in vivo assays.
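For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how R² (coefficient of determination) and RMSE are commonly computed from observed and predicted values. The function name `r2_and_rmse` and the sample numbers are hypothetical, and the abstract does not state which R² definition or data transformation the authors used, so this should be read as an illustration of the standard formulas only.

```python
import math


def r2_and_rmse(observed, predicted):
    """Return (R^2, RMSE) for paired observed and predicted values.

    Uses the coefficient-of-determination definition
    R^2 = 1 - SS_res / SS_tot. This is an assumption: the abstract
    does not say which definition the authors applied.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    mean_observed = sum(observed) / n

    # Residual sum of squares: error left after prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse


# Hypothetical values, chosen only to exercise the function.
example_observed = [1.0, 2.0, 3.0]
example_predicted = [2.0, 2.0, 2.0]
example_r2, example_rmse = r2_and_rmse(example_observed, example_predicted)
```

A model that always predicts the mean of the observations, as in the example values, gives R² = 0, which is why R² is often read as the improvement over that trivial baseline.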
To provide a comprehensive analysis of small molecule genotoxic potential we have developed and validated an automated, high-content, high throughput, image-based in vitro Micronucleus (IVM) assay. This assay simultaneously assesses micronuclei and multiple additional cellular markers associated with genotoxicity. Acoustic dosing (≤ 2 mg) of compound is followed by a 24-h treatment and a 24-h recovery period. Confocal images are captured [Cell Voyager CV7000 (Yokogawa, Japan)] and analysed using Columbus software (PerkinElmer). As standard, the assay detects micronuclei (MN), cytotoxicity and cell-cycle profiles from Hoechst phenotypes. Mode of action information is primarily determined by kinetochore labelling in MN (aneugenicity) and γH2AX foci analysis (a marker of DNA damage). Applying computational approaches and implementing machine learning models alongside Bayesian classifiers allows the aneugenic, clastogenic and negative compounds within the data set to be identified with 95% accuracy (Matthews correlation coefficient: 0.9), reducing analysis time by 80% whilst concurrently minimising human bias. Combining high throughput screening, multiparametric image analysis and machine learning approaches has provided the opportunity to revolutionise early Genetic Toxicology assessment within AstraZeneca. By multiplexing assay endpoints and minimising data generation and analysis time, this assay enables complex genotoxicity safety assessments to be made sooner, aiding the development of safer drug candidates.
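The Matthews correlation coefficient quoted in this abstract summarises agreement between true and predicted classes in a single number between -1 and 1. The sketch below shows one common multi-class generalisation of the coefficient, applied to three labels that mirror the abstract's classes. The function name `multiclass_mcc` and the example labels are hypothetical, and the abstract does not say which formulation the authors used, so this is an illustration and not their method.

```python
import math
from collections import Counter


def multiclass_mcc(true_labels, predicted_labels):
    """Multi-class Matthews correlation coefficient.

    With s = number of samples, c = number of correct predictions,
    t_k = times class k truly occurs, p_k = times class k is predicted:

        MCC = (c*s - sum_k p_k*t_k)
              / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))

    Returns 0.0 when the denominator is zero (for example when only
    one class is ever predicted), a common convention.
    """
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")

    s = len(true_labels)
    c = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)

    classes = set(true_counts) | set(pred_counts)
    sum_pt = sum(pred_counts[k] * true_counts[k] for k in classes)
    sum_pp = sum(v * v for v in pred_counts.values())
    sum_tt = sum(v * v for v in true_counts.values())

    denominator = math.sqrt((s * s - sum_pp) * (s * s - sum_tt))
    if denominator == 0:
        return 0.0
    return (c * s - sum_pt) / denominator


# Hypothetical labels, not data from the study.
example_true = ["aneugen", "clastogen", "negative", "aneugen"]
example_pred = ["aneugen", "clastogen", "negative", "aneugen"]
example_mcc = multiclass_mcc(example_true, example_pred)
```

Unlike plain accuracy, the coefficient stays informative when classes are imbalanced, which is one reason it is often reported alongside an accuracy figure.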
Background and Purpose: Functional brain imaging using genetically encoded Ca²⁺ sensors in larval zebrafish is being developed for studying seizures and epilepsy as a more ethical alternative to rodent models. Despite this, few data have been generated on pharmacological mechanisms of action other than GABA-A antagonism. Assessing larval responsiveness across multiple mechanisms is vital to test the translational power of this approach, as well as assessing its validity for detecting unwanted drug-induced seizures and testing antiepileptic drug efficacy. Experimental Approach: Using light-sheet imaging, we systematically analysed the responsiveness of 4 days post fertilisation (dpf; which are not considered protected under European animal experiment legislation) transgenic larval zebrafish to treatment with 57 compounds spanning more than 12 drug classes with a link to seizure generation in mammals, alongside eight compounds with no such link. Key Results: We show 4 dpf zebrafish are responsive to a wide range of mechanisms implicated in seizure generation, with cerebellar circuitry activated regardless of the initiating pharmacology. Analysis of functional connectivity revealed compounds targeting cholinergic and monoaminergic reuptake, in particular, showed phenotypic
Subcellular localization is an important aspect of protein function, but the protein composition of many intracellular compartments is poorly characterized. For example, many nuclear bodies are challenging to isolate biochemically and thus remain inaccessible to proteomics. Here, we explore covariation in proteomics data as an alternative route to subcellular proteomes. Rather than targeting a structure of interest biochemically, we target it by machine learning. This becomes possible by taking data obtained for one organelle and searching it for traces of another organelle. As an extreme example and proof‐of‐concept we predict mitochondrial proteins based on their covariation in published interphase chromatin data. We detect about ⅓ of the known mitochondrial proteins in our chromatin data, presumably most as contaminants. However, these proteins are not present at random. We show covariation of mitochondrial proteins in chromatin proteomics data. We then exploit this covariation by multiclassifier combinatorial proteomics to define a list of mitochondrial proteins. This list agrees well with different databases on mitochondrial composition. This benchmark test raises the possibility that, in principle, covariation proteomics may also be applicable to structures for which no biochemical isolation procedures are available.