Induced pluripotent stem cell (iPSC) technology has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterisation of many existing iPSC lines limits their potential use for research and therapy. Here, we describe the systematic generation, genotyping and phenotyping of 711 iPSC lines derived from 301 healthy individuals by the Human Induced Pluripotent Stem Cells Initiative (HipSci: http://www.hipsci.org). Our study outlines the major sources of genetic and phenotypic variation in iPSCs and establishes their suitability as models of complex human traits and cancer. Through genome-wide profiling we find that 5-46% of the variation in different iPSC phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. Additionally, we assess the phenotypic consequences of rare, genomic copy number mutations that are repeatedly observed in iPSC reprogramming and present a comprehensive map of common regulatory variants affecting the transcriptome of human pluripotent cells.
Multiplexing strategies for large-scale proteomic analyses have become increasingly prevalent, tandem mass tags (TMT) in particular. Here we used a large iPSC proteomic experiment with twenty-four 10-plex TMT batches to evaluate the effect of integrating multiple TMT batches within a single analysis. We identified a significant inflation rate of protein missing values as multiple batches are integrated and show that this pattern is aggravated at the peptide level. We also show that without normalization strategies to address the batch effects, the high precision of quantitation within a single multiplexed TMT batch is not reproduced when data from multiple TMT batches are integrated. Further, the incidence of false positives was studied by using Y chromosome peptides as an internal control. The iPSC lines quantified in this data set were derived from both male and female donors, hence the peptides mapped to the Y chromosome should be absent from female lines. Nonetheless, these Y chromosome-specific peptides were consistently detected in the female channels of all TMT batches. We then used the same Y chromosome-specific peptides to quantify the level of ion coisolation as well as the effect of primary and secondary reporter ion interference. These results were used to propose solutions to mitigate the limitations of multi-batch TMT analyses. We confirm that including a common reference line in every batch increases precision by facilitating normalization across the batches and we propose experimental designs that minimize the effect of cross population reporter ion interference.
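To make the idea of a common reference line concrete, the following is a minimal sketch of reference-channel normalization, not the authors' pipeline. It assumes each batch reports one intensity per protein per channel and that one channel in every batch carries the same reference line. The function name `normalize_to_reference`, the channel label `"ref"`, and the toy intensities are hypothetical.

```python
import math


def normalize_to_reference(batches, reference_channel):
    """Express each channel as a log2 ratio to the batch's reference channel.

    batches: {batch_id: {protein_id: {channel_name: intensity}}}
    Proteins with a missing or non-positive reference intensity are skipped,
    because no ratio can be formed for them in that batch.
    """
    normalized = {}
    for batch_id, proteins in batches.items():
        normalized[batch_id] = {}
        for protein_id, channels in proteins.items():
            ref = channels.get(reference_channel)
            if ref is None or ref <= 0:
                continue
            normalized[batch_id][protein_id] = {
                channel: math.log2(value / ref)
                for channel, value in channels.items()
                if channel != reference_channel and value > 0
            }
    return normalized


# Toy data (hypothetical): the same line measured in two batches whose
# overall signal differs fourfold.
batches = {
    "batch1": {"P1": {"ref": 100.0, "lineA": 200.0}},
    "batch2": {"P1": {"ref": 400.0, "lineA": 800.0}},
}
result = normalize_to_reference(batches, "ref")
```

In this toy case the raw intensities of `lineA` differ fourfold between batches, but the ratio to the shared reference is the same in both, which is why a reference channel makes values comparable across batches. Real data would also need handling of missing values and of reporter ion interference, which this sketch ignores.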
PHD1 belongs to the family of prolyl-4-hydroxylases (PHDs) that is responsible for posttranslational modification of prolines on specific target proteins. Because PHD activity is sensitive to oxygen levels and certain byproducts of the tricarboxylic acid cycle, PHDs act as sensors of the cell’s metabolic state. Here, we identify PHD1 as a critical molecular link between oxygen sensing and cell-cycle control. We show that PHD1 function is required for centrosome duplication and maturation through modification of the critical centrosome component Cep192. Importantly, PHD1 is also required for primary cilia formation. Cep192 is hydroxylated by PHD1 on proline residue 1717. This hydroxylation is required for binding of the E3 ubiquitin ligase SCFSkp2, which ubiquitinates Cep192, targeting it for proteasomal degradation. By modulating Cep192 levels, PHD1 thereby affects the processes of centriole duplication and centrosome maturation and contributes to the regulation of cell-cycle progression.
Membraneless organelles are sites for RNA biology including small non-coding RNA (ncRNA) mediated gene silencing. How small ncRNAs utilise phase separated environments for their function is unclear. We investigated how the PIWI-interacting RNA (piRNA) pathway engages with the membraneless organelle P granule in Caenorhabditis elegans. Proteomic analysis of the PIWI protein PRG-1 reveals an interaction with the constitutive P granule protein DEPS-1. DEPS-1 is not required for piRNA biogenesis but is required for piRNA-dependent silencing: deps-1 mutants fail to produce the secondary endo-siRNAs required for the silencing of piRNA targets. We identify a motif on DEPS-1 which mediates a direct interaction with PRG-1. DEPS-1 and PRG-1 form intertwining clusters to build elongated condensates in vivo which are dependent on the Piwi-interacting motif of DEPS-1. Additionally, we identify EDG-1 as an interactor of DEPS-1 and PRG-1. Our study reveals how specific protein-protein interactions drive the spatial organisation and piRNA-dependent silencing within membraneless organelles.
Proteomics studies typically analyze proteins at a population level, using extracts prepared from tens of thousands to millions of cells. The resulting measurements correspond to average values across the cell population and can mask considerable variation in protein expression and function between individual cells or organisms. Here, we report the development of micro‐proteomics for the analysis of Caenorhabditis elegans, a eukaryote composed of 959 somatic cells and ∼1500 germ cells, measuring the worm proteome at a single organism level to a depth of ∼3000 proteins. This includes detection of proteins across a wide dynamic range of expression levels (>6 orders of magnitude), including many chromatin‐associated factors involved in chromosome structure and gene regulation. We apply the micro‐proteomics workflow to measure the global proteome response to heat‐shock in individual nematodes. This shows variation between individual animals in the magnitude of proteome response following heat‐shock, including variable induction of heat‐shock proteins. The micro‐proteomics pipeline thus facilitates the investigation of stochastic variation in protein expression between individuals within an isogenic population of C. elegans. All data described in this study are available online via the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd), an open access, searchable database resource.
Human disease phenotypes are ultimately driven primarily by alterations in protein expression and/or function. To date, relatively little is known about the variability of the human proteome in populations and how this relates to variability in mRNA expression and to disease loci. Here, we present the first comprehensive proteomic analysis of human induced pluripotent stem cells (iPSC), a key cell type for disease modelling, analysing 202 iPSC lines derived from 151 donors, with integrated transcriptome and genomic sequence data from the same lines. We characterised the major genetic and non-genetic determinants of proteome variation across iPSC lines and assessed key regulatory mechanisms affecting variation in protein abundance. We identified 654 protein quantitative trait loci (pQTLs) in iPSCs, including disease-linked variants in protein coding sequences and variants with trans regulatory effects. These include pQTL linked to GWAS variants that cannot be detected at the mRNA level, highlighting the utility of dissecting pQTL at peptide level resolution.
Induced pluripotent stem cell (iPSC) technology has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterisation of many existing iPSC lines limits their potential use for research and therapy. Here, we describe the systematic generation, genotyping and phenotyping of 522 open access human iPSCs derived from 189 healthy male and female individuals as part of the Human Induced Pluripotent Stem Cells Initiative (HipSci: http://www.hipsci.org). Our study provides a comprehensive picture of the major sources of genetic and phenotypic variation in iPSCs and establishes their suitability for use in genetic studies of complex human traits and cancer. Using a combination of genome-wide analyses we find that 5-25% of the variation in different iPSC phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. We also assess the phenotypic effects of rare, genomic copy number mutations that are recurrently seen following iPSC reprogramming and present an initial map of common regulatory variants affecting the transcriptome of pluripotent cells in humans.
Realising the potential of human induced pluripotent stem cell (iPSC) technology for drug discovery, disease modelling and cell therapy requires an understanding of variability across iPSC lines. While previous studies have characterized iPS cell lines genetically and transcriptionally, little is known about the variability of the iPSC proteome. Here, we present the first comprehensive proteomic iPSC dataset, analysing 202 iPSC lines derived from 151 donors. We characterise the major genetic determinants affecting proteome and transcriptome variation across iPSC lines and identify key regulatory mechanisms affecting variation in protein abundance. Our data identified >700 human iPSC protein quantitative trait loci (pQTLs). We mapped trans regulatory effects, identifying an important role for protein-protein interactions. We discovered that pQTLs show increased enrichment in disease-linked GWAS variants, compared with RNA-based eQTLs.
Induced pluripotent stem cells (iPSC) hold enormous promise for advancing basic research and biomedicine. By enabling the in vitro reconstitution of development and cell differentiation, iPS cells allow the investigation of mechanisms underlying development and the aetiology of many forms of genetic disease. To realize this potential, it is essential to characterize how genetic and non-genetic effects in human iPSCs influence molecular and cellular phenotypes.
Recently, the establishment of population reference panels of normal human iPSC lines [1-3] has provided valuable resources for functional experiments in different genetic backgrounds. Additionally, these data have yielded detailed characterizations of the iPS transcriptome, identifying thousands of cis expression Quantitative Trait Loci (eQTL) [1,4,5], including at disease-relevant loci. While these RNA-based analyses are informative for studying mechanisms affecting gene regulation at the transcriptional level, most cellular phenotypes involve mechanisms acting downstream, at the protein level. Evidence in other contexts, including in lymphoblast cell lines and in cancer, points to substantial differences in the genetic regulation of protein and RNA traits, identifying protein QTL [6-9] and assessing the extent of buffering of genetic effects between layers [10,11]. However, existing protein datasets have been limited by scale (i.e. number of samples) or resolution (i.e. number of proteins, availability of RNA data). Importantly, no population-scale proteome datasets have been generated from human pluripotent cells.
Here, we report on the first comprehensive, population-scale, combined proteomics and gene expression analysis in human iPSC lines. Our data comprise matched quantitative proteomic (TMT Mass Spectrometry) and transcriptomic (RNA-seq) profiles of 202 iPSC lines, derived from 151 donors that are part of the HipSci project [1]. We identify both genetic and non-genetic effects causing variability in protein expressi...