2021
DOI: 10.15252/msb.202010016

hu.MAP 2.0: integration of over 15,000 proteomic experiments builds a global compendium of human multiprotein assemblies

Abstract: A general principle of biology is the self-assembly of proteins into functional complexes. Characterizing their composition is, therefore, required for our understanding of cellular functions. Unfortunately, we lack knowledge of the comprehensive set of identities of protein complexes in human cells. To address this gap, we developed a machine learning framework to identify protein complexes in over 15,000 mass spectrometry experiments which resulted in the identification of nearly 7,000 physical assemblies. W…

Citations: cited by 96 publications (90 citation statements)
References: 66 publications
“…To provide the information about known physical protein-protein interactions of each hit with the query, we used the confidence values for interactions from BioGRID ( 23 ) and Hu.Map 2.0 ( 24 ), as well as predictions of protein complexes from Hu.Map 2.0.…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…This new web server reveals functional relationships between individual domains beyond the detection based on whole-protein sequences. In addition, DEPCOD introduces a combination of methodological and functional features, including: (i) expanded scope of species in evolutionary profiles at two scales: 244 eukaryotic genomes or 506 genomes from all three domains of life: Eukaryota , Bacteria , and Archaea ; (ii) incorporation of phylogeny of the searched genomes into the correlation of conservation patterns; (iii) visualization of known BioGRID ( 23 ) and hu.MAP ( 24 ) interactions and shared protein complexes between the query and the detected hits; (iv) analysis of GO ( 25 , 26 ), KEGG ( 27 , 28 ) and Reactome ( 29 ) pathway enrichment among the detected hits and (v) visualization of details and sources of detected patterns similarities (conservation values for individual species, taxonomic trees, links to the information about detected domains and domain families, etc).…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…( 31 ) have used the Complex Portal as a verification dataset in their analysis of potential transcription cofactors. By combining inferred complexes from hu.MAP 2.0 ( 32 ), curated complexes from CORUM ( 33 ) and curated physical interaction data from IntAct [this volume NAR paper] and BioGrid ( 34 ) with a selected set of transcription-related Gene Ontology terms we have identified more than 1500 putative transcription cofactors. 415 of these are already participants of complexes in Complex Portal, and the remaining proteins will be curated into Complex Portal if they are identified as components of complexes.…”
Section: Collaboration and Community Involvement | Citation type: mentioning
confidence: 99%
“…Hence, integrating PPIs across multiple experiments, as for e.g. the networks hu.MAP 1.0 [6] and hu.MAP 2.0 [7] that integrate over 9,000 and 15,000 mass spectrometry experiments respectively from AP/MS [8]- [11] and CF/MS data [12]- [15], can help to mitigate the effects of experimental errors. Combining such approaches with algorithms to cluster proteins and identify complexes from the PPI network should result in more accurate determination of protein complexes.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…All three methods, SCI-SVM, SCI-BN, and ClusterSS use a greedy heuristic algorithm for selecting the neighbor to add to the subgraph in the growth process, with ClusterSS considering only the top neighbors by degree for speed improvements. However, since the methods use serial candidate community sampling, this negatively impacts their scalability to large networks like hu.MAP 1.0 [6] with ~8k proteins and ~60k interactions, and hu.MAP 2.0 [7] with ~10k proteins and over 40k interactions. To combat this, Super.Complex (supervised complex detection algorithm) was developed for high scalability and accuracy [24].…”
Section: Introduction | Citation type: mentioning
confidence: 99%