2024
DOI: 10.1101/2024.01.02.573913
Preprint

Genomics 2 Proteins portal: A resource and discovery tool for linking genetic screening outputs to protein sequences and structures

Seulki Kwon,
Jordan Safer,
Duyen T. Nguyen
et al.

Abstract: Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics technologies have enabled the detection and generation of variants at an unprecedented scale. However, efficient tools and resources are needed to link these two disparate data types – to “map” variants onto protein structures, to better understand how the variation causes disease and thereby design therapeutics. Here we present the Genomics 2 Proteins Porta…

Cited by 2 publications (1 citation statement)
References 84 publications (192 reference statements)
“…By comparing the embedding representations of all 880 HCN1 missense variants to a set of protein features collected from UniProt (25) and the Genomics 2 Proteins Portal (26), we find that embeddings of the subset of variants that are located in disordered regions of the protein (397 variants) are clustered in the embedding space of ESM-1b, but not that of ESM-2, when visually inspecting the first two principle components of each embedding space (see Figure 2 B). Interestingly, 100% of the 397 variants located in the disordered regions are putatively benign population variants from the gnomAD database.…”
Section: Introduction
Citation type: mentioning
Confidence: 99%