2022
DOI: 10.1101/2022.09.18.508433
Preprint

SHEPHARD: a modular and extensible software architecture for analyzing and annotating large protein datasets

Abstract: The emergence of high-throughput experiments and high-resolution computational predictions has led to an explosion in the quality and volume of protein sequence annotations at proteomic scales. Unfortunately, integrating and analyzing complex sequence annotations remains logistically challenging. Here we present SHEPHARD, a software package that makes large-scale integrative protein bioinformatics trivial. SHEPHARD is provided as a stand-alone package and with a pre-compiled set of human annotations in a Googl…

Cited by 6 publications (7 citation statements)
References: 44 publications
“…The protein sequence analysis was conducted using SHEPHARD, a Python-based framework designed for integrating and analyzing large-scale amino acid sequence properties [124]. IDRs were predicted and annotated using metapredict (V2) [125].…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Protein abundance analyses (Fig. S7–11) were performed using SHEPHARD [124] and sparrow (https://github.com/idptools/sparrow). Mass spectrometry data were obtained for humans [128], X. laevis [129], A. thaliana [130], E. coli [131], S. pombe [132], and S. cerevisiae [133].…”
Section: Methods; citation type: mentioning
confidence: 99%
“…To this end, we calculated all homotypic ɛ values for all human IDRs that possess one or more phosphosite (19,703 IDRs) before and after making phosphomimetic mutations. Only experimentally-reported phosphosites (S/T/Y) were used, and in all cases, were converted to E (glutamic acid) (67, 68). Interestingly, ∼57% of IDRs that undergo phosphorylation showed a reduction in homotypic attractive interaction upon phosphorylation (Fig.…”
Section: Results; citation type: mentioning
confidence: 99%
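
The phosphomimetic substitution described in the statement above (each reported S/T/Y phosphosite converted to E) can be sketched in plain Python. This is a minimal illustration only: the function name `phosphomimetic`, the 1-based position convention, and the toy sequence are assumptions made for the example, and the code does not use SHEPHARD's API or reproduce the citing authors' pipeline.

```python
def phosphomimetic(sequence, phosphosites):
    """Return a copy of `sequence` with each listed phosphosite replaced by E.

    `phosphosites` holds 1-based residue positions (an assumed convention).
    """
    residues = list(sequence)
    for position in phosphosites:
        # Reject positions that fall outside the sequence.
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        residue = residues[position - 1]
        # Only serine, threonine and tyrosine can be phosphosites here.
        if residue not in "STY":
            raise ValueError(f"position {position} is {residue}, not S/T/Y")
        residues[position - 1] = "E"  # glutamic acid as the phosphomimetic
    return "".join(residues)


# Hypothetical toy sequence with phosphosites at positions 2, 5 and 8.
mutant = phosphomimetic("GSGATGAYG", [2, 5, 8])
print(mutant)
```

Validating each position before substituting guards against annotations that do not match the sequence they are applied to.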
“…Rational sequence designs used for examining homopolymer vs. IDR properties were generated using GOOSE (124). Proteome-wide analysis was performed using SHEPHARD, with data obtained from UniProt (67, 125). We make extensive use of previously published experimental data and are indebted to the authors for their previously published careful biophysical and biochemical studies.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Proteome-wide bioinformatics was performed using SPARROW (https://github.com/idptools/sparrow) and SHEPHARD [66]. SPARROW is an in-development Python package for calculating IDR sequence properties, while SHEPHARD is a hierarchical analysis framework for annotating and analyzing large sets of protein sequences.…”
Section: Bioinformatics; citation type: mentioning
confidence: 99%