2016
DOI: 10.1093/nar/gkw1081

Uniclust databases of clustered and deeply annotated protein sequences and alignments

Abstract: We present three clustered protein sequence databases, Uniclust90, Uniclust50, Uniclust30 and three databases of multiple sequence alignments (MSAs), Uniboost10, Uniboost20 and Uniboost30, as a resource for protein sequence analysis, function prediction and sequence searches. The Uniclust databases cluster UniProtKB sequences at the level of 90%, 50% and 30% pairwise sequence identity. Uniclust90 and Uniclust50 clusters showed better consistency of functional annotation than those of UniRef90 and UniRef50, owi…

Cited by 618 publications (552 citation statements)
References 24 publications (24 reference statements)
“…Protein domains were identified using a representative set of RNA virus genomes, including representative members of ICTV-approved virus families and unclassified virus groups. This set was annotated manually using sensitive profile-profile comparisons with the HHsuite package (127), and hmm profiles for annotated proteins or their domains were generated by running one iteration of HHblits against the latest (October 2017) uniclust30 database (131). Each annotated profile was assigned to a functional category (e.g., "capsid protein_jelly-roll," "chymotrypsin-like protease").…”
Section: Methods (mentioning)
confidence: 99%
“…We searched for homologs with HHsearch against the PDB70 database, which has a maximum mutual sequence identity of 70% between proteins deposited in PDB, released on May 23, 2018. Sequence profiles were generated with HHblits by searching homologous sequences with the default options against Uniclust30, which is a clustered UniProtKB database at the level of 30% pairwise sequence identity, released in September 2016. We predicted bound ligands by considering the structural similarity of detected homologs.…”
Section: Methods (mentioning)
confidence: 99%
“…Gene models were predicted by BRAKER1 v1.11 (Hoff, Lange, Lomsadze, Borodovsky, & Stanke, 2016) using the soft-masked genome assembly and the STAR alignment file as inputs. Gene models were annotated by querying models against the Uniclust90 database (Mirdita et al., 2017) using MMseqs2 with an E-value < 1e-05.…”
Section: Gene Model Prediction and Annotation (mentioning)
confidence: 99%
“…Gene models were annotated by querying models against the Uniclust90 database (Mirdita et al., 2017) using MMseqs2 with an E-value < 1e-05. Gene Ontology (GO) terms associated with the representative UniProtKB sequence for each Uniclust90 hit were attributed to the A. tenebrosa gene model using the idmapping_selected.tab file provided by UniProtKB.…”
Section: Gene Model Prediction and Annotation (mentioning)
confidence: 99%
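
The annotation step described in the last statement (keep hits below an E-value cutoff, then attribute the GO terms of each hit to the gene model) could be sketched roughly as follows. This is a minimal illustration, not the citing authors' pipeline. It assumes a hypothetical three-column tab-separated hit table (query, target accession, E-value) rather than any specific MMseqs2 output layout, and it assumes that GO terms sit in the seventh column of idmapping_selected.tab as a semicolon-separated list. The sample rows are made up for the example.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5   # threshold quoted in the citation statement
GO_COLUMN = 6          # assumption: GO terms are the 7th tab-separated column

# Made-up sample data, for illustration only.
SAMPLE_IDMAPPING = (
    "P12345\tAAT_RABIT\t\t\t\t\tGO:0005739; GO:0004069\n"
    "Q00001\tFAKE_ID\t\t\t\t\t\n"
)
SAMPLE_HITS = (
    "gene1\tP12345\t1e-30\n"
    "gene1\tQ00001\t1e-10\n"
    "gene2\tP12345\t0.5\n"
)


def load_go_terms(idmapping_handle):
    """Map each UniProtKB accession to its list of GO terms."""
    go_by_accession = {}
    for row in csv.reader(idmapping_handle, delimiter="\t"):
        if len(row) <= GO_COLUMN:
            continue  # skip blank or short lines
        terms = [t.strip() for t in row[GO_COLUMN].split(";") if t.strip()]
        if terms:
            go_by_accession[row[0]] = terms
    return go_by_accession


def annotate(hits_handle, go_by_accession, cutoff=EVALUE_CUTOFF):
    """Collect GO terms per query from all hits with E-value below cutoff."""
    go_by_query = {}
    for row in csv.reader(hits_handle, delimiter="\t"):
        if len(row) < 3:
            continue
        query, target, evalue = row[0], row[1], float(row[2])
        if evalue >= cutoff:
            continue  # hit is not significant enough
        go_by_query.setdefault(query, set()).update(go_by_accession.get(target, []))
    # Sort terms so the output is deterministic.
    return {query: sorted(terms) for query, terms in go_by_query.items()}


go_terms = load_go_terms(io.StringIO(SAMPLE_IDMAPPING))
annotations = annotate(io.StringIO(SAMPLE_HITS), go_terms)
print(annotations)
```

With real files, the two `io.StringIO` objects would be replaced by open file handles. Queries whose only hits fail the cutoff (here `gene2`) receive no entry at all, which is a design choice of this sketch rather than something the statement specifies.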