2011
DOI: 10.1093/nar/gkr1079

NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy

Abstract: The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of genomic, transcript and protein sequence records. These records are selected and curated from public sequence archives and represent a significant reduction in redundancy compared to the volume of data archived by the International Nucleotide Sequence Database Collaboration. The database includes over 16 000 organisms, 2.4 × 10^6 genomic records, 13 × 10^6 proteins and 2 × 10^6 RNA records spanning pro…

Cited by 1,042 publications (991 citation statements)
References 14 publications
“…The N-terminal sequence (~ 240 aa) of E. coli MnmC (Uniprot id P77182) was used as a bait to recover bacterial homologous from RefSeq database [28]. A multiple sequence alignment of the MnmC(m) domain with 70 non-redundant sequences was constructed using iterative refinement methods [29].…”
Section: Methods (mentioning)
confidence: 99%
“…http://dx.doi.org/10.1101/022384 doi: bioRxiv preprint first posted online Jul. 12, 2015; (Pruitt et al 2012), a minimum E-value of 1e-6 was used and only best hits were considered. BLAST XML result file was imported in Blast2go.…”
Section: Cc-by-nc-nd (mentioning)
confidence: 99%
“…BLASTp alignments against the non-redundant RefSeq protein database (22 January 2014) (Ye et al 2006; Pruitt et al 2012), UniProt (22 January 2014) (Consortium 2014) (including Swiss-Prot, fungal taxonomic division and uniref90) and KEGG (Release 69.0, 1 January 2014) (Kanehisa et al 2002) were performed to assign general protein function profiles. Protein domains were assigned using InterProScan 5.2-45.0 (Quevillon et al 2005) (including Pfam 27.0 (Punta et al 2012), SUPERFAMILY 1.75 (Wilson et al 2009), SMART 7 (Letunic et al 2012), TIGRFAMs 13.0 (Haft et al 2013), TMHMM 2.0c (Krogh et al 2001), PROSITE 20.99 (Sigrist et al 2013) and PANTHER 8.1 (Mi et al 2013) databases).…”
Section: Methods (mentioning)
confidence: 99%