2007
DOI: 10.1093/nar/gkl842

NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

Abstract: NCBI's reference sequence (RefSeq) database is a curated non-redundant collection of sequences representing genomes, transcripts and proteins. The database includes 3774 organisms spanning prokaryotes, eukaryotes and viruses, and has records for 2 879 860 proteins (RefSeq release 19). RefSeq records integrate information from multiple sources, when additional data are available from those sources and therefore represent a current description of the sequence and its features. Annotations include coding regio…

Cited by 3,326 publications (2,393 citation statements)
References 15 publications
“…Then, we used dbSNP rsIDs for each nSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/) to generate a RefSeq identifier (Pruitt et al 2007). This information was used to map each nSNP onto the 44-species protein alignments available in the UCSC Genome Browser (Kuhn et al 2009).…”
Section: Methods (mentioning)
confidence: 99%
“…The genomic, mRNA and protein sequences were retrieved from the NCBI RefSeq (www.ncbi.nlm.nih.gov/projects/RefSeq) collection [Pruitt et al, 2007] and they are accessible for downloading in FASTA (text-based) format, while the variant records are downloadable in Excel (Microsoft, Redmond, WA) format. The short descriptions of diseases are organized according to their association with intermediate filament type and links are included to the respective protein information page, to references in the HIFD, as well to the Online Mendelian Inheritance in Man (OMIM) database.…”
Section: Database Content and Organization (mentioning)
confidence: 99%
“…Using the full-length sequences of S. cerevisiae Opy2p and Ste50p, we assembled a set of 30 Opy2p-like protein sequences and a set of 28 Ste50p-like protein sequences from fungal species by performing PSI-Blast searches (Altschul et al, 1997) against the NCBI RefSeq database (Pruitt et al, 2007). Multiple sequence alignments within each set were derived with the L-INS-i iterative refinement method of the MAFFT 6 program (Katoh and Toh, 2008), and analyzed with Jalview 2.4 (Waterhouse et al, 2009).…”
Section: Bioinformatic Analysis Of Ste50p-ra Domain and The Fid Regio… (mentioning)
confidence: 99%