2013
DOI: 10.1371/journal.pone.0058815

Development of Strategies for SNP Detection in RNA-Seq Data: Application to Lymphoblastoid Cell Lines and Evaluation Using 1000 Genomes Data

Abstract: Next-generation RNA sequencing (RNA-seq) maps and analyzes transcriptomes and generates data on sequence variation in expressed genes. There are few reported studies on analysis strategies to maximize the yield of quality RNA-seq SNP data. We evaluated the performance of different SNP-calling methods following alignment to both genome and transcriptome by applying them to RNA-seq data from a HapMap lymphoblastoid cell line sample and comparing results with sequence variation data from 1000 Genomes. We determin…

Citations: cited by 120 publications (106 citation statements)
References: 44 publications
“…To increase the confidence of somatic mutation calling, a minimum base Phred quality score of 20 [20] is used to filter out low quality reads and bases. A recent report suggests that only >10X coverage is required to ensure 89% accuracy and 92% sensitivity for single nucleotide variations (SNVs) [21]. Thus, a minimum depth of 10 from both tumor data and normal data is required for GLMVC to consider the candidate base.…”
Section: Methods; citation type: mentioning
confidence: 99%
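The thresholds in the statement above (base Phred quality of at least 20, depth of at least 10 in both tumor and normal data) can be illustrated with a minimal Python sketch. It is not GLMVC's implementation. The function names are hypothetical, and the choice to count depth only over bases that pass the quality threshold is an assumption the statement does not confirm. The Phred relation Q = -10 log10(p) is standard, so Q20 corresponds to a 1% error probability.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def usable_depth(base_qualities, min_base_quality=20):
    """Count bases at one position whose Phred quality meets the threshold.

    Assumption: depth is counted after quality filtering.
    """
    return sum(1 for q in base_qualities if q >= min_base_quality)


def passes_depth_filter(tumor_qualities, normal_qualities,
                        min_base_quality=20, min_depth=10):
    """Return True if both samples have enough high-quality bases at a position."""
    tumor_ok = usable_depth(tumor_qualities, min_base_quality) >= min_depth
    normal_ok = usable_depth(normal_qualities, min_base_quality) >= min_depth
    return tumor_ok and normal_ok


if __name__ == "__main__":
    tumor = [30] * 10 + [5] * 5    # 10 usable bases, 5 low-quality bases
    normal = [25] * 9 + [10] * 5   # only 9 usable bases
    print(passes_depth_filter(tumor, normal))
```

In this usage example the position is rejected because the normal sample has only nine bases at or above Q20.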
“…These QC steps have been shown to reduce the likelihood of false positives and biases in observed mutant allele frequencies during mutation calling using RNA-seq data [15]. We calculated allelic counts at each single-nucleotide variant position in each sample using the SAMtools mpileup function with default parameters, except for the "-A" option and the "-l" option to specify the list of mutation coordinates, which we took, for each sample, from the list of validated mutations in our prior study [9]. We recorded the number of overlapping sequences containing the mutant and wild-type (WT) allele, as identified in the previous study for each position and sample.…”
Citation type: mentioning
confidence: 99%
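The allele-counting step described in the statement above can be sketched in Python as follows. The sketch does not call SAMtools. It assumes the base calls covering one position have already been extracted into a list, and the function name and return fields are hypothetical.

```python
from collections import Counter


def allele_counts(observed_bases, wt_allele, mutant_allele):
    """Count reads supporting the wild-type and mutant allele at one position.

    observed_bases: iterable of single-character base calls (case-insensitive).
    Bases matching neither allele are ignored in the mutant fraction.
    """
    counts = Counter(base.upper() for base in observed_bases)
    wt = counts[wt_allele.upper()]
    mutant = counts[mutant_allele.upper()]
    informative = wt + mutant
    # The fraction is undefined when no read carries either allele.
    fraction = mutant / informative if informative else None
    return {"wt": wt, "mutant": mutant, "mutant_fraction": fraction}


if __name__ == "__main__":
    # Hypothetical base calls at a known A>G mutation site.
    print(allele_counts(list("AAAGGaT"), wt_allele="A", mutant_allele="G"))
```

Restricting the denominator to reads that carry one of the two expected alleles is a design choice made for this sketch, and the cited study may have computed frequencies differently.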
“…Preservation of the reading frame ensures correct calling of gene structure (16–18) and SNPs, whether in, for example, the realignment phase of an exome sequencing pipeline (19), single-nucleotide polymorphism (SNP) calling from existing RNA-seq data (20,21) or amplicon-based analyses such as human immunodeficiency virus (HIV) drug resistance genotyping (22). When aligning coding DNA generated using high-throughput sequencing platforms, it is critical that codons present in the open reading frame remain intact and that codon-sized insertions and deletions are recognized and called correctly, as distinct from both genuine frameshifts and single indels created through sequencing error.…”
Section: Introduction; citation type: mentioning
confidence: 99%
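The distinction the statement above draws between codon-sized indels and frame-disrupting ones comes down to whether the indel length is a multiple of three. The Python sketch below is illustrative only, with a hypothetical function name. Length alone cannot separate a genuine frameshift from an indel created by sequencing error, so the sketch reports only whether the reading frame is preserved.

```python
def classify_indel(length):
    """Classify an insertion or deletion in coding sequence by its length.

    Lengths that are multiples of three add or remove whole codons and keep
    the downstream reading frame intact. Any other length shifts the frame.
    """
    if length <= 0:
        raise ValueError("indel length must be a positive integer")
    return "in-frame" if length % 3 == 0 else "frame-disrupting"


if __name__ == "__main__":
    for n in (1, 2, 3, 6, 7):
        print(n, classify_indel(n))
```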