2016
DOI: 10.1101/051755
Preprint

UMI-tools: Modelling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy

Abstract: Unique Molecular Identifiers (UMIs) are random oligonucleotide barcodes that are increasingly used in high-throughput sequencing experiments. Through a UMI, identical copies arising from distinct molecules can be distinguished from those arising through PCR amplification of the same molecule. However, bioinformatic methods to leverage the information from UMIs have yet to be formalised. In particular, sequencing errors in the UMI sequence are often ignored, or else resolved in an ad-hoc manner. We show that er…

Citations: cited by 55 publications (62 citation statements)
References: 22 publications
“…First, we combined all reads for a given single cell using samtools [42]. Next, we converted read counts to molecule counts using UMI-tools [43]. UMI-tools counts the number of UMIs at each read start position.…”
Section: Methods (mentioning)
confidence: 99%
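
The counting step described in this statement can be illustrated with a minimal sketch. It assumes reads have already been reduced to (start position, UMI) pairs and simply counts distinct UMI sequences per position; the function name and the input layout are hypothetical, and this is not UMI-tools' own implementation, which the abstract indicates also models sequencing errors in the UMI.

```python
from collections import defaultdict


def count_umis_per_position(reads):
    """Count distinct UMI sequences at each read start position.

    `reads` is an iterable of (start_position, umi) pairs. This layout is
    an assumption made for illustration, not a real file format.
    """
    umis_at = defaultdict(set)
    for position, umi in reads:
        # Reads sharing a position and a UMI are treated as PCR copies
        # of one molecule, so a set keeps each UMI once.
        umis_at[position].add(umi)
    return {position: len(umis) for position, umis in umis_at.items()}


# Toy input: four reads, two of which are duplicates at position 100.
reads = [(100, "ACGT"), (100, "ACGT"), (100, "TTGA"), (250, "ACGT")]
molecule_counts = count_umis_per_position(reads)
```

In this toy input the four reads collapse to three molecules, because the two reads at position 100 that carry the same UMI are counted once.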
“…The incorporation of UMIs before PCR allows distinguishing coincidental fragmentation duplicates from true PCR duplicates among reads that align to the same position. However, as reported previously [37], the simplistic approach of allowing only one hit per UMI per position also fails when the read density is high and the number of possible UMIs is low, such that even duplicates with the same genome position and same UMI may occur by chance.…”
Section: Detection Of Duplicate Reads (mentioning)
confidence: 80%
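
The failure mode described in this statement can be made concrete with a short calculation. The sketch below assumes that each molecule draws its UMI uniformly at random from all 4^L sequences of length L, which is an idealisation not stated in the quoted text; the function names are hypothetical.

```python
def collision_probability(n_molecules, umi_length):
    """Probability that at least two molecules at one position share a UMI.

    Assumes UMIs are drawn uniformly and independently from the 4**L
    possible sequences (a birthday-problem calculation).
    """
    n_umis = 4 ** umi_length
    if n_molecules > n_umis:
        return 1.0  # more molecules than UMIs: a collision is certain
    p_all_distinct = 1.0
    for i in range(n_molecules):
        p_all_distinct *= 1 - i / n_umis
    return 1 - p_all_distinct


def expected_distinct_umis(n_molecules, umi_length):
    """Expected number of distinct UMIs observed among n_molecules draws."""
    n_umis = 4 ** umi_length
    return n_umis * (1 - (1 - 1 / n_umis) ** n_molecules)


# Compare a short and a long UMI at a densely covered position.
for length in (4, 10):
    p = collision_probability(50, length)
    print(f"UMI length {length}: P(collision among 50 molecules) = {p:.4f}")
```

Under this assumption, counting one molecule per UMI per position undercounts whenever the expected number of distinct UMIs falls noticeably below the true number of molecules, which happens with short UMIs and high read density.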
“…Notably, barcode decomposition is not trivial—particularly for the random sequences of UMIs—as sequencing errors can alter their observed sequences. Methods have been developed to account for this by predicting which barcodes have arisen by error and which truly existed within the sample (Smith et al , ).…”
Section: State‐of‐the‐art Analysis Techniques (mentioning)
confidence: 99%
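
One way to predict which barcodes arose by error is sketched below. It is an illustrative heuristic and is not taken from the quoted text: a UMI is merged into a more abundant UMI when the two differ at exactly one base and the abundant one has at least 2n - 1 reads, where n is the read count of the rarer UMI. The threshold, the function names, and the toy counts are all assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umi_counts):
    """Map every observed UMI to the UMI it is presumed to derive from.

    Illustrative heuristic only: starting from the most abundant UMI,
    absorb any unassigned UMI one mismatch away whose count n satisfies
    parent_count >= 2 * n - 1, then continue from the absorbed UMI.
    """
    ordered = sorted(umi_counts, key=lambda u: (-umi_counts[u], u))
    assigned = {}
    for root in ordered:
        if root in assigned:
            continue
        assigned[root] = root
        stack = [root]
        while stack:
            parent = stack.pop()
            for other in ordered:
                if other in assigned or len(other) != len(parent):
                    continue
                if (hamming(parent, other) == 1
                        and umi_counts[parent] >= 2 * umi_counts[other] - 1):
                    assigned[other] = root
                    stack.append(other)
    return assigned


# Toy counts at a single position (made up for illustration).
umi_counts = {"ATAT": 100, "ATAA": 2, "ATTA": 1, "GGCC": 50}
groups = collapse_umis(umi_counts)
n_molecules = len(set(groups.values()))
```

With these toy counts, ATAA and ATTA are folded into ATAT while GGCC stays separate, so four observed UMIs are interpreted as two molecules.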