2020
DOI: 10.1186/s13059-020-02105-0

Bovine breed-specific augmented reference graphs facilitate accurate sequence read mapping and unbiased variant discovery

Abstract: Background: The current bovine genomic reference sequence was assembled from a Hereford cow. The resulting linear assembly lacks diversity because it does not contain allelic variation, a drawback of linear references that causes reference allele bias. High nucleotide diversity and the separation of individuals by hundreds of breeds make cattle ideally suited to investigate the optimal composition of variation-aware references. Results: We augment the bovine linear refere…

Citations: cited by 42 publications (29 citation statements)
References: 87 publications
“…We call this a “rough” upper bound because, while the personalized reference is ideal in that it contains the correct variants, the accuracy of alignment is also affected by tool-specific heuristics. A true upper bound would be hard to obtain, so we settle for the rough upper bound provided by the personalized genome, as in previous work [ 17 , 27 ].…”
Section: Results (mentioning)
confidence: 99%
“…For comparison, we mapped the same reads to GRCh37 with BWA-MEM. Replicating a previous approach 23 , we used and 24 to call variants, and filtered to high-confidence variants (root mean square read mapping quality >= 40 and depth >= 25) that were called as heterozygous for all mappers. For these variants, we found the fraction of reads supporting alternate vs. reference alleles at each indel length (Figure 5(A)).…”
Section: Results (mentioning)
confidence: 99%
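The filtering and allele-fraction step described in the statement above can be illustrated with a minimal Python sketch. It is not the citing authors' pipeline: the record layout (`ref`, `alt`, `mq`, `ref_reads`, `alt_reads`), the function name, and the choice to compute depth as the sum of reference- and alternate-supporting reads are all assumptions made for illustration. Only the two thresholds (mapping quality >= 40 and depth >= 25) come from the quoted text.

```python
from collections import defaultdict


def alt_fraction_by_indel_length(variants, min_mq=40.0, min_depth=25):
    """Return {indel_length: fraction of reads supporting the alternate allele}.

    Each variant is a dict with hypothetical keys (an assumed layout):
      ref, alt    allele strings
      mq          root mean square read mapping quality at the site
      ref_reads   number of reads supporting the reference allele
      alt_reads   number of reads supporting the alternate allele
    """
    ref_total = defaultdict(int)
    alt_total = defaultdict(int)
    for v in variants:
        # Assumption: depth is the sum of reference and alternate read counts.
        depth = v["ref_reads"] + v["alt_reads"]
        if v["mq"] < min_mq or depth < min_depth:
            continue  # not a high-confidence variant
        # Positive length = insertion, negative = deletion, 0 = substitution.
        length = len(v["alt"]) - len(v["ref"])
        ref_total[length] += v["ref_reads"]
        alt_total[length] += v["alt_reads"]
    # Pool read counts per indel length, then take the alternate fraction.
    return {
        length: alt_total[length] / (ref_total[length] + alt_total[length])
        for length in ref_total
    }
```

For heterozygous variants an unbiased mapper would be expected to give a fraction near 0.5 at every length; values that fall below 0.5 as indels get longer would be consistent with reference allele bias.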
“…With the increasing number of livestock genome assemblies and versions, researchers might adopt pangenome references that catalogue structural diversity within a species (e.g. [ 126 , 127 ]) and representations of graph genomes that store such pan-genome information in a single data structure [ 128 ]. Graph genomes allow bioinformatic methods (e.g.…”
Section: Main Text (mentioning)
confidence: 99%