2022
DOI: 10.1002/ggn2.202100065

Recovering High‐Quality Host Genomes from Gut Metagenomic Data through Genotype Imputation

Abstract: Metagenomic datasets of host‐associated microbial communities often contain host DNA that is usually discarded because the amount of data is too low for accurate host genetic analyses. However, genotype imputation can be employed to reconstruct host genotypes if a reference panel is available. Here, the performance of a two‐step strategy is tested to impute genotypes from four types of reference panels built using different strategies to low‐depth host genome data (≈2× coverage) recovered from intestinal sampl…

citations
Cited by 7 publications
(6 citation statements)
references
References 91 publications
“…However this is not necessarily detrimental. If there is combined interest in the host genome and microbiota, sequencing the whole gut or the squeezed gut may be a good option to retrieve whole host genomes along with the microbial fraction in one sequencing run (Marcos et al, 2022).…”
Section: Discussion (mentioning)
confidence: 99%
“…Although identical library preparation procedures can be used for HG and MG data generation,[23] HT and MT require different strategies for avoiding the domination of ribosomal RNA (rRNA) over messenger RNA (mRNA) in the resulting data. In HT and MT it is often observed that over 90% of sequences belong to rRNA transcripts unless depletion strategies are employed.…”
Section: Laboratory Sample Processing (mentioning)
confidence: 99%
“…Host genomes often contain thousands of genes, with hundreds of thousands of nucleotide variants.[23] It is also common to generate catalogs of hundreds of bacterial genomes, with millions of genes, as well as metabolomic profiles containing thousands of metabolites. One of the first considerations is to filter these data, to reduce the information to only that required for answering the biological question of interest.…”
Section: Data Filtering Imputation and Distillation (mentioning)
confidence: 99%
“…Significant advantages of A. mellifera for such a project include its small genome size (225.25 Mb; Amel_HAv3.1) and the availability of haploid drones whose alleles are naturally phased. The resource will be useful for a variety of applications, including for example: imputation of low-depth sequence or SNP array data to enable cost-efficient large-scale studies; accurate phasing of diploid genomes to facilitate haplotype-based analyses such as XP-EHH;[33] recovery of host genomic data from suboptimal samples (such as bee hive products[34], metagenomic[35], historic or ancient DNA[14]); identification of ancestry informative markers or tag-SNPs; validation of reduced SNP panels; and to serve as a comprehensive reference panel to support studies on population and evolutionary genetics.…”
Section: Background and Summary (mentioning)
confidence: 99%