2020
DOI: 10.3390/microorganisms8040549
Biological Machine Learning Combined with Campylobacter Population Genomics Reveals Virulence Gene Allelic Variants Cause Disease

Abstract: Highly dimensional data generated from bacterial whole-genome sequencing is providing an unprecedented scale of information that requires an appropriate statistical analysis framework to infer biological function from populations of genomes. The application of genome-wide association study (GWAS) methods is an appropriate framework for bacterial population genome analysis that yields a list of candidate genes associated with a phenotype, but it provides an unranked measure of importance. Here, we validated a n…

Citations: cited by 16 publications (19 citation statements)
References: 31 publications (47 reference statements)
“…In addition to taxonomic identification, genetic diversity, and virulence traits can be infer for disease potential. Additionally, allelic variants of core genes can be determined for disease presentation [15]. This is particularly useful in genomically diverse organisms with open pan-genomes and are common in the microbiome [15].…”
Section: Introduction
mentioning
confidence: 99%
“…Additionally, allelic variants of core genes can be determined for disease presentation [15]. This is particularly useful in genomically diverse organisms with open pan-genomes and are common in the microbiome [15]. Previously, the genome sequence diversity of Hungatella was recognized to be larger than initially contemplated and led to the suggestion that the type strain is not representative of this species, as defined by the Human Microbiome Project [16].…”
Section: Introduction
mentioning
confidence: 99%
“…For example, the resulting gene presence and absence information enables GWAS analysis in bacteria, where pangenome wide association studies (Pan-GWAS) with large populations of bacterial genomes serve as a method for the identification of genes associated with target phenotypes [30]. This Pan-GWAS approach was recently used to examine genetic deletions and successfully verified that only specific alleles of a core genome toxin gene (porA) caused abortion in livestock with Campylobacter jejuni infections with very high accuracy based on the population genomics approach [31]. While this strategy identifies genes of interest using widely accepted methods in statistical genetics, the implementation of machine learning algorithms to generate predictive models and identify genes with a ranked order of variable importance (VI) for targeted phenotypes is increasingly useful and provides an analytical method without a priori knowledge of the genes responsible for a trait [32].…”
Section: Introduction
mentioning
confidence: 99%
“…As with all modern science, the use of in silico bioinformatics approaches to study relevant research questions goes hand in hand with laboratory science. Bandoy and Weimer [19] use machine learning combined with Campylobacter population genomics to reveal virulence gene allelic variants that cause disease. The authors validated a novel framework to define infection mechanism using a combination of a GWAS, machine learning and bacterial population genomics that ranked allelic variants to identify disease.…”
mentioning
confidence: 99%