DJ Darwin Bandoy scite author profile

DJ Darwin Bandoy

4 Publications

48 Citation Statements Received

167 Citation Statements Given

Affiliations

University of the Philippines Los Baños, University of California, Davis

Publications

Biological Machine Learning Combined with Campylobacter Population Genomics Reveals Virulence Gene Allelic Variants Cause Disease

Bandoy, Weimer (2020). Microorganisms

Highly dimensional data generated from bacterial whole-genome sequencing is providing an unprecedented scale of information that requires an appropriate statistical analysis framework to infer biological function from populations of genomes. The application of genome-wide association study (GWAS) methods is an appropriate framework for bacterial population genome analysis that yields a list of candidate genes associated with a phenotype, but it provides an unranked measure of importance. Here, we validated a novel framework to define infection mechanism using the combination of GWAS, machine learning, and bacterial population genomics that ranked allelic variants that accurately identified disease. This approach parsed a dataset of 1.2 million single nucleotide polymorphisms (SNPs) and indels that resulted in an importance ranked list of associated alleles of porA in Campylobacter jejuni using spatiotemporal analysis over 30 years. We validated this approach using previously proven laboratory experimental alleles from an in vivo guinea pig abortion model. This framework, termed μPathML, defined intestinal and extraintestinal groups that have differential allelic porA variants that cause abortion. Divergent variants containing indels that defeated automated annotation were rescued using biological context and knowledge that resulted in defining rare, divergent variants that were maintained in the population over two continents and 30 years. This study defines the capability of machine learning coupled with GWAS and population genomics to simultaneously identify and rank alleles to define their role in infectious disease mechanisms.

Phylogenetic and Biogeographic Patterns of Vibrio parahaemolyticus Strains from North America Inferred from Whole-Genome Sequence Data

Miller, Weimer, Timme, et al. (2021). Appl Environ Microbiol

Vibrio parahaemolyticus is the most common cause of seafood-borne illness reported in the United States. Draft genomes of 132 North American clinical and oyster V. parahaemolyticus isolates were sequenced to investigate their phylogenetic and biogeographic relationships. The majority of oyster isolate sequence types (STs) were from a single harvest location; however, four were identified from multiple locations. There was population structure along the Gulf and Atlantic Coasts of North America, with what seemed to be a hub of genetic variability along the Gulf Coast with some of the same STs occurring along the Atlantic Coast and one shared between the coastal waters of the Gulf and those of Washington state. Phylogenetic analyses found nine well-supported clades. Two clades were composed of isolates from both clinical and oyster sources. Four were composed entirely from clinical sources and three entirely from oyster sources. Each single source clade consisted of one ST. Some human isolates lack tdh and trh and some T3SS genes, which are established virulence genes of V. parahaemolyticus. Thus, these genes are not essential for pathogenicity. However, isolates in the monophyletic groups from clinical sources were enriched in several categories of genes when compared to those from monophyletic groups of oyster isolates. These functional categories include: cell signaling, transport, and metabolism. Identification of genes in these functional categories provides a basis for future in-depth pathogenicity investigations of V. parahaemolyticus. IMPORTANCE Vibrio parahaemolyticus is the most common cause of seafood-borne illness reported in the United States and is frequently associated with shellfish consumption. This study contributes to our knowledge of the biogeography and functional genomics of this species around North America. 
STs shared between the Gulf Coast, the Atlantic seaboard, and Pacific waters suggest possible transport via oceanic currents or large shipping vessels. STs frequently isolated from humans, but rarely if ever from the environment, are likely more competitive in the human gut than other STs. This could be due to additional functional capabilities in areas like cell signaling, transport, and metabolism, which may give these isolates an advantage in novel nutrient-replete environments like the human gut.

Pangenome guided pharmacophore modelling of enterohemorrhagic Escherichia coli sdiA

Bandoy (2019). F1000Res

Enterohemorrhagic Escherichia coli (EHEC) continues to be a significant public health risk. With the onset of next generation sequencing, whole genome sequences are a potential resource for predictive modelling of the different regulatory mechanisms of pathogens, particularly quorum sensing. We used a pangenome approach to determine EHEC genome clustering, identified the synonymous and nonsynonymous mutations across the EHEC sdiA, and modelled the associated amino acid changes. Across the EHEC population, nonsynonymous variants are notably absent in the ligand binding site for quorum sensing, indicating that the population-wide conservation of the sdiA ligand site can be targeted for potential prophylactic purposes. Applying pathotype-wide pangenomics as a guide for determining the evolution of pharmacophore sites is a potential approach in drug discovery.
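The distinction between synonymous and nonsynonymous mutations mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the standard genetic code and single aligned codons as input; the function name `classify_substitution` is hypothetical, and this is not the pipeline used in the paper.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as identical, synonymous, or nonsynonymous.

    Hypothetical helper for illustration: both codons are assumed to be
    in-frame DNA triplets drawn from an existing alignment.
    """
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon == alt_codon:
        return "identical"
    # Same encoded amino acid means the protein sequence is unchanged.
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


if __name__ == "__main__":
    print(classify_substitution("CTT", "CTC"))  # both encode leucine
    print(classify_substitution("GAA", "GTA"))  # glutamate to valine
```

A site where only synonymous changes appear across a population, as the abstract reports for the sdiA ligand binding site, keeps the same amino acids in every genome.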

Pandemic dynamics of COVID-19 using epidemic stage, instantaneous reproductive number and pathogen genome identity (GENI) score: modeling molecular epidemiology

Bandoy, Weimer (2020). Preprint

Background: Global spread of COVID-19 created an unprecedented infectious disease crisis that progressed to a pandemic with >180,000 cases in >100 countries. Reproductive number (R) is an outbreak metric estimating the transmission of a pathogen. Initial R values were published based on the early outbreak in China, with a limited number of cases with whole genome sequencing. Initial comparisons failed to show a direct relationship between viral genomic diversity and epidemic severity for SARS-CoV-2. Methods: Each country's COVID-19 outbreak status was classified according to epicurve stage (index, takeoff, exponential, decline). Instantaneous R estimates (Wallinga and Teunis method) with a short and a standard serial interval examined asymptomatic spread. Whole genome sequences were used to quantify the pathogen genome identity (GENI) score, which was used to estimate transmission time and epicurve stage. Transmission time was estimated based on an evolutionary rate of 2 mutations/month. Findings: The country-specific R revealed variable infection dynamics between and within outbreak stages. Outside China, R estimates revealed propagating epidemics poised to move into the takeoff and exponential stages. Population density and local temperatures had a variable relationship to the outbreaks. GENI scores differentiated countries in the index stage with cryptic transmission. Integrating incidence data with genome variation showed that cases increased directly with genome variation. Interpretation: R was dynamic for each country and outbreak stage. Integrating the outbreak dynamic, dynamic R, and genome variation found a direct association between cases and genome variation. Synergistically, GENI provides an evidence-based transmission metric that can be determined by sequencing the virus from each case.
We calculated an instantaneous country-specific R at different stages of outbreaks and formulated a novel metric for infection dynamics using viral genome sequences to capture gaps in untraceable transmission. Integrating epidemiology with genome sequencing allows evidence-based dynamic disease outbreak tracking with predictive evidence.
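The transmission-time arithmetic in the Methods can be sketched in Python as follows. Only the rate of 2 mutations/month comes from the abstract; counting mismatches against an aligned reference sequence is an assumption made here for illustration, and the function names are hypothetical. This is not the authors' GENI score definition.

```python
MUTATIONS_PER_MONTH = 2.0  # evolutionary rate stated in the abstract


def count_mutations(reference: str, sample: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences.

    Assumption: mutations are approximated as simple mismatches; indels and
    ambiguous bases are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for r, s in zip(reference, sample) if r != s)


def estimated_transmission_months(
    mutations: int, rate: float = MUTATIONS_PER_MONTH
) -> float:
    """Convert a mutation count into elapsed months at a constant rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return mutations / rate


if __name__ == "__main__":
    # Toy sequences, not real SARS-CoV-2 data.
    n = count_mutations("ACGTACGT", "ACGAACGA")
    print(n, estimated_transmission_months(n))
```

Under this constant-rate assumption, a case genome that differs from the reference by three mutations corresponds to about 1.5 months of circulation.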
