Divyanshi Srivastava scite author profile

While Hox genes encode for conserved transcription factors (TFs), they are further divided into anterior, central, and posterior groups based on their DNA-binding domain similarity. The posterior Hox group expanded in the deuterostome clade and patterns caudal and distal structures. We aim to address how similar HOX TFs diverge to induce different positional identities. We studied HOX TF DNA-binding and regulatory activity during an in vitro motor neuron differentiation system that recapitulates embryonic development. We find diversity in the genomic binding profiles of different HOX TFs, even among the posterior group paralogs that share similar DNA binding domains. These differences in genomic binding are explained by differing abilities to bind to previously inaccessible sites. For example, the posterior group HOXC9 has a greater ability to bind occluded sites than the posterior HOXC10, producing different binding patterns and driving differential gene expression programs. From these results, we propose that the differential abilities of posterior HOX TFs to bind to previously inaccessible chromatin drive patterning diversification.

Domain-adaptive neural networks improve cross-species prediction of transcription factor binding

Cochran

Srivastava

Shrikumar

et al. 2022

Genome Res.

The intrinsic DNA sequence preferences and cell-type specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell-type specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species-specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results demonstrate that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.

Differential abilities to engage inaccessible chromatin diversify vertebrate HOX binding patterns

Bulajić

Srivastava

et al. 2019

Preprint

While Hox genes encode for conserved transcription factors (TFs), they are further divided into anterior, central, and posterior groups based on their DNA-binding domain similarity. The posterior group expanded in the deuterostome clade and patterns caudal and distal vertebrate structures such as the spinal neuronal diversity required for motor function. Our data revealed that limb-level patterning central Hoxc6, Hoxc8 and posterior Hoxc10 have a reduced ability to access occluded sites compared to other tested Hox TFs. Thus, their genomic binding relies more on cell-specific chromatin accessibility. Although posterior Hoxc9, Hoxc10 and Hoxc13 induce different fates, they share motif preference. However, Hoxc9 and Hoxc13 have a unique ability to access sites occluded by chromatin, resulting in divergent genomic binding patterns. From these results, we propose that the differential abilities of posterior Hox TFs to bind to previously inaccessible chromatin is the predominant force driving their patterning diversification.

An interpretable bimodal neural network characterizes the sequence and preexisting chromatin predictors of induced transcription factor binding

et al. 2021

Background: Transcription factor (TF) binding specificity is determined via a complex interplay between the transcription factor’s DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with transcription factor binding in a given cell type have been well characterized. For instance, the binding sites for a majority of transcription factors display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding activities of the transcription factor itself and thus provide limited insight into how genome-wide TF-DNA binding patterns became established in the first place. To understand the determinants of transcription factor binding specificity, we therefore need to examine how newly activated transcription factors interact with sequence and preexisting chromatin landscapes. Results: Here, we investigate the sequence and preexisting chromatin predictors of TF-DNA binding by examining the genome-wide occupancy of transcription factors that have been induced in well-characterized chromatin environments. We develop Bichrom, a bimodal neural network that jointly models sequence and preexisting chromatin data to interpret the genome-wide binding patterns of induced transcription factors. We find that the preexisting chromatin landscape is a differential global predictor of TF-DNA binding; incorporating preexisting chromatin features improves our ability to explain the binding specificity of some transcription factors substantially, but not others. Furthermore, by analyzing site-level predictors, we show that transcription factor binding in previously inaccessible chromatin tends to correspond to the presence of more favorable cognate DNA sequences. Conclusions: Bichrom thus provides a framework for modeling, interpreting, and visualizing the joint sequence and chromatin landscapes that determine TF-DNA binding dynamics.

Domain adaptive neural networks improve cross-species prediction of transcription factor binding

Cochran

Srivastava

Shrikumar

et al. 2021

Preprint

An interpretable bimodal neural network characterizes the sequence and preexisting chromatin predictors of induced TF binding

Srivastava

Aydın

Mazzoni

et al. 2019

Preprint

Transcription factor (TF) binding specificity is determined via a complex interplay between the TF’s DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with TF binding in a given cell type have been well characterized. For instance, the binding sites for a majority of TFs display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding activities of the TF itself, and thus provide limited insight into how genome-wide TF-DNA binding patterns became established in the first place. To understand the determinants of TF binding specificity, we therefore need to examine how newly activated TFs interact with sequence and preexisting chromatin landscapes. Here, we investigate the sequence and preexisting chromatin predictors of TF-DNA binding by examining the genome-wide occupancy of TFs that have been induced in well-characterized chromatin environments. We develop Bichrom, a bimodal neural network that jointly models sequence and preexisting chromatin data to interpret the genome-wide binding patterns of induced TFs. We find that the preexisting chromatin landscape is a differential global predictor of TF-DNA binding; incorporating preexisting chromatin features improves our ability to explain the binding specificity of some TFs substantially, but not others. Furthermore, by analyzing site-level predictors, we show that TF binding in previously inaccessible chromatin tends to correspond to the presence of more favorable cognate DNA sequences. Bichrom thus provides a framework for modeling, interpreting, and visualizing the joint sequence and chromatin landscapes that determine TF-DNA binding dynamics.

GenoPipe: identifying the genotype of origin within (epi)genomic datasets

Lang

Srivastava

Pugh

et al. 2023

Preprint

Confidence in experimental results is critical for discovery. As the scale of data generation in genomics has grown exponentially, experimental error has likely kept pace despite the best efforts of many laboratories. Technical mistakes can and do occur at nearly every stage of a genomics assay (e.g., cell line contamination, reagent swapping, tube mislabelling) and are often difficult to identify post-execution. However, the DNA sequenced in genomic experiments contains certain markers (e.g., indels) that can often be ascertained forensically from experimental datasets. We developed the Genotype validation Pipeline (GenoPipe), a suite of heuristic tools that operate together directly on raw and aligned sequencing data from individual high-throughput sequencing experiments to characterize the underlying genome of the source material. We demonstrate how GenoPipe validates and rescues erroneously annotated experiments by identifying unique markers inherent to an organism's genome (i.e., epitope insertions, gene deletions, and SNPs).
