Recent advances in optogenetics have enabled simultaneous optical perturbation and optical readout of membrane potential in diverse cell types. Here, we develop and characterize a Cre-dependent transgenic Optopatch2 mouse line that we call Floxopatch. The animals expressed a blue-shifted channelrhodopsin, CheRiff, and a near infrared Archaerhodopsin-derived voltage indicator, QuasAr2, via targeted knock-in at the rosa26 locus. In Optopatch-expressing animals, we tested for overall health, genetically targeted expression, and function of the optogenetic components. In offspring of Floxopatch mice crossed with a variety of Cre driver lines, we observed spontaneous and optically evoked activity in vitro in acute brain slices and in vivo in somatosensory ganglia. Cell-type-specific expression allowed classification and characterization of neuronal subtypes based on their firing patterns. The Floxopatch mouse line is a useful tool for fast and sensitive characterization of neural activity in genetically specified cell types in intact tissue.
Summary Human induced pluripotent stem cell (iPSC)-derived neurons are an attractive substrate for modeling disease, yet the heterogeneity of these cultures presents a challenge for functional characterization by manual patch-clamp electrophysiology. Here, we describe an optimized all-optical electrophysiology, “Optopatch,” pipeline for high-throughput functional characterization of human iPSC-derived neuronal cultures. We demonstrate the method in a human iPSC-derived motor neuron (iPSC-MN) model of amyotrophic lateral sclerosis (ALS). In a comparison of iPSC-MNs with an ALS-causing mutation (SOD1 A4V) with their genome-corrected controls, the mutants showed elevated spike rates under weak or no stimulus and greater likelihood of entering depolarization block under strong optogenetic stimulus. We compared these results with numerical simulations of simple conductance-based neuronal models and with literature results in this and other iPSC-based models of ALS. Our data and simulations suggest that deficits in slowly activating potassium channels may underlie the changes in electrophysiology in the SOD1 A4V mutation.
Summary Proteins often accumulate neutral mutations that do not affect current functions but can profoundly influence future mutational possibilities and functions. Understanding such hidden potential has major implications for protein design and evolutionary forecasting, but has been limited by a lack of systematic efforts to identify potentiating mutations. Here, through the comprehensive analysis of a bacterial toxin-antitoxin system, we identified all possible single substitutions in the toxin that enable it to tolerate otherwise interface-disrupting mutations in its antitoxin. Strikingly, the majority of enabling mutations in the toxin do not contact, and promote tolerance non-specifically to, many different antitoxin mutations, despite covariation in homologs occurring primarily between specific pairs of contacting residues across the interface. In addition, the enabling mutations we identified expand future mutational paths that both maintain old toxin-antitoxin interactions and form new ones. These non-specific mutations are missed by widely used covariation and machine learning methods. Identifying such enabling mutations will be critical for ensuring continued binding of therapeutically relevant proteins, such as antibodies, aimed at evolving targets.
Human induced pluripotent stem cell (iPSC)-derived neurons are an attractive substrate for modeling disease, yet the heterogeneity of these cultures presents a challenge for functional characterization by manual patch clamp electrophysiology. Here we describe an optimized all-optical electrophysiology, "Optopatch", pipeline for high-throughput functional characterization of human iPSC-derived neuronal cultures. We demonstrate the method in a human iPSC-derived motor neuron model of ALS. In a comparison of neurons with an ALS-causing mutation (SOD1 A4V) with their genome-corrected controls, the mutants showed elevated spike rates under weak or no stimulus, and greater likelihood of entering depolarization block under strong optogenetic stimulus. We compared these results to numerical simulations of simple conductance-based neuronal models and to literature results in this and other iPSC-based models of ALS. Our data and simulations suggest that deficits in slowly activating potassium channels may underlie the changes in electrophysiology in the SOD1 A4V mutation. Scott Linderman provided an optimization algorithm for fitting piecewise linear functions. Vicente Parot helped with PCA-ICA segmentation. Samouil Farhi performed patch clamp electrophysiology measurements.
Generative probabilistic modeling of biological sequences has widespread existing and potential use across biology and biomedicine, particularly given advances in high-throughput sequencing, synthesis and editing. However, we still lack methods with nucleotide resolution that are tractable at the scale of whole genomes and that can achieve high predictive accuracy either in theory or practice. In this article we propose a new generative sequence model, the Bayesian embedded autoregressive (BEAR) model, which uses a parametric autoregressive model to specify a conjugate prior over a nonparametric Bayesian Markov model. We explore, theoretically and empirically, applications of BEAR models to a variety of statistical problems including density estimation, robust parameter estimation, goodness-of-fit tests, and two-sample tests. We prove rigorous asymptotic consistency results including nonparametric posterior concentration rates. We scale inference in BEAR models to datasets containing tens of billions of nucleotides. On genomic, transcriptomic, and metagenomic sequence data we show that BEAR models provide large increases in predictive performance as compared to parametric autoregressive models, among other results. BEAR models offer a flexible and scalable framework, with theoretical guarantees, for building and critiquing generative models at the whole genome scale.
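The core idea of using an autoregressive model to set a conjugate prior over Markov transition probabilities can be illustrated with a small sketch. It assumes a Dirichlet prior on each transition distribution, with concentration `h` times the autoregressive prediction; the abstract does not spell out the prior, so this form is an assumption. The function names, the DNA alphabet, and `uniform_ar` (a hypothetical placeholder for a trained parametric model) are all illustrative and not taken from the paper.

```python
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: sequences contain only these symbols


def count_transitions(seqs, lag):
    """Count how often each symbol follows each length-`lag` context."""
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    for seq in seqs:
        for i in range(lag, len(seq)):
            counts[seq[i - lag:i]][seq[i]] += 1
    return counts


def uniform_ar(context):
    """Hypothetical stand-in for a trained parametric autoregressive model."""
    return {b: 1.0 / len(ALPHABET) for b in ALPHABET}


def posterior_predictive(counts, context, ar_model, h):
    """Posterior mean of the next-symbol distribution under a Dirichlet prior
    with concentration h * ar_model(context) (assumed form of the prior)."""
    prior = ar_model(context)
    observed = counts.get(context, dict.fromkeys(ALPHABET, 0))
    total = sum(observed.values()) + h
    return {b: (observed[b] + h * prior[b]) / total for b in ALPHABET}


counts = count_transitions(["ACGTACGT"], lag=2)
print(posterior_predictive(counts, "AC", uniform_ar, h=4.0))
```

In this sketch, contexts with many observations are dominated by the empirical counts, while rarely seen contexts fall back on the autoregressive prediction; `h` controls how quickly the counts take over.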
Generative probabilistic models of biological sequences have widespread existing and potential applications in analyzing, predicting and designing proteins, RNA and genomes. To test the predictions of such a model experimentally, the standard approach is to draw samples, and then synthesize each sample individually in the laboratory. However, often orders of magnitude more sequences can be experimentally assayed than can affordably be synthesized individually. In this article, we propose instead to use stochastic synthesis methods, such as mixed nucleotides or trimers. We describe a black-box algorithm for optimizing stochastic synthesis protocols to produce approximate samples from any target generative model. We establish theoretical bounds on the method’s performance, and validate it in simulation using held-out sequence-to-function predictors trained on real experimental data. We show that using optimized stochastic synthesis protocols in place of individual synthesis can increase the number of hits in protein engineering efforts by orders of magnitude, e.g. from zero to a thousand.
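To make the notion of a mixed-nucleotide protocol concrete, the sketch below treats a protocol as one nucleotide mixture per position, sampled independently at each site. It is not the black-box optimization algorithm the abstract describes; it is a simple baseline that sets each mixture to the per-position frequencies of samples drawn from the target model. The function names and the toy input sequences are hypothetical.

```python
import random
from collections import Counter

ALPHABET = "ACGT"  # assumption: equal-length DNA sequences over these symbols


def fit_mixed_nucleotides(target_samples):
    """Per-position nucleotide frequencies of equal-length target samples."""
    length = len(target_samples[0])
    protocol = []
    for pos in range(length):
        column = Counter(seq[pos] for seq in target_samples)
        protocol.append({b: column[b] / len(target_samples) for b in ALPHABET})
    return protocol


def synthesize(protocol, n, rng):
    """Draw n sequences, choosing each position independently from its mixture."""
    return [
        "".join(
            rng.choices(ALPHABET, weights=[mix[b] for b in ALPHABET])[0]
            for mix in protocol
        )
        for _ in range(n)
    ]


samples = ["ACGT", "ACGA", "TCGA", "TCGT"]  # toy stand-in for model samples
protocol = fit_mixed_nucleotides(samples)
library = synthesize(protocol, 20, random.Random(0))
print(library[:5])
```

Because every position is drawn independently, a protocol of this kind cannot reproduce correlations between sites in the target model, which is why the match to the target is only approximate.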
Understanding the consequences of mutation for molecular fitness and function is a fundamental problem in biology. Recently, generative probabilistic models have emerged as a powerful tool for estimating fitness from evolutionary sequence data, with accuracy sufficient to predict both laboratory measurements of function and disease risk in humans, and to design novel functional proteins. Existing techniques rest on an assumed relationship between density estimation and fitness estimation, a relationship that we interrogate in this article. We prove that fitness is not identifiable from observational sequence data alone, placing fundamental limits on our ability to disentangle fitness landscapes from phylogenetic history. We show on real datasets that perfect density estimation in the limit of infinite data would, with high confidence, result in poor fitness estimation; current models perform accurate fitness estimation because of, not despite, misspecification. Our results challenge the conventional wisdom that bigger models trained on bigger datasets will inevitably lead to better fitness estimation, and suggest novel estimation strategies going forward.
Large-scale sequencing has revealed extraordinary diversity among biological sequences, produced over the course of evolution and within the lifetime of individual organisms. Existing methods for building statistical models of sequences often preprocess the data using multiple sequence alignment, an unreliable approach for many genetic elements (antibodies, disordered proteins, etc.) that is subject to fundamental statistical pathologies. Here we introduce a structured emission distribution (the MuE distribution) that accounts for mutational variability (substitutions and indels) and use it to construct generative and predictive hierarchical Bayesian models (H-MuE models). Our framework enables the application of arbitrary continuous-space vector models (e.g. linear regression, factor models, image neural-networks) to unaligned sequence data. Theoretically, we show that the MuE generalizes classic probabilistic alignment models. Empirically, we show that H-MuE models can infer latent representations and features for immune repertoires, predict functional unobserved members of disordered protein families, and forecast the future evolution of pathogens.