The hidden Markov model (HMM) has been a workhorse of single molecule data analysis and is now commonly used as a standalone tool in time series analysis or in conjunction with other analysis methods such as tracking. Here we provide a conceptual introduction to an important generalization of the HMM which is poised to have a deep impact across Biophysics: the infinite hidden Markov model (iHMM). As a modeling tool, iHMMs can analyze sequential data without setting a specific number of states a priori, as is required for the traditional (finite) HMM. While the current literature on the iHMM is primarily intended for audiences in Statistics, the idea is powerful and the iHMM's breadth of applicability outside Machine Learning and Data Science warrants a careful exposition. Here we explain the key ideas underlying the iHMM with a special emphasis on implementation and describe code that we are making freely available. In a companion article, we provide an important extension of the iHMM to accommodate complications such as drift.
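The idea of not fixing the number of states in advance can be illustrated with the stick-breaking construction of the Dirichlet process, the kind of prior on which iHMM formulations are commonly built. The following is only a minimal sketch under stated assumptions, not the code the abstract refers to: the function name `stick_breaking_weights`, the concentration parameter `alpha`, and the truncation level `n_sticks` are all hypothetical choices made for illustration.

```python
import random


def stick_breaking_weights(alpha, n_sticks, seed=0):
    """Draw truncated stick-breaking weights (hypothetical helper).

    Each step breaks off a Beta(1, alpha) fraction of the stick that is
    still left, so the weights decay and only a few states carry
    appreciable prior mass even though n_sticks can be made large.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any state
    for _ in range(n_sticks):
        frac = rng.betavariate(1.0, alpha)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    return weights, remaining


# Assumed illustrative values: alpha = 2.0 and a truncation at 50 states.
weights, leftover = stick_breaking_weights(alpha=2.0, n_sticks=50)
effective = sum(1 for w in weights if w > 0.01)
print(f"states with prior weight above 1%: {effective}")
print(f"mass left beyond the truncation: {leftover:.2e}")
```

Smaller values of `alpha` concentrate the prior mass on fewer states, while larger values spread it over more; in a full iHMM, a prior of this kind would be placed on each row of the transition matrix and updated by the data.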
We develop a Bayesian nonparametric framework to analyze single molecule FRET (smFRET) data. This framework, a variation on infinite hidden Markov models, goes beyond traditional hidden Markov analysis, which already treats photon shot noise, in three critical ways: (1) it learns the number of molecular states present in an smFRET time trace (a hallmark of nonparametric approaches), (2) it accounts, simultaneously and self-consistently, for photophysical features of donor and acceptor fluorophores (blinking kinetics, spectral cross-talk, detector quantum efficiency), and (3) it treats background photons. Point 2 is essential in reducing the tendency of nonparametric approaches to overinterpret noisy single molecule time traces, and thereby in yielding estimates of states and transition kinetics that are robust to photophysical artifacts. As a result, with the proposed framework, we obtain accurate estimates of single molecule properties even when the supplied traces are excessively noisy, subject to photoartifacts, and of short duration. We validate our method using synthetic data sets and demonstrate its applicability to real data sets from single molecule experiments on Holliday junctions labeled with conventional fluorescent dyes.
We have developed a highly detailed mathematical model of solute transport in the renal medulla of the rat kidney to study the impact of the structured organization of nephrons and vessels revealed in anatomic studies. The model represents the arrangement of tubules around a vascular bundle in the outer medulla and around a collecting duct cluster in the upper inner medulla. Model simulations yield marked gradients in intrabundle and interbundle interstitial fluid oxygen tension (PO2), NaCl concentration, and osmolality in the outer medulla, owing to the vigorous active reabsorption of NaCl by the thick ascending limbs. In the inner medulla, where the thin ascending limbs do not mediate significant active NaCl transport, interstitial fluid composition becomes much more homogeneous with respect to NaCl, urea, and osmolality. Nonetheless, a substantial PO2 gradient remains, owing to the relatively high oxygen demand of the inner medullary collecting ducts. Perhaps more importantly, the model predicts that in the absence of the three-dimensional medullary architecture, oxygen delivery to the inner medulla would drastically decrease, with the terminal inner medulla nearly completely deprived of oxygen. Thus model results suggest that the functional role of the three-dimensional medullary architecture may be to preserve oxygen delivery to the papilla. Additionally, a simulation that represents low medullary blood flow suggests that the separation of thick limbs from the vascular bundles substantially increases the risk of hypoxic injury to these segments. When nephrons and vessels are more homogeneously distributed, luminal PO2 in the thick ascending limb of superficial nephrons increases by 66% in the inner stripe.
Furthermore, simulations predict that owing to the Bohr effect, the presumed greater acidity of blood in the interbundle regions, where thick ascending limbs are located, relative to that in the vascular bundles, facilitates the delivery of O2 to support the high metabolic requirements of the thick limbs and raises NaCl reabsorption.
Bayesian nonparametric methods have recently transformed emerging areas within data science. One such promising method, the infinite hidden Markov model (iHMM), generalizes the HMM that itself has become a workhorse in single molecule data analysis. The iHMM goes beyond the HMM by self-consistently learning all parameters learned by the HMM in addition to learning the number of states without recourse to any model selection steps. Despite the iHMM's generality, simple features common to single molecule time traces, such as drift, lead it to overinterpret the drift and to introduce artifact states. Here we present an adaptation of the iHMM that can treat data with drift originating from one or many traces (e.g., Förster resonance energy transfer). Our fully Bayesian method couples the iHMM to a continuous control process (drift) self-consistently learned while learning all other quantities determined by the iHMM (including state numbers). A key advantage of this method is that all traces, regardless of drift or states visited across traces, may now be treated on an equal footing, thereby eliminating user-dependent trace selection (based on drift levels), preprocessing to remove drift, and postprocessing model selection based on state number.
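To make the drift problem concrete, the sketch below simulates a trace in which a slowly varying offset is added to the signal of a two-state Markov chain. It is a toy generative model under stated assumptions, not the authors' method: the function name `simulate_trace_with_drift`, the Gaussian random-walk form of the drift, the Gaussian noise, and every parameter value are hypothetical choices made for illustration.

```python
import random


def simulate_trace_with_drift(n_steps, means, stay_prob, drift_step,
                              noise_sd, seed=0):
    """Simulate observation = state mean + drift + noise (toy model).

    The drift is assumed here to be a Gaussian random walk; the abstract
    only says that drift is a continuous process learned with the states.
    """
    rng = random.Random(seed)
    state = 0
    drift = 0.0
    states, observations = [], []
    for _ in range(n_steps):
        # Markov switching: stay with probability stay_prob, otherwise
        # jump to one of the other states chosen uniformly at random.
        if rng.random() > stay_prob:
            state = rng.choice([s for s in range(len(means)) if s != state])
        # The drift accumulates slowly and is shared by all states.
        drift += rng.gauss(0.0, drift_step)
        states.append(state)
        observations.append(means[state] + drift + rng.gauss(0.0, noise_sd))
    return states, observations


# Assumed illustrative values for a two-state trace of 500 time points.
states, obs = simulate_trace_with_drift(
    n_steps=500, means=[0.0, 1.0], stay_prob=0.98,
    drift_step=0.02, noise_sd=0.1,
)
print(f"observed range: {min(obs):.2f} to {max(obs):.2f}")
```

Because the offset moves the apparent level of each state over time, a model that ignores drift can explain the same trace only by adding states, which is the artifact the abstract describes.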
Single-molecule localization microscopy has the ability to measure spatial proximity between individual molecules with tens of nanometers precision. Extracting meaningful biological results, however, requires fully characterizing the distribution of molecular behaviors, which in turn necessitates analyzing large numbers of individual measurements. Making large numbers of replicate measurements in a single imaging session has been made possible in recent years by large area detectors that afford an ultrawide field-of-view as well as fast frame rates. A remaining barrier to ultrawide-field imaging is that optical aberrations become pronounced when imaging far away from the central optical axis, which can compromise the precision and accuracy of point-spread-function (PSF) fitting across the field-of-view. Here, we present a computational phase retrieval routine based on vectorial PSF models to account for the spatially-variant aberrations in two color channels of a 3D single-molecule localization microscope. By computationally correcting the aberrations during data post-processing, we are able to localize emitters in an ultrawide field-of-view with improved precision and accuracy compared to approaches based on analytical PSF models. The use of a spatially-variant PSF model enables accurate emitter localization in x, y and z over the entire field-of-view, so that the reconstructed super-resolution images and single-molecule trajectories accurately reproduce the relative spatial arrangement among all localized emitters.