Antonio Blanca scite author profile

We consider the problem of sampling from the Potts model on random regular graphs. It is conjectured that sampling is possible when the temperature of the model is in the so-called uniqueness regime of the regular tree, but positive algorithmic results have been for the most part elusive. In this paper, for all integers q ≥ 3 and ∆ ≥ 3, we develop algorithms that produce samples within error o(1) from the q-state Potts model on random ∆-regular graphs, whenever the temperature is in uniqueness, for both the ferromagnetic and antiferromagnetic cases. The algorithm for the antiferromagnetic Potts model is based on iteratively adding the edges of the graph and resampling a bichromatic class that contains the endpoints of the newly added edge. Key to the algorithm is how to perform the resampling step efficiently since bichromatic classes can potentially induce linear-sized components. To this end, we exploit the tree uniqueness to show that the average growth of bichromatic components is typically small, which allows us to use correlation decay algorithms for the resampling step. While the precise uniqueness threshold on the tree is not known for general values of q and ∆ in the antiferromagnetic case, our algorithm works throughout uniqueness regardless of its value. In the case of the ferromagnetic Potts model, we are able to simplify the algorithm significantly by utilising the random-cluster representation of the model. In particular, we demonstrate

Spatial Mixing and Non-local Markov chains

Blanca, Caputo, Sinclair, et al. 2018

We consider spin systems with nearest-neighbor interactions on an n-vertex d-dimensional cube of the integer lattice graph Z^d. We study the effects that exponential decay with distance of spin correlations, specifically the strong spatial mixing condition (SSM), has on the rate of convergence to equilibrium of non-local Markov chains. We prove that SSM implies O(log n) mixing of a block dynamics whose steps can be implemented efficiently. We then develop a methodology, consisting of several new comparison inequalities concerning various block dynamics, that allows us to extend this result to other non-local dynamics. As a first application of our method we prove that, if SSM holds, then the relaxation time (i.e., the inverse spectral gap) of general block dynamics is O(r), where r is the number of blocks. A second application of our technology concerns the Swendsen-Wang dynamics for the ferromagnetic Ising and Potts models. We show that SSM implies an O(1) bound for the relaxation time. As a by-product of this implication we observe that the relaxation time of the Swendsen-Wang dynamics in square boxes of Z^2 is O(1) throughout the subcritical regime of the q-state Potts model, for all q ≥ 2. We also prove that for monotone spin systems SSM implies that the mixing time of systematic scan dynamics is O(log n (log log n)^2). Systematic scan dynamics are widely employed in practice but have proved hard to analyze. Our proofs use a variety of techniques for the analysis of Markov chains including coupling, functional analysis and linear algebra.
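For readers unfamiliar with the Swendsen-Wang dynamics mentioned in this abstract, the following is a minimal sketch of one step of the standard textbook chain for the ferromagnetic q-state Potts model. It is not taken from the paper. It assumes the parametrization in which each monochromatic edge is kept with probability p = 1 - exp(-beta); the function names `swendsen_wang_step` and `grid_edges` are hypothetical and chosen for illustration.

```python
import math
import random


def swendsen_wang_step(n, edges, spins, q, beta, rng):
    """One Swendsen-Wang step (assumed parametrization: p = 1 - exp(-beta))."""
    p = 1.0 - math.exp(-beta)
    parent = list(range(n))  # union-find forest over the vertices

    def find(v):
        # Find the component root, compressing the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Percolation step: keep each monochromatic edge independently with prob. p.
    for u, v in edges:
        if spins[u] == spins[v] and rng.random() < p:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv

    # Recoloring step: every connected component gets a fresh uniform color.
    new_color = {}
    result = []
    for v in range(n):
        root = find(v)
        if root not in new_color:
            new_color[root] = rng.randrange(q)
        result.append(new_color[root])
    return result


def grid_edges(side):
    """Nearest-neighbor edges of a side x side box of Z^2 (free boundary)."""
    edges = []
    for r in range(side):
        for c in range(side):
            v = r * side + c
            if c + 1 < side:
                edges.append((v, v + 1))
            if r + 1 < side:
                edges.append((v, v + side))
    return edges


if __name__ == "__main__":
    rng = random.Random(0)
    spins = [rng.randrange(3) for _ in range(16)]
    for _ in range(10):
        spins = swendsen_wang_step(16, grid_edges(4), spins, 3, 1.0, rng)
    print(spins)
```

The step is non-local because a single move can recolor an entire cluster at once, which is the feature that distinguishes it from single-site dynamics.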

The statistics of k-mers from a sequence undergoing a simple mutation process without spurious matches

Blanca, Harris, Koslicki, et al. 2021

Preprint

K-mer-based methods are widely used in bioinformatics, but there are many gaps in our understanding of their statistical properties. Here, we consider the simple model where a sequence S (e.g. a genome or a read) undergoes a simple mutation process whereby each nucleotide is mutated independently with some probability r, under the assumption that there are no spurious k-mer matches. How does this process affect the k-mers of S? We derive the expectation and variance of the number of mutated k-mers and of the number of islands (a maximal interval of mutated k-mers) and oceans (a maximal interval of non-mutated k-mers). We then derive hypothesis tests and confidence intervals for r given an observed number of mutated k-mers, or, alternatively, given the Jaccard similarity (with or without minhash). We demonstrate the usefulness of our results using a few select applications: obtaining a confidence interval to supplement the Mash distance point estimate, filtering out reads during alignment by Minimap2, and rating long read alignments to a de Bruijn graph by Jabba.
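The mutation model in this abstract can be illustrated with a short simulation. The sketch below is not the paper's code. It relies only on the observation that, with no spurious matches, a k-mer is mutated exactly when at least one of its k nucleotides is mutated, so by linearity of expectation the expected number of mutated k-mers among L k-mers is L(1 - (1 - r)^k). The function names are hypothetical, and the variance, island, and ocean results from the paper are not reproduced here.

```python
import random


def expected_mutated_kmers(num_kmers, k, r):
    """Expected mutated k-mers: a k-mer survives only if all k positions do."""
    return num_kmers * (1.0 - (1.0 - r) ** k)


def simulate_mutated_kmers(num_kmers, k, r, rng):
    """Mutate each position independently with prob. r and count mutated k-mers."""
    n = num_kmers + k - 1  # sequence length that yields num_kmers k-mers
    mutated = [rng.random() < r for _ in range(n)]
    # Sliding count of mutated positions inside the current k-mer window.
    count = sum(mutated[:k])
    total = 1 if count > 0 else 0
    for i in range(1, num_kmers):
        count += mutated[i + k - 1] - mutated[i - 1]
        total += 1 if count > 0 else 0
    return total


if __name__ == "__main__":
    rng = random.Random(42)
    trials = [simulate_mutated_kmers(1000, 21, 0.05, rng) for _ in range(200)]
    print("simulated mean:", sum(trials) / len(trials))
    print("expected value:", expected_mutated_kmers(1000, 21, 0.05))
```

Because neighboring k-mers share k - 1 positions, the counts for individual runs are strongly correlated and vary much more than a binomial with the same mean would suggest, which is why the abstract treats the variance separately.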

On Universal Cycles for new Classes of Combinatorial Structures

Blanca, Godbole 2011

SIAM J. Discrete Math.

A universal cycle (u-cycle) is a compact listing of a collection of combinatorial objects. In this paper, we use natural encodings of these objects to show the existence of u-cycles for collections of subsets, matroids, restricted multisets, chains of subsets, multichains, and lattice paths. For subsets, we show that a u-cycle exists for the k-subsets of an n-set if we let k vary in a non-zero length interval. We use this result to construct a "covering" of length (1 + o(1)) \binom{n}{k} for all subsets of [n] of size exactly k with a specific formula for the o(1) term. We also show that u-cycles exist for all n-length words over some alphabet Σ, which contain all characters from R ⊂ Σ. Using this result we provide u-cycles for encodings of Sperner families of size 2 and proper chains of subsets.
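The best-known instance of a universal cycle is a de Bruijn cycle, which lists every length-n word over an alphabet exactly once as a cyclic window. The sketch below illustrates that classical case only, using the standard Lyndon-word construction. It does not implement the constructions for subsets, matroids, or chains described in the abstract, and the helper name `is_universal_cycle` is hypothetical.

```python
def de_bruijn(alphabet_size, n):
    """Classical de Bruijn cycle for words of length n over {0..alphabet_size-1}."""
    a = [0] * (alphabet_size * n)
    sequence = []

    def db(t, p):
        if t > n:
            # Emit the Lyndon word prefix when its length divides n.
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, alphabet_size):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def is_universal_cycle(seq, n, alphabet_size):
    """Check that every length-n word appears exactly once as a cyclic window."""
    length = len(seq)
    windows = {tuple(seq[(i + j) % length] for j in range(n)) for i in range(length)}
    return length == alphabet_size ** n and len(windows) == length


if __name__ == "__main__":
    cycle = de_bruijn(2, 3)
    print("".join(map(str, cycle)))  # prints 00010111
    print(is_universal_cycle(cycle, 3, 2))  # prints True
```

The cycle has length 8 for the 8 binary words of length 3, compared with the 24 symbols needed to list them separately, which is the sense in which a u-cycle is a compact listing.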

The minimizer Jaccard estimator is biased and inconsistent

Belbasi, Blanca, Harris, et al. 2022

Motivation: Sketching is now widely used in bioinformatics to reduce data size and increase data processing speed. Sketching approaches entice with improved scalability but also carry the danger of decreased accuracy and added bias. In this article, we investigate the minimizer sketch and its use to estimate the Jaccard similarity between two sequences.

Results: We show that the minimizer Jaccard estimator is biased and inconsistent, which means that the expected difference (i.e. the bias) between the estimator and the true value is not zero, even in the limit as the lengths of the sequences grow. We derive an analytical formula for the bias as a function of how the shared k-mers are laid out along the sequences. We show both theoretically and empirically that there are families of sequences where the bias can be substantial (e.g. the true Jaccard can be more than double the estimate). Finally, we demonstrate that this bias affects the accuracy of the widely used mashmap read mapping tool.

Availability and implementation: Scripts to reproduce our experiments are available at https://github.com/medvedevgroup/minimizer-jaccard-estimator/tree/main/reproduce.

Supplementary information: Supplementary data are available at Bioinformatics online.
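To make the two quantities compared in this abstract concrete, here is a minimal sketch that computes the Jaccard similarity of two full k-mer sets and the same ratio computed on minimizer sketches. It is an illustration under assumptions, not the authors' code: the window definition (w consecutive k-mers, smallest hash wins), the use of BLAKE2b as the ordering, and all function names are choices made here, and the paper's exact definitions may differ.

```python
import hashlib


def kmers(seq, k):
    """All k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_hash(kmer):
    """Deterministic 64-bit hash used to order k-mers (an assumed choice)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizer_sketch(seq, k, w):
    """Set of minimizers: the smallest-hash k-mer in every window of w k-mers."""
    km = kmers(seq, k)
    sketch = set()
    for i in range(len(km) - w + 1):
        sketch.add(min(km[i:i + w], key=kmer_hash))
    return sketch


def jaccard(a, b):
    """Jaccard similarity of two sets; returns 0.0 for two empty sets by convention."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    s = "ACGTACGTTGCATGCAAGCTTAGCTAGGATCCGATCGATTACG"
    t = "ACGTACGTTGCATGCTAGCTTAGCTAGGATCCGTTCGATTACG"
    k, w = 5, 4
    print("true Jaccard:     ", jaccard(set(kmers(s, k)), set(kmers(t, k))))
    print("minimizer Jaccard:", jaccard(minimizer_sketch(s, k, w), minimizer_sketch(t, k, w)))
```

With w = 1 every k-mer is its own minimizer, so the two values coincide; for larger w they generally differ, and the abstract's claim is that this gap does not vanish as the sequences grow.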

Sampling in Uniqueness from the Potts and Random-Cluster Models on Random Regular Graphs