This is the full version of the paper, and includes proofs omitted from the short version. Abstract: We focus on (partial) functions that map input strings to a monoid, such as the set of integers with addition or the set of output strings with concatenation. The notion of regularity for such functions has been defined using two-way finite-state transducers, (one-way) cost register automata, and MSO-definable graph transformations. In this paper, we give an algebraic and machine-independent characterization of this class, analogous to the definition of regular languages by regular expressions. When the monoid is commutative, we prove that every regular function can be constructed from constant functions using the combinators of choice, split sum, and iterated sum, which are analogs of union, concatenation, and Kleene-*, respectively, but enforce unique (or unambiguous) parsing. Our main result is for the general case of non-commutative monoids, which is of particular interest for capturing regular string-to-string transformations for document processing. We prove that the following additional combinators suffice for constructing all regular functions: (1) the left-additive versions of split sum and iterated sum, which allow transformations such as string reversal; (2) sum of functions, which allows transformations such as copying of strings; and (3) function composition, or alternatively, a new concept of chained sum, which allows output values from adjacent blocks to mix.
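The three combinators named for the commutative case can be illustrated with a minimal Python sketch over the monoid of integers with addition. This is one plausible reading of the abstract, not the paper's formal definitions: the names `const`, `choice`, `split_sum`, and `iterated_sum` are hypothetical, `None` stands for "undefined", choice is assumed to prefer its first argument, and the iterated sum is assumed to return 0 on the empty string and to use only non-empty blocks.

```python
from typing import Callable, Optional

# A partial function from strings to integers; None means "undefined".
Func = Callable[[str], Optional[int]]

def const(pattern: str, value: int) -> Func:
    """Constant function, defined only on the exact string `pattern`."""
    return lambda s: value if s == pattern else None

def choice(f: Func, g: Func) -> Func:
    """Analog of union: use f where it is defined, otherwise g (assumed semantics)."""
    def h(s: str) -> Optional[int]:
        r = f(s)
        return r if r is not None else g(s)
    return h

def split_sum(f: Func, g: Func) -> Func:
    """Analog of concatenation: defined only if exactly one split s = s1 s2
    has f(s1) and g(s2) both defined; the result is f(s1) + g(s2)."""
    def h(s: str) -> Optional[int]:
        results = []
        for i in range(len(s) + 1):
            a, b = f(s[:i]), g(s[i:])
            if a is not None and b is not None:
                results.append(a + b)
        return results[0] if len(results) == 1 else None
    return h

def iterated_sum(f: Func) -> Func:
    """Analog of Kleene-*: defined only if s has exactly one decomposition
    into non-empty blocks on which f is defined; the result is the sum."""
    def h(s: str) -> Optional[int]:
        n = len(s)
        ways = [0] * (n + 1)   # number of parses of s[:i], capped at 2
        val = [0] * (n + 1)    # value of the parse of s[:i] when it is unique
        ways[0] = 1
        for i in range(1, n + 1):
            for j in range(i):
                block = f(s[j:i])
                if block is None or ways[j] == 0:
                    continue
                if ways[i] == 0 and ways[j] == 1:
                    val[i] = val[j] + block
                ways[i] = min(2, ways[i] + ways[j])
        return val[n] if ways[n] == 1 else None
    return h

# Count the occurrences of 'a' in a string over {a, b}.
count_a = iterated_sum(choice(const("a", 1), const("b", 0)))
# An ambiguous iteration: "aa" parses both as a.a and as aa, so it is undefined.
ambiguous = iterated_sum(choice(const("a", 1), const("aa", 5)))
```

Here `count_a("abaab")` evaluates to 3, while `ambiguous("aa")` is undefined because the input admits two parses, which is the sense in which these combinators differ from plain union, concatenation, and Kleene-*.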
Population recovery is the problem of learning an unknown distribution over an unknown set of n-bit strings, given access to independent draws from the distribution that have been independently corrupted according to some noise channel. Recent work has intensively studied such problems both for the bit-flip noise channel and for the erasure noise channel. In this paper we initiate the study of population recovery under the deletion channel, in which each bit b is independently deleted with some fixed probability and the surviving bits are concatenated and transmitted. This is a far more challenging noise model than bit-flip noise or erasure noise; indeed, even the simplest case in which the population is of size 1 (corresponding to a trivial probability distribution supported on a single string) corresponds to the trace reconstruction problem, which is a challenging problem that has received much recent attention (see e.g. [DOS17a, NP17, PZ17, HPP18, HHP18]). In this work we give algorithms and lower bounds for population recovery under the deletion channel when the population size is some value $\ell > 1$. As our main sample complexity upper bound, we show that for any population size $\ell = o(\log n / \log \log n)$, a population of $\ell$ strings from $\{0, 1\}^n$ can be learned under deletion channel noise using $2^{n^{1/2+o(1)}}$ samples. On the lower bound side, we show that at least $n^{\Omega(\ell)}$ samples are required to perform population recovery under the deletion channel when the population size is $\ell$, for all $\ell \le n^{1/2-\varepsilon}$. Our upper bounds are obtained via a robust multivariate generalization of a polynomial-based analysis, due to Krasikov and Roditty [KR97], of how the k-deck of a bit-string uniquely identifies the string; this is a very different approach from recent algorithms for trace reconstruction (the $\ell = 1$ case). Our lower bounds build on moment-matching results of Roos [Roo00] and Daskalakis and Papadimitriou [DP15].
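The noise model and the k-deck mentioned in this abstract can be made concrete with a short Python sketch. It only simulates the setting and is not the paper's learning algorithm; the function names, the parameter name `delta` for the deletion probability, and the example population are all illustrative assumptions. The k-deck is taken here to be the multiset of all length-k subsequences of a string.

```python
import random
from collections import Counter
from itertools import combinations
from typing import List, Sequence

def deletion_channel(x: str, delta: float, rng: random.Random) -> str:
    """Delete each bit of x independently with probability delta,
    then concatenate the surviving bits."""
    return "".join(b for b in x if rng.random() >= delta)

def sample_trace(population: Sequence[str], weights: Sequence[float],
                 delta: float, rng: random.Random) -> str:
    """Draw one string from the population distribution and pass it
    through the deletion channel."""
    x = rng.choices(population, weights=weights, k=1)[0]
    return deletion_channel(x, delta, rng)

def k_deck(x: str, k: int) -> Counter:
    """Multiset of all length-k subsequences of x (one per index set)."""
    return Counter("".join(x[i] for i in idx)
                   for idx in combinations(range(len(x)), k))

# Hypothetical population of size 2 over 8-bit strings.
rng = random.Random(0)
population = ["01101001", "11100010"]
weights = [0.7, 0.3]
traces: List[str] = [sample_trace(population, weights, 0.3, rng) for _ in range(5)]
```

Each element of `traces` is a subsequence of one of the two population strings, and the learner sees neither which string was drawn nor which positions were deleted. With a population of size 1 this reduces to the trace reconstruction setting described above.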
A new type of image, existing only at field strengths below the denaturation field strengths of molecules, has been discovered. This type of image has structure, is not symmetrical and thus differs from previously reported low field strength images. The possibility that macromolecules adsorbed on tip surfaces produce such structured images has been exhaustively investigated. The result is that no observation has been found which disproves this hypothesis and many tests conducted in such attempts yielded correlations consistent with this hypothesis. It is therefore concluded that it is highly probable that biomolecules produce structured images and that it is highly improbable that these correlations represent a chance event. On the other hand, these correlations may be due to some cause unknown to the authors; but we consider this possibility to be unlikely. Some micrographs have been obtained which provide a reasonable basis for the hope that tertiary structure information may be derived from low field strength imaging of partially embedded biomolecules or of biomolecules that are resistant to field denaturation during imaging. Information may presently be obtained from analysis of low field strength ion micrographs about the size and shape of some biomolecules.