Abstract. Denote by S the class of standard Sturmian words. It is a class of highly compressible words extensively studied in combinatorics on words, and it includes the well-known Fibonacci words. The suffix automata for these words have a very particular structure. This implies a simple characterization (described in the paper by the Structural Lemma) of the periods of runs (maximal repetitions) in Sturmian words. Using this characterization we derive an explicit formula for the number ρ(w) of runs in words w ∈ S, with respect to their recurrences (directive sequences). We show that ρ(w)/|w| ≤ 4/5 for each w ∈ S, and that there is an infinite sequence of strictly growing words w_k ∈ S such that lim_{k→∞} ρ(w_k)/|w_k| = 4/5. The complete understanding of the function ρ for a large class S of complicated words is a step towards better understanding of the structure of runs in words. We also show how to compute the number of runs in a standard Sturmian word in linear time with respect to the size of its compressed representation (recurrences describing the word). This is an example of a very fast computation on texts given implicitly in terms of a special grammar-based compressed representation (usually of logarithmic size with respect to the explicit text).
The class of finite Sturmian words consists of words having a particularly simple compressed representation, which is a generalization of the Fibonacci recurrence for Fibonacci words. The subword graphs of these words (especially their compacted versions) have a very special regular structure. In this paper we investigate this structure in more detail than in previous papers and show how several syntactical properties of Sturmian words follow from their graph properties. Consequently, simple alternative graph-based proofs of several known facts are presented. The very special structure of subword graphs also leads to easy algorithms for computing some parameters of Sturmian words: the number of subwords, the critical factorization point, lexicographically maximal suffixes, occurrences of subwords of a fixed length, and right special factors. These algorithms work in linear time with respect to n, the size of the compressed representation of the standard word, though the words themselves can be of exponential size with respect to n. Some of the computed parameters can also be of exponential size; however, we provide their linear-size compressed representations. We also introduce a new concept related to standard words: Ostrowski automata.
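To make the gap between the compressed representation and the explicit word concrete, here is a minimal sketch. It assumes the usual recurrence for standard words over a directive sequence (γ_0, ..., γ_n), namely x_{-1} = b, x_0 = a, x_{k+1} = x_k^{γ_k} x_{k-1}; the abstract does not spell this out, and the function names are hypothetical. The length is computed from the integers alone, without building the word.

```python
def standard_word_length(gammas):
    """Length of the standard word for a directive sequence.

    Assumed recurrence (not stated in the abstract):
        x_{-1} = "b", x_0 = "a", x_{k+1} = x_k^{gamma_k} x_{k-1}
    so |x_{k+1}| = gamma_k * |x_k| + |x_{k-1}|.
    Uses one multiplication and one addition per entry of the sequence.
    """
    prev_len, cur_len = 1, 1  # |x_{-1}|, |x_0|
    for g in gammas:
        prev_len, cur_len = cur_len, g * cur_len + prev_len
    return cur_len


def standard_word(gammas):
    """Explicit word for the same recurrence (exponential size in general)."""
    prev, cur = "b", "a"  # x_{-1}, x_0
    for g in gammas:
        prev, cur = cur, cur * g + prev
    return cur
```

With all entries equal to 1 the recurrence reduces to the Fibonacci recurrence, so a directive sequence of n ones already describes a word whose length grows exponentially in n, while `standard_word_length` only touches n integers.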
We investigate some repetition problems for a very special class $\mathcal{S}$ of strings called the standard Sturmian words, which have very compact representations in terms of sequences of integers. Usually the size of such a word is exponential with respect to the size of its integer sequence, hence we are dealing with repetition problems in compressed strings. An explicit formula is given for the number $\rho(w)$ of runs in a standard word $w$. We show that $\rho(w)/|w|\le 4/5$ for each $w\in \mathcal{S}$, and there is an infinite sequence of strictly growing words $w_k\in \mathcal{S}$ such that $\lim_{k\rightarrow \infty} \frac{\rho(w_k)}{|w_k|} = \frac{4}{5}$. Moreover, we show how to compute the number of runs in a standard Sturmian word in linear time with respect to the size of its compressed representation.
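For small inputs, the quantity $\rho(w)$ can be checked by brute force on the explicit word. The sketch below is not the linear-time method or the explicit formula from the paper; it expands the word and enumerates maximal repetitions directly, which takes time polynomial in $|w|$ and therefore exponential in the size of the directive sequence. It assumes the usual recurrence $x_{-1}=b$, $x_0=a$, $x_{k+1}=x_k^{\gamma_k}x_{k-1}$, and the function names are hypothetical.

```python
def standard_word(gammas):
    """Explicit standard word, assuming x_{-1}="b", x_0="a",
    x_{k+1} = x_k^{gamma_k} x_{k-1} (recurrence assumed, not from the abstract)."""
    prev, cur = "b", "a"
    for g in gammas:
        prev, cur = cur, cur * g + prev
    return cur


def count_runs(w):
    """Brute-force count of runs (maximal repetitions) in w.

    For each candidate period p, find maximal stretches where
    w[i] == w[i + p]. A stretch of at least p matches starting at
    `start` yields an interval of length >= 2p with period p that
    cannot be extended to either side. The same interval may be found
    again for a multiple of its smallest period, so intervals are
    stored in a set and each run is counted once.
    """
    n = len(w)
    runs = set()
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            start = i
            while i + p < n and w[i] == w[i + p]:
                i += 1
            if i - start >= p:  # interval length (i - start) + p >= 2p
                runs.add((start, i + p - 1))
    return len(runs)


def run_density(gammas):
    """Ratio rho(w) / |w| for the standard word of a directive sequence."""
    w = standard_word(gammas)
    return count_runs(w) / len(w)
```

As a hand-checked example, the directive sequence (1, 1, 1, 1) gives the Fibonacci word abaababa, whose runs are aa, ababa, and abaaba, so `count_runs` returns 3 and `run_density` returns 3/8, below the 4/5 bound.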