A recent study by one of the authors has demonstrated the importance of profile vectors in DNA-based data storage. We provide exact values and lower bounds on the number of profile vectors for finite values of alphabet size $q$, read length $\ell$, and word length $n$. Consequently, we demonstrate that for $q \ge 2$ and $n \le q^{\ell/2-1}$, the number of profile vectors is at least $q^{\kappa n}$ with $\kappa$ very close to 1. In addition to enumeration results, we provide a set of efficient encoding and decoding algorithms for each of two particular families of profile vectors.
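To make the counted quantity concrete, the following is a minimal brute-force sketch, not the authors' enumeration method. It assumes the usual definition from the DNA-profile literature: the profile vector of a word lists, for every possible substring of the read length, how many times that substring occurs in the word. The function names `profile_vector` and `count_profile_vectors`, and the parameter name `ell` for the read length, are illustrative choices that do not appear in the abstract.

```python
from collections import Counter
from itertools import product


def profile_vector(word, q, ell):
    """Return the ell-gram profile of `word` as a tuple of q**ell counts.

    `word` is a sequence of symbols from range(q). Entries are ordered
    lexicographically by ell-gram. (Assumed definition, see lead-in.)
    """
    counts = Counter(
        tuple(word[i:i + ell]) for i in range(len(word) - ell + 1)
    )
    # Counter returns 0 for ell-grams that never occur in the word.
    return tuple(counts[gram] for gram in product(range(q), repeat=ell))


def count_profile_vectors(q, ell, n):
    """Count distinct profile vectors over all q**n words by exhaustion.

    Only feasible for very small parameters; the running time grows as q**n.
    """
    return len(
        {profile_vector(w, q, ell) for w in product(range(q), repeat=n)}
    )
```

For example, `count_profile_vectors(2, 2, 3)` gives 7 rather than 8, because the binary words 010 and 101 contain the same 2-grams (01 and 10, once each) and therefore share a profile vector. Exhaustive search like this is only a sanity check for tiny parameters, which is why exact formulas and lower bounds for finite values are of interest.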
We propose a construction of de Bruijn sequences by the cycle joining method from linear feedback shift registers (LFSRs) with arbitrary characteristic polynomial $f(x)$. We study in detail the cycle structure of the set $\Omega(f(x))$ that contains all sequences produced by a specific LFSR on distinct inputs and provide a fast way to find a state of each cycle. This leads to an efficient algorithm to find all conjugate pairs between any two cycles, yielding the adjacency graph. The approach is practical to generate a large class of de Bruijn sequences up to order $n \approx 20$. Many previously proposed constructions of de Bruijn sequences are shown to be special cases of our construction.

Keywords Binary periodic sequence · LFSR · de Bruijn sequence · cycle structure · adjacency graph · cyclotomic number

Mathematics Subject Classification (2010) 11B50 · 94A55 · 94A60

1 Introduction

A binary de Bruijn sequence of order $n$ has period $N = 2^n$ in which each $n$-tuple occurs exactly once in each period. There are $2^{2^{n-1}-n}$ of them [5]. Some of their earliest applications are in communication systems. They are generated in a deterministic way, yet satisfy the randomness criteria in [14, Ch. 5] and are balanced, containing the same number of 1s and 0s. In cryptography, they have been used as a source of pseudo-random numbers and in keysequence generators of stream ciphers [29, Sect. 6.3]. In computational molecular biology, one of the three assembly paradigms in DNA sequencing is the de Bruijn graph assemblers model [31, Box 2]. Some roles of de Bruijn sequences in robust positioning patterns are discussed by Bruckstein et al. in [4]. They have numerous applications, e.g., in robotics,