Donald E. Knuth scite author profile

An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings. The constant of proportionality is low enough to make this algorithm of practical use, and the procedure can also be extended to deal with some more general pattern-matching problems. A theoretical application of the algorithm shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha\alpha^R\}^*$, can be recognized in linear time. Other algorithms which run even faster on the average are also considered.

Key words: pattern, string, text-editing, pattern-matching, trie memory, searching, period of a string, palindrome, optimum algorithm, Fibonacci string, regular expression

Text-editing programs are often required to search through a string of characters looking for instances of a given "pattern" string; we wish to find all positions, or perhaps only the leftmost position, in which the pattern occurs as a contiguous substring of the text. For example, catenary contains the pattern ten, but we do not regard canary as a substring. The obvious way to search for a matching pattern is to try searching at every starting position of the text, abandoning the search as soon as an incorrect character is found. But this approach can be very inefficient, for example when we are looking for an occurrence of aaaaaaab in aaaaaaaaaaaaaab. When the pattern is $a^n b$ and the text is $a^{2n} b$, we will find ourselves making (n + 1) ...
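The contrast between the obvious search and a linear-time search can be sketched in Python. This is only an illustration under stated assumptions: the function names `naive_search`, `failure_table`, and `linear_search` are hypothetical, the pattern is assumed to be non-empty, and the second routine is the common textbook failure-table formulation rather than a transcription of the paper's own program.

```python
def naive_search(pattern, text):
    """Try every starting position; return (match positions, comparisons made)."""
    positions, comparisons = [], 0
    m, n = len(pattern), len(text)
    for start in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if text[start + j] != pattern[j]:
                break  # abandon this start at the first incorrect character
            j += 1
        if j == m:
            positions.append(start)
    return positions, comparisons


def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def linear_search(pattern, text):
    """Scan the text once, never backing up in it; assumes a non-empty pattern.
    Returns (match positions, comparisons made)."""
    fail = failure_table(pattern)
    positions, comparisons, k = [], 0, 0
    for i, ch in enumerate(text):
        while True:
            comparisons += 1
            if ch == pattern[k]:
                k += 1
                break
            if k == 0:
                break
            k = fail[k - 1]  # fall back to a shorter matched prefix
        if k == len(pattern):
            positions.append(i - k + 1)
            k = fail[k - 1]
    return positions, comparisons
```

On the example from the text, `naive_search("aaaaaaab", "aaaaaaaaaaaaaab")` finds the match at position 7 after 64 character comparisons, while `linear_search` on the same input stays below twice the text length.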


Concrete Mathematics: A Foundation for Computer Science

Graham

1989


Semantics of context-free languages

Knuth

1968

Math. Systems Theory

1,543

602


"Meaning" may be assigned to a string in a context-free language by defining "attributes" of the symbols in a derivation tree for that string. The attributes can be defined by functions associated with each production in the grammar. This paper examines the implications of this process when some of the attributes are "synthesized", i.e., defined solely in terms of attributes of the descendants of the corresponding nonterminal symbol, while other attributes are "inherited", i.e., defined in terms of attributes of the ancestors of the nonterminal symbol. An algorithm is given which detects when such semantic rules could possibly lead to circular definition of some attributes. An example is given of a simple programming language defined with both inherited and synthesized attributes, and the method of definition is compared to other techniques for formal specification of semantics which have appeared in the literature.

A simple technique for specifying the "meaning" of languages defined by context-free grammars is introduced in Section 1 of this paper, and its basic mathematical properties are investigated in Sections 2 and 3. An example which indicates how the technique can be applied to the formal definition of programming languages is described in Section 4, and finally, Section 5 contains a somewhat biased comparison of the present method to other known techniques for semantic definition. The discussion in this paper is oriented primarily towards programming languages, but the same methods appear to be relevant also in the study of natural languages.
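The distinction between synthesized and inherited attributes can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the toy grammar for binary numerals (N -> L | L . L, L -> L B | B, B -> 0 | 1), the attribute names `scale` and value, and the function names are hypothetical, and the numeral is assumed to be well formed with at least one bit in each bit list. The `scale` argument is passed down from parent to child (inherited), while the returned value is computed from the children (synthesized).

```python
from fractions import Fraction


def bit_value(bit, scale):
    # B -> 0 | 1: the synthesized value depends on the inherited scale.
    return Fraction(2) ** scale if bit == "1" else Fraction(0)


def list_value(bits, scale):
    """L -> L B | B.
    `scale` (inherited) is the power of two carried by the last bit of `bits`;
    the return value (synthesized) is the sum of the children's values."""
    if len(bits) == 1:
        return bit_value(bits, scale)
    # The left sublist inherits scale + 1; the final bit inherits scale.
    return list_value(bits[:-1], scale + 1) + bit_value(bits[-1], scale)


def number_value(numeral):
    """N -> L | L . L
    The fractional list inherits a scale derived from its own length,
    which is itself a synthesized attribute of that list."""
    whole, _, frac = numeral.partition(".")
    value = list_value(whole, 0)
    if frac:
        value += list_value(frac, -len(frac))
    return value
```

For instance, `number_value("1101.01")` evaluates to `Fraction(53, 4)`, i.e. 13.25. Because each inherited attribute here depends only on attributes already available from above, no circular definition can arise in this toy case.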


Literate Programming

Knuth

1984

The Computer Journal

975

599


The birth of the giant component

Janson

Knuth

Łuczak

et al. 1993

Random Struct Algorithms

343

478


Limiting distributions are derived for the sparse connected components that are present when a random graph on $n$ vertices has approximately $\frac{1}{2}n$ edges. In particular, we show that such a graph consists entirely of trees, unicyclic components, and bicyclic components with probability approaching $\sqrt{2/3}\,\cosh\sqrt{5/18} \approx 0.9325$ as $n \to \infty$. The limiting probability that it consists of trees, unicyclic components, and at most one other component is approximately 0.9957; the limiting probability that it is planar lies between 0.987 and 0.9998. When a random graph evolves and the number of edges passes $\frac{1}{2}n$, its components grow in cyclic complexity according to an interesting Markov process whose asymptotic structure is derived. The probability that there never is more than a single component with more edges than vertices, throughout the evolution, approaches $5\pi/18 \approx 0.8727$. A "uniform" model of random graphs, which allows self-loops and multiple edges, is shown to lead to formulas that are substantially simpler than the analogous formulas for the classical random graphs of Erdős and Rényi. The notions of "excess" and "deficiency," which are significant characteristics of the generating function as well as of the graphs themselves, lead to a ...

... the multigraph process, because it can generate graphs with self-loops $x$-$x$, and it can also generate multiple edges. Notice that a self-loop $x$-$x$ is generated with probability $1/n^2$, while an edge $x$-$y$ with $x \ne y$ is generated with probability $2/n^2$ because it can occur either as $(x, y)$ or $(y, x)$.

The second evolution procedure, introduced by Erdős and Rényi [12], is called the permutation model or the graph process. In this case we consider all $N = \binom{n}{2}$ possible edges $x$-$y$ with $x < y$ and introduce them in random order, with all $N!$ permutations considered equally likely. In this model there are no self-loops or multiple edges.

A multigraph $M$ on $n$ labeled vertices can be defined by a symmetric $n \times n$ matrix of nonnegative integers $m_{xy}$, where $m_{xy} = m_{yx}$ is the number of undirected edges $x$-$y$ in G. For purposes of analysis, we shall assign a compensation factor

$$\kappa(M) = 1 \Big/ \prod_{x=1}^{n} \Bigl( 2^{m_{xx}} \prod_{y=x}^{n} m_{xy}! \Bigr)$$

to $M$; if $m = \sum_{x=1}^{n} \sum_{y=x}^{n} m_{xy}$ is the total number of edges, the number of sequences $(x_1, y_1)(x_2, y_2) \ldots (x_m, y_m)$ that lead to $M$ is then exactly

$$2^m \, m! \, \kappa(M). \qquad (1.2)$$

(The factor $2^m$ accounts for choosing either $(x, y)$ or $(y, x)$; the $2^{m_{xx}}$ in the denominator of $\kappa(M)$ compensates for the case $x = y$. The other factor $m!$ accounts for permutations of the pairs, with $m_{xy}!$ in $\kappa(M)$ to compensate for permutations between multiple edges.) Equation (1.2) tells us that $\kappa(M)$ is a natural weighting factor for a multigraph $M$, because it corresponds to the relative frequency with which $M$ tends to occur in applications. For example, consider multigraphs on three vertices $\{1, 2, 3\}$ having exactly three edges. The edges will form the cycle $M_1 = \{1\text{-}2, 2\text{-}3, 3\text{-}1\}$ much more often than they will form three identical self-loops $M_2 = \{1\text{-}1, 1\text{-}1, 1\text{-}1\}$, when the multigraphs are generated in a uniform way. For if we consider the $3^6$ possible sequences $(x_1, y$...
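The three-vertex example can be checked by brute force. The Python sketch below is an illustration only: the function names and the representation of a multigraph as a `Counter` keyed by pairs `(x, y)` with `x <= y` are assumptions, and the compensation factor is coded from the verbal description in the text (a factor of 2 per self-loop and a factorial per edge multiplicity in the denominator).

```python
from collections import Counter
from itertools import product
from math import factorial


def kappa_inverse(multigraph):
    """Denominator of the compensation factor: the product over edges of
    (multiplicity)!, times 2**multiplicity for each self-loop."""
    result = 1
    for (x, y), mult in multigraph.items():
        result *= factorial(mult)
        if x == y:
            result *= 2 ** mult
    return result


def sequences_by_formula(multigraph):
    """Number of ordered-pair sequences leading to the multigraph:
    2**m * m! * kappa, with m the total number of edges."""
    m = sum(multigraph.values())
    return 2 ** m * factorial(m) // kappa_inverse(multigraph)


def sequences_by_enumeration(multigraph, n):
    """Count the same quantity directly by listing all (n*n)**m sequences."""
    m = sum(multigraph.values())
    pairs = list(product(range(1, n + 1), repeat=2))
    count = 0
    for seq in product(pairs, repeat=m):
        # Forget the orientation of each pair, then compare multisets of edges.
        if Counter((min(x, y), max(x, y)) for x, y in seq) == multigraph:
            count += 1
    return count


cycle = Counter({(1, 2): 1, (2, 3): 1, (1, 3): 1})  # the cycle 1-2, 2-3, 3-1
loops = Counter({(1, 1): 3})                        # three self-loops 1-1
```

Both counting methods agree on these inputs: 48 of the 729 sequences produce the cycle, and exactly one produces the triple self-loop.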


Permutations, matrices, and generalized Young tableaux

Knuth

1970

Pacific J. Math.

578

459


Simple Word Problems in Universal Algebras

Knuth

Bendix

1983

513

392



Donald E. Knuth

On the LambertW function

Fast Pattern Matching in Strings

Concrete Mathematics: A Foundation for Computer Science

Semantics of context-free languages

Literate Programming

The birth of the giant component

Permutations, matrices, and generalized Young tableaux

Simple Word Problems in Universal Algebras
