1984
DOI: 10.1093/nar/12.5.2561

Genome structure described by formal languages

Abstract: Nucleic acid sequences may be looked upon as words over the alphabet of nucleotides. Naturally occurring DNAs and RNAs form subsets of the set of all possible words. The use of formal languages is proposed to describe the structure of these subsets. Regular languages defined by finite automata are introduced to demonstrate the application of the concept on RNA-phages of group I. This approach permits a concise characterization of grammatical patterns in genetic information.
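To make the idea of a regular language over the nucleotide alphabet concrete, here is a minimal sketch of a deterministic finite automaton. It is not the automaton from the paper: the language chosen (a start codon AUG, then whole codons, ending at the first in-frame stop codon UAA, UAG, or UGA) and all names (`build_orf_dfa`, `accepts`, the state labels) are assumptions made purely for illustration.

```python
ALPHABET = "ACGU"  # RNA nucleotides; the "words" are strings over this alphabet


def build_orf_dfa():
    """Return the transition table of an illustrative DFA (hypothetical example).

    States: s0/s1/s2 read the start codon AUG; c0 is a codon boundary;
    cU, cUA, cUG track a codon that may turn out to be a stop codon;
    cX1/cX2 consume the rest of a codon that cannot be a stop codon;
    acc is reached right after the first in-frame stop codon.
    """
    delta = {("s0", "A"): "s1", ("s1", "U"): "s2", ("s2", "G"): "c0"}
    for base in ALPHABET:
        delta[("c0", base)] = "cU" if base == "U" else "cX1"
        delta[("cX1", base)] = "cX2"
        delta[("cX2", base)] = "c0"
        delta[("cU", base)] = {"A": "cUA", "G": "cUG"}.get(base, "cX2")
        delta[("cUA", base)] = "acc" if base in "AG" else "c0"  # UAA, UAG stop
        delta[("cUG", base)] = "acc" if base == "A" else "c0"   # UGA stops
    return delta


def accepts(word, delta=None, start="s0", accepting=("acc",)):
    """Run the DFA on word; a missing transition means rejection."""
    delta = build_orf_dfa() if delta is None else delta
    state = start
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in accepting
```

Under these assumptions, `accepts("AUGGCUUGA")` holds, while `accepts("AUGUAAG")` does not, because the word continues past the stop codon. The set of accepted words is a subset of all words over the alphabet, which is the sense in which the abstract speaks of describing naturally occurring sequences by a language.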

Cited by 67 publications (29 citation statements)
References 12 publications
“…Although the theory of formal languages was born in the 1950s, and then almost simultaneously to modern molecular biology (recall that F. Crick and J. Watson discovered DNA's double helix in 1953 and N. Chomsky published Syntactic structures in 1957), it was not until the 1980s that formal grammars methods started to be applied to biomolecular sequences [19]. A little later it was also noticed that string grammars could also be used to model and study not only the primary structure of biomolecules, but also certain aspects of their contact structures, as for instance secondary structures of RNA molecules [20,21].…”
Section: Higher-dimensional Structures Of Biomolecules
mentioning
confidence: 99%
“…While such an approach has been proposed [Brendel and Busse, 1984], most investigations along these lines have used grammar formalisms as tools for what are essentially information-theoretic studies [Ebeling and Jimenez-Montano, 1980; Jimenez-Montano, 1984], or have involved statistical analyses at the level of vocabularies (reflecting a more traditional notion of comparative linguistics) [Brendel et al, 1986; Pevzner et al, 1989a,b; Pietrokovski et al, 1990]. Only very recently have generative grammars for their own sake been viewed as models of biological phenomena such as gene regulation [Collado-Vides, 1989a,b, 1991a], gene structure and expression [Searls, 1988], recombination [Head, 1987] and other forms of mutation and rearrangement [Searls, 1989a], conformation of macromolecules [Searls, 1989a], and in particular as the basis for computational analysis of sequence data [Searls, 1989b; Searls and Liebowitz, 1990; Searls and Noordewier, 1991].…”
Section: Introduction
mentioning
confidence: 99%
“…The RNA molecule consists of sequences that are built of nucleotides, which are in four forms; a, u (uracil), g, c. The complementary pair for RNA (DNA) is given as ā = u(t), ū(t) = a, ḡ = c and c̄ = g. Based on the complementary pairs in chemical objects and other biological constraints, sequences form patterns and these patterns are considered structures. These structures play a vital role in governing the functionality and behavior of bio-molecules (Brendel and Busse, 1984; Searls, 1993).…”
mentioning
confidence: 99%
“…The formal language notations for such structures and for a few other structures are discussed in detail in the coming sections. For more details on genome structures, their corresponding languages and gene structure prediction using linguistic methods, we refer to the works of Brendel and Busse (1984), Chiang et al (2006), Dong and Searls (1994), Durbin et al (1998), as well as Searls (1988;1992;. The structures that are formed in RNA are mostly intermolecular.…”
mentioning
confidence: 99%
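The complementary pairs quoted in the statements above can be written as a lookup table, and a pattern built from them can be checked mechanically. The following is a minimal sketch only: the function names and the simple stem-loop shape (a stem, an unpaired loop, then the reverse complement of the stem) are assumptions for illustration and are not taken from any of the cited works.

```python
# RNA complement map as quoted above: a<->u, g<->c (lowercase, as in the quote)
RNA_COMPLEMENT = {"a": "u", "u": "a", "g": "c", "c": "g"}


def complement(word):
    """Replace every nucleotide by its complementary partner."""
    return "".join(RNA_COMPLEMENT[base] for base in word)


def reverse_complement(word):
    """Complement the word, then read it in the opposite direction."""
    return complement(word)[::-1]


def is_stem_loop(word, stem_len, loop_len):
    """Check a hypothetical stem-loop pattern: stem + loop + revcomp(stem).

    Only the two stem arms are constrained; the loop bases are arbitrary.
    """
    if len(word) != 2 * stem_len + loop_len:
        return False
    left_arm = word[:stem_len]
    right_arm = word[stem_len + loop_len:]
    return right_arm == reverse_complement(left_arm)
```

With these definitions, `is_stem_loop("ggaaaaaucc", 3, 4)` holds because the right arm `ucc` is the reverse complement of the left arm `gga`. Matching two arms of unbounded length against each other is the kind of nested dependency that a finite automaton cannot track in general, which is one reason the citing works turn to grammars beyond regular languages for secondary structure.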