1982
DOI: 10.1093/nar/10.1.197

Efficient algorithms for folding and comparing nucleic acid sequences

Abstract: Fast algorithms for analysing sequence data are presented. An algorithm for strict homologies finds all common subsequences of length greater than or equal to 6 in two given sequences. With it, nucleic acid pieces five thousand nucleotides long can be compared in five seconds on CDC 6600. Secondary structure algorithms generate the N most stable secondary structures of an RNA molecule, taking into account all loop contributions, and the formation of all possible base-pairs in stems, including odd pairs (G.G., …
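
The strict-homology search described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given here; it is a generic k-mer index with extension, written under the assumption that "common subsequences" means contiguous exact matches. The function name `common_runs` and its parameters are hypothetical.

```python
from collections import defaultdict


def common_runs(a: str, b: str, min_len: int = 6):
    """Return maximal exact matches as (start_in_a, start_in_b, length).

    Illustrative sketch only: a plain k-mer index plus extension,
    not the method of the paper.
    """
    # Index every min_len-mer of `a` by the positions where it starts.
    index = defaultdict(list)
    for i in range(len(a) - min_len + 1):
        index[a[i:i + min_len]].append(i)

    runs = []
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            # Skip seeds that only continue a match starting one position
            # earlier; that match is reported from its leftmost seed.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the seed to the right while the sequences agree.
            n = min_len
            while i + n < len(a) and j + n < len(b) and a[i + n] == b[j + n]:
                n += 1
            runs.append((i, j, n))
    return runs
```

For example, `common_runs("TTACGTACGGAT", "GGACGTACGGCC")` reports the shared run `ACGTACGG` once, as a single maximal match rather than as three overlapping 6-mers.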

Cited by 127 publications (41 citation statements)
References: 23 publications (32 reference statements)
“…The large differences seen in the mfold-predicted structures would be expected to have profound effects on the accessibility to duplex formation by oligonucleotides and lead to qualitatively different patterns rather than the variation in duplex yield observed. Several thermodynamic-based methods were devised in the 1980s to predict the secondary structure of RNAs (e.g., Nussinov & Jacobson, 1980; Dumas & Ninio, 1982) and were improved subsequently. These methods were mostly based on the known structure of tRNAs and thermodynamic studies of small RNA fragments in solution (e.g., Tinoco et al., 1973). tRNAs are comparatively small and probably exist in the minimal global free energy state and so their secondary structure can be accurately predicted by computational methods. The algorithms devised by Zuker and colleagues (e.g., Zuker, 1989) became widely used (e.g., mfold) because of their ability to calculate the free energy of folds of longer sequences. For long RNA molecules, energy calculations often return a number of different structures with similar free energies: there is difficulty in choosing the correct one. We analyzed the full length B5 transcript to see if the free energy of the most stable fold differed significantly from that of the fold in which the 5′ end was constrained to its most stable fold. The minimum free energy of the full transcript was −250.9 kcal/mol and the sum of the free energies of the two parts folded separately was −248.2 kcal/mol, showing that there was little difference between them. This result reemphasizes the problem of using global free energy calculations. Our results further indicate that the predicted structures have limited usage in biological application such as designing antisense oligonucleotides. Ho et al. (1998) have also shown that accessible sites cannot be mapped on predicted structures and that there is no obvious structural difference between accessible and inaccessible regions on the computer folded structures.
Gaspin & Westhof (1995) have devised an approach according to the RNA hierarchical folding view that allows for the dynamic incorporation of folding constraints, enabling the user to participate in the computational folding of RNA. They predicted the secondary structure of Group I intron Td and RNAseP of Escherichia coli with this method, and the results were similar to those predicted by phylogenetic comparison. Despite its demonstrated usefulness, such an approach requires substantial experimental data for input that may not be available for some biological applications, in which case empirical approaches become more useful. Our results suggest ways in which hybridization to arrays may be used in predicting the secondary interactions in RNA molecules. First, the hybridization data can help in testing short range interactions. Studies of tRNA (K.U. Mir & E.M. Southern, submitted) suggest that strong interactions of oligonucleotides with the target, which are easily identified on the arrays, indicate certain stem-loop structures, and may point to stack interfaces. The 5′ regions of B5 mRNA that hybridize strongly to oligonucleotides, shown against the predicted secondary structure of lowest free energy (Fig. 6), conform with the partial rules derived from the analysis of tRNA. Second, long-range interactions are indicated by the loss of hybridization that results from extending the transcript. The sequences of the oligonucleotides whose hybridization is blocked by the secondary interactions can be read directly from the array to locate the interacting regions p...…”
Section: Mfold Predicted Structures and Implication Of The Hybridizat… (mentioning)
confidence: 99%
“…The alignment algorithm used by the IFIND program is based on the work of Dumas and Ninio (1982), Needleman and Wunsch (1970), and Wilbur and Lipman (1983). The parameter settings used were window size = 20, word length = 1, gap penalty = 2, fast = yes, and density = less.…”
Section: Analysis Of Protein Sequence Similarities (mentioning)
confidence: 99%
“…Pieczenik and Garber (22) have also modified the original algorithm, by treating consecutive potential base pairings as a single unit on the chain. New algorithms have recently been published by Dumas and Ninio (23) and by Goad and Kanehisa (24). They compute either locally optimal structure (24) or overall lowest energy structure for relatively short sequences (23)…”
Section: Discussion (mentioning)
confidence: 99%