Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Com 2009
DOI: 10.3115/1620932.1620940

Multiple word alignment with profile hidden Markov models

Abstract: Profile hidden Markov models (Profile HMMs) are specific types of hidden Markov models used in biological sequence analysis. We propose the use of Profile HMMs for word-related tasks. We test their applicability to the tasks of multiple cognate alignment and cognate set matching, and find that they work well in general for both tasks. On the latter task, the Profile HMM method outperforms average and minimum edit distance. Given the success for these two tasks, we further discuss the potential applications of …
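The abstract compares the Profile HMM method against average and minimum edit distance on cognate set matching. As a rough illustration of what such baselines could look like, the sketch below scores a word against each candidate set with plain Levenshtein distance and picks the lowest-scoring set. This is an assumption about the baseline setup, not the authors' code: the function names (`edit_distance`, `average_distance`, `minimum_distance`, `best_set`) and the toy word lists are hypothetical.

```python
from statistics import mean


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs for insert, delete, substitute."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # cost of deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def average_distance(word, cognate_set):
    """Mean edit distance from word to every member of the set."""
    return mean(edit_distance(word, w) for w in cognate_set)


def minimum_distance(word, cognate_set):
    """Edit distance from word to its closest member of the set."""
    return min(edit_distance(word, w) for w in cognate_set)


def best_set(word, candidate_sets, score):
    """Return the name of the candidate set with the lowest score."""
    return min(candidate_sets, key=lambda name: score(word, candidate_sets[name]))


# Toy data, invented for illustration only (not from the paper).
candidate_sets = {
    "night": ["night", "nacht", "natt"],
    "water": ["water", "wasser", "vatten"],
}
print(best_set("nat", candidate_sets, average_distance))
print(best_set("nat", candidate_sets, minimum_distance))
```

Passing the scoring function as an argument keeps the two baselines interchangeable, so a different scorer can be swapped in without touching the matching logic.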

Cited by 20 publications (11 citation statements)
References 6 publications
“…Since strings in biology and linguistics are substantially different, with the former being long and comprising a small alphabet and the latter being very short but containing a large number of different phones, algorithms used in biology cannot be directly applied to linguistic data. In the past decade, several approaches to automatic multiple string alignment have been developed for linguistic data (Bhargava and Kondrak, 2009; Prokić, 2010; List, 2012).…”
Section: Automatic Alignment Of Phonetic Data (mentioning)
confidence: 99%
“…For the purpose of joint generation, we need to align triples S, P and T prior to training. The alignment of multiple strings is a challenging problem (Bhargava and Kondrak, 2009). In general, there is no obvious way of merging three pairwise alignments.…”
Section: Multi-alignment (mentioning)
confidence: 99%
“…When visiting the Google search page, we may get 50 bursts, while we may obtain 200 bursts when viewing the CNN front page. If a profile HMM is to be constructed from a set of unaligned sequences, there are many local optima, and it is therefore easy for the constructing algorithm to get stuck around one of these [30]. So, for each application type, we should align its training samples to the same length.…”
Section: G He Et Al (mentioning)
confidence: 99%