2000
DOI: 10.1007/978-94-017-2535-4_7

Bracketing and aligning words and constituents in parallel text using Stochastic Inversion Transduction Grammars

Abstract: We introduce (1) a novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and (2) the concept of bilingual parsing with a variety of parallel corpus analysis applications. Aside from the bilingual orientation, three major features distinguish the formalism from the finite-state transducers more traditionally found in computational linguistics: it skips directly to a context-free rather than finite-state base, it permits a minimal extra degree of ord…

Citations: cited by 6 publications (5 citation statements)
References: 26 publications
“…In its operation, the string-to-string aligner is very similar to ITG (Wu, 2000), however its goal is the generation of a parallel treebank, rather than the induction of a bilingual grammar.…”
Section: Other Alignment Modules (mentioning)
confidence: 99%
“…This procedure is facilitated by the use of parallel-aligned corpora, which allow a comparison between the LUs when they are embedded in different types of context (see, e.g. Wu 2000, Salkie 2002). Consider, for example, the verb answer, whose individual frame elements may be realized syntactically in many different ways.…”
Section: Linking Parallel Lexicon Fragments Via Semantic Frames (mentioning)
confidence: 99%
“…A first step is to use parallel corpora to automatically identify translation equivalents in context in order to determine frame membership of lexical units across languages. For approaches incorporating automatic acquisition of lexical information from parallel corpora see Wu (2000), Farwell et al (2004), Green et al (2004), and Mitamura et al (2004).…”
Section: Linking Parallel Lexicon Fragments Via Semantic Frames (mentioning)
confidence: 99%
“…There are basically two kinds of systems working at these segmentation levels: the most widespread rely on statistical models, in particular the IBM ones (Brown et al, 1993); others combine simpler association measures with different kinds of linguistic information (Arhenberg et al, 2000; Barbu, 2004). Mainly dedicated to machine translation, purely statistical systems have gradually been enriched with syntactic knowledge (Wu, 2000; Yamada & Knight, 2001; Ding et al, 2003; Lin & Cherry, 2003). As pointed out in these studies, the introduction of linguistic knowledge leads to a significant improvement in alignment quality.…”
Section: Introduction (mentioning)
confidence: 99%