A Corpus Balancing Method for Language Model Construction

Villaseñor-Pineda, Luis; Montes-y-Gómez, Manuel; Pérez-Coutiño, Manuel; Vaufreydaz, Dominique

doi:10.1007/3-540-36456-0_40

Cited by 6 publications

(7 citation statements)

References 5 publications


“…This combination is based on the pertinence of the translations to the target document collection. This pertinence, as in the previous method, expresses how a given translation fits in … [Footnote 3: The n-gram model was constructed using the method described in [15].]…”

Section: Methods 2: "Combining Passages From Several Translations" mentioning

confidence: 99%

Enhancing Cross-Language Question Answering by Combining Multiple Question Translations

Aceves-Pérez

Montes-y-Gómez

Villaseñor-Pineda

2007

Computational Linguistics and Intelligent Text Processing

Self Cite


Abstract. One major problem of state-of-the-art Cross Language Question Answering systems is the translation of user questions. This paper proposes combining the potential of multiple translation machines in order to improve the final answering precision. In particular, it presents three different methods for this purpose. The first one focuses on selecting the most fluent translation from a given set; the second one combines the passages recovered by several question translations; finally, the third one constructs a new question reformulation by merging word sequences from different translations. Experimental results demonstrated that the proposed approaches allow reducing the error rates in relation to a monolingual question answering exercise.



“…In particular, in Mexico there have been some interesting efforts related to the use of the web for the automatic construction of domain-specific ontologies [16], training sets for text classification tasks [6,7], and language models for speech recognition [28]. The following sections give a brief overview of these works.…”

Section: Extracting Information From the Web mentioning

confidence: 99%

“…The construction of this corpus is not a simple task since written texts do not represent adequately many phenomena of spontaneous speech. In order to alleviate this problem, [28] proposes the use of web documents as data source. This proposal was based on the fact that many people around the world contribute to create the web, and therefore, that most of its documents comprise informal contents and include many everyday as well as non-grammatical expressions used in spoken language.…”

Section: Tuning Task-specific Language Models Through Web Data mentioning

confidence: 99%


Information Extraction, Search, Interaction and Collaboration on the Web in Mexico

Sánchez

Chávez

Montes

2008

2008 Latin American Web Conference


“…In addition to Keller and Lapata (this issue) and references therein, Volk (2001) gathers lexical statistics for resolving prepositional phrase attachments, and Villasenor-Pineda et al (2003) "balance" their corpus using Web documents.…”

Section: Some Current Themes mentioning

confidence: 99%

Untitled

2003

Computational Linguistics

