N-Gram vs. Keyword-Based Passage Retrieval for Question Answering

Buscaldi, Davide; Gómez, José Manuel; Rosso, Paolo; Sanchís, Emilio

doi:10.1007/978-3-540-74999-8_45

Cited by 5 publications

(3 citation statements)

References 6 publications

(4 reference statements)

“…This PR system uses a weighting scheme based on n-grams density. It was proved in [1] that this approach is more effective in the PR and QA tasks than other commonly used IR systems based on keywords and the well-known TF.IDF weighting scheme. So, JIRS works under the premise that, in a sufficiently large document collection, question n-grams should appear near the answer at least once.…”

Section: The Jirs Passage Retrieval System (mentioning)

confidence: 99%

Voice-QA: Evaluating the Impact of Misrecognized Words on Passage Retrieval

Calvo

Buscaldi

Rosso

2012

Lecture Notes in Computer Science

Self Cite

Abstract. Question Answering is an Information Retrieval task where the query is posed using natural language and the expected result is a concise answer. Voice-activated Question Answering systems represent an interesting application, where the question is formulated by speech. In these systems, an Automatic Speech Recognition module can be used to transcribe the question. Thus, recognition errors may be introduced, producing a significant effect on the answer retrieval process. In this work we study the relationship between some features of misrecognized words and the retrieval results. The features considered are the redundancy of a word in the result set and its inverse document frequency calculated over the collection. The results show that the redundancy of a word may be an important clue on whether an error on it would deteriorate the retrieval results, at least if a closed model is used for speech recognition.
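The two word features named in this abstract can be illustrated with a minimal sketch. The definitions below are assumptions made for illustration, not necessarily the ones used in the paper: inverse document frequency is taken as log(N / df) over the collection, and redundancy as the fraction of retrieved passages that contain the word. The function names, the `tokenize` helper, and the toy data are all hypothetical.

```python
import math

def tokenize(text):
    """Lowercase and split on whitespace; returns the set of distinct words."""
    return set(text.lower().split())

def inverse_document_frequency(word, collection):
    """log(N / df) over the collection; 0.0 for unseen words (a convention chosen here)."""
    df = sum(1 for doc in collection if word in doc)
    return math.log(len(collection) / df) if df else 0.0

def redundancy(word, result_set):
    """Fraction of retrieved passages that contain the word (assumed definition)."""
    if not result_set:
        return 0.0
    return sum(1 for passage in result_set if word in passage) / len(result_set)

# Toy collection; the first two documents play the role of the retrieved result set.
collection = [tokenize(t) for t in [
    "the capital of france is paris",
    "paris hosted the olympic games",
    "rome is the capital of italy",
    "the games were held in rome",
]]
result_set = collection[:2]

for word in ("paris", "capital", "the"):
    print(word, redundancy(word, result_set), inverse_document_frequency(word, collection))
```

In this toy data "paris" occurs in every retrieved passage, so under the assumed definition it is fully redundant in the result set, while "the" occurs in every document and therefore has an inverse document frequency of zero.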

“…Therefore, our system cannot solve anaphoras. We refer the reader to the description in [2] for a detailed description of the base system.…”

Section: Wordnet-based Index Expansion (mentioning)

confidence: 99%

“…Our system is constituted by a modified version of the QUASAR system described in [2]. For this task the search engine (JIRS) has been replaced by Lucene 2 , which can work with multiple indices.…”

Section: Introduction (mentioning)

confidence: 99%

Some Experiments in Question Answering with a Disambiguated Document Collection

Buscaldi

Rosso

2009

Lecture Notes in Computer Science

Self Cite

Abstract. This paper describes our approach to the Question Answering - Word Sense Disambiguation task. This task consists of carrying out Question Answering over a disambiguated document collection. In our approach, disambiguated documents are used to improve the accuracy of the retrieval phase. In order to do this, we added a WordNet-expanded index to the document collection. The expanded index contains synonyms, hypernyms and holonyms of the words already in the documents. Question words are searched for in both the expanded WordNet index and the default index. The obtained results show that the system that exploited disambiguation obtained better precision than the non-WSD one.
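The two-index lookup described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: plain dictionaries stand in for the search-engine indices, the `RELATED` table is a hypothetical hand-written substitute for WordNet lookups on disambiguated words, and ranking by the number of matched question words is an arbitrary choice made here because the abstract does not say how hits from the two indices are combined.

```python
from collections import defaultdict

# Hypothetical stand-in for WordNet: word -> related terms
# (synonyms, hypernyms, holonyms). Not real WordNet data.
RELATED = {
    "car": {"automobile", "vehicle"},  # synonym, hypernym
    "wheel": {"car"},                  # holonym
    "dog": {"canine", "animal"},       # synonym, hypernym
}

def build_indexes(docs):
    """Build a default inverted index and a separate expanded index."""
    default_index = defaultdict(set)
    expanded_index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            default_index[word].add(doc_id)
            # Index the document under terms related to its own words.
            for related in RELATED.get(word, ()):
                expanded_index[related].add(doc_id)
    return default_index, expanded_index

def search(question, default_index, expanded_index):
    """Look up each question word in both indexes; rank by matched words."""
    hits = defaultdict(int)
    for word in question.lower().split():
        matched = default_index.get(word, set()) | expanded_index.get(word, set())
        for doc_id in matched:
            hits[doc_id] += 1
    return sorted(hits, key=lambda d: (-hits[d], d))

docs = {"d1": "the car stopped", "d2": "a dog barked", "d3": "the wheel turned"}
default_index, expanded_index = build_indexes(docs)
print(search("vehicle", default_index, expanded_index))
print(search("animal barked", default_index, expanded_index))
```

The query "vehicle" matches nothing in the default index but reaches the document about a car through the expanded index, which is the effect the expansion is meant to have.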
