Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1306
Improving Low-Resource Cross-lingual Document Retrieval by Reranking with Deep Bilingual Representations

Abstract: In this paper, we propose to boost low-resource cross-lingual document retrieval performance with deep bilingual query-document representations. We match queries and documents in both source and target languages with four components, each of which is implemented as a term interaction-based deep neural network with cross-lingual word embeddings as input. By including query likelihood scores as extra features, our model effectively learns to rerank the retrieved documents by using a small number of relevance labe…
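The reranking idea in the abstract — combining scores from several matching components with query likelihood features and sorting candidates by the combined score — can be sketched as follows. This is a minimal illustration, not the paper's architecture: the component scores are stubbed inputs, and the weights (which the paper learns from a small number of relevance labels) are hypothetical.

```python
# Minimal sketch of score-combination reranking: each candidate document
# carries scores from four (stubbed) bilingual matching components plus a
# query-likelihood feature; a linear combination yields one ranking score.

def rerank(candidates, weights, bias=0.0):
    """candidates: list of (doc_id, [s1, s2, s3, s4, ql_score]) pairs.
    Returns doc_ids sorted by the combined score, best first."""
    def combined(features):
        return sum(w * f for w, f in zip(weights, features)) + bias

    return [doc for doc, _ in sorted(candidates,
                                     key=lambda c: combined(c[1]),
                                     reverse=True)]

# Hypothetical example; in the paper these weights would be learned
# from relevance labels rather than set by hand.
weights = [0.3, 0.2, 0.2, 0.1, 0.2]
candidates = [
    ("doc_a", [0.1, 0.2, 0.1, 0.3, 0.2]),
    ("doc_b", [0.9, 0.8, 0.7, 0.6, 0.5]),
]
print(rerank(candidates, weights))  # doc_b outranks doc_a
```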

Cited by 19 publications (16 citation statements); references 38 publications (56 reference statements).
“…This allows the use of different embeddings inside the same model and helps when two languages do not share the same space inside a single model (Cao et al., 2020). For example, Zhang et al. (2019b) used bilingual representations by creating cross-lingual word embeddings from a small set of parallel sentences between the high-resource language English and three low-resource African languages, Swahili, Tagalog, and Somali, to improve document retrieval performance for the African languages.…”
Section: Multilingual Language Models
confidence: 99%
“…POSIT-DRMM (Zhang et al. 2019), a recently proposed cross-lingual document retrieval model designed to address the low-resource issue in CLIR. This model incorporates bilingual representations to capture and aggregate matching signals between an input query in the source language and a document in the target language.…”
Section: Directly CLIR Models
confidence: 99%
“…To evaluate model performance, we follow the conventional settings of related work (Wu et al. 2017; Zhou et al. 2018; Zhang et al. 2019). Specifically, we first calculate the matching scores between a product attribute set and product description candidates, and then rank the matching scores of all candidates to calculate the following automatic metrics: mean reciprocal rank (MRR) (Voorhees et al. 1999) and recall at position k in n candidates (Rn@k).…”
Section: Evaluation Metrics
confidence: 99%
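The two metrics named in the quote above have standard definitions, sketched here for reference. Rankings are lists of candidate ids ordered best-first, and `relevant` holds the gold ids per query; the variable names and toy data are illustrative, not from any cited paper.

```python
# Mean reciprocal rank (MRR) and recall at position k (Rn@k) over a set
# of ranked candidate lists, one per query.

def mrr(rankings, relevant):
    """Mean of 1/rank of the first relevant candidate (0 if none appears)."""
    total = 0.0
    for ranking, gold in zip(rankings, relevant):
        for rank, cand in enumerate(ranking, start=1):
            if cand in gold:
                total += 1.0 / rank
                break
    return total / len(rankings)

def recall_at_k(rankings, relevant, k):
    """Fraction of queries whose top-k list contains a relevant candidate."""
    hits = sum(1 for ranking, gold in zip(rankings, relevant)
               if any(c in gold for c in ranking[:k]))
    return hits / len(rankings)

# Toy example: two queries, three ranked candidates each.
rankings = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]
relevant = [{"d2"}, {"d6"}]
print(mrr(rankings, relevant))             # (1/2 + 1/3) / 2
print(recall_at_k(rankings, relevant, 2))  # 0.5
```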
“…As machine translation has increased the usefulness of CLIR, recently introduced deep neural methods have improved ranking quality [4,29,43,45,47]. By and large, these techniques appear to provide a large jump in the quality of CLIR output.…”
Section: Introduction
confidence: 99%