SMT and Hybrid systems of the QTLeap project in the WMT16 IT-task

This paper describes 12 systems submitted to the WMT16 IT-task, covering six languages: Basque, Bulgarian, Dutch, Czech, Portuguese and Spanish. All of these systems were developed within the QTLeap project and follow a common strategy. For each language, two systems were submitted: a phrase-based MT system built using Moses, and a system exploiting deep language engineering approaches, which for all languages except Bulgarian was implemented using TectoMT. For 4 of the 6 languages, the TectoMT-based system performs better than the Moses-based one.


LX-LR4DistSemEval: a collection of language resources for the evaluation of distributional semantic models of Portuguese

Querido

Carvalho

Rodrigues

et al. 2017

RAPL


In this paper we describe a collection of publicly available data sets for Portuguese that are suitable for the evaluation of distributional semantics models in lexical similarity tasks and in conceptual categorization tasks. These data sets were adapted from English gold-standard test sets, allowing any Portuguese distributional semantics model to be evaluated and also to be compared to mainstream results that have been obtained for this language. We also present an online service that showcases some functionalities of the distributional semantics models.


Domain-Specific Hybrid Machine Translation from English to Portuguese

Rodrigues

Gomes

Neale

et al. 2016


Machine translation (MT) from English to Portuguese has not typically received much attention in existing research. In this paper, we focus on MT from English to Portuguese for the specific domain of information technology (IT). We built a small in-domain parallel corpus to address the lack of IT-specific, publicly available parallel corpora, and then adapted an existing hybrid MT system to the new language pair (English to Portuguese). We further improved the initial version of the EN-PT hybrid system by adding various modules that address its most frequently occurring errors. To assess the improvement achieved by each of these dedicated modules, we compared all versions of our MT system automatically. In addition, we report a detailed error analysis of the initial and final versions of our system.


Named Entities in the QTLeap Corpus of Online Helpdesk Interactions

Querido

Carvalho

Rodrigues

et al. 2016

RAPL


In this paper we present the annotation of a corpus with named entities that are classified into semantic types and disambiguated by linking them to their corresponding entries in the Portuguese DBpedia. This corpus, the QTLeap Corpus, is a multilingual collection of question-and-answer pairs from a chat-based helpdesk service for Information and Communication Technologies. The resulting annotated corpus is a gold-standard lexical resource annotated with named entities, useful for supporting the training and evaluation of named entity annotation and disambiguation tools for Portuguese.



Andreia Querido

SMT and Hybrid systems of the QTLeap project in the WMT16 IT-task
