Olatz Arregi scite author profile

Olatz Arregi

5 Publications

24 Citation Statements Received

59 Citation Statements Given


Affiliations

Polymat, University of the Basque Country

Publications


Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque

Urbizu, Soraluze, Arregi. 2019.


In this paper, we present a cross-lingual neural coreference resolution system for a less-resourced language such as Basque. To begin with, we build the first neural coreference resolution system for Basque, training it with the relatively small EPEC-KORREF corpus (45,000 words). Next, a cross-lingual coreference resolution system is designed. With this approach, the system learns from a bigger English corpus, using cross-lingual embeddings, to perform the coreference resolution for Basque. The cross-lingual system obtains slightly better results (40.93 F1 CoNLL) than the monolingual system (39.12 F1 CoNLL), without using any Basque language corpus to train it.


Improving mention detection for Basque based on a deep error analysis

Soraluze, Arregi, et al. 2016. Nat. Lang. Eng.


This paper presents the improvement process of a mention detector for Basque. The system is rule-based and takes into account the characteristics of mentions in Basque. A classification of error types is proposed based on the errors that occur during mention detection. A deep error analysis distinguishing error types and causes is presented and improvements are proposed. At the final stage, the system obtains an F-measure of 74.57% under the Exact Matching protocol and of 80.57% under Lenient Matching. We also show the performance of the mention detector with gold standard data as input, in order to omit errors caused by the previous stages of linguistic processing. In this scenario, we obtain an F-measure of 85.89% with Strict Matching and of 89.06% with Lenient Matching, i.e., a difference of 11.32 and 8.49 percentage points, respectively. Finally, how improvements in mention detection affect coreference resolution is analysed.


A multiclass/multilabel document categorization system: Combining multiple classifiers in a reduced dimension

Zelaia, Alegria, Arregi, et al. 2011. Applied Soft Computing.


IXAGroupEHUDiac: A Multiple Approach System towards the Diachronic Evaluation of Texts

Salaberri, Arregi, et al. 2015.


This paper presents our contribution to SemEval-2015 Task 7. The task was divided into three subtasks: automatically identifying the time period when a piece of news was written (subtasks 1 and 2), and automatically determining whether a specific phrase in a sentence is relevant for a given period of time (subtask 3). Our system tackles all three subtasks. To this end, it applies multiple approaches that use resources such as Wikipedia or Google NGrams. Final results are obtained by combining the output from all approaches. The texts used for the task are written in English and range from the years 1700 to 2000.


IXAGroupEHUSpaceEval: (X-Space) A WordNet-based approach towards the Automatic Recognition of Spatial Information following the ISO-Space Annotation Scheme

Salaberri, Arregi, Zapirain. 2015.

