Kepa Bengoetxea scite author profile

Universal Dependencies

Universal dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are centrally used to explain how predicate–argument structures are encoded morphosyntactically in different languages, while morphological features and part-of-speech classes give the properties of words. We argue that this theory is a good basis for cross-linguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.

Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool

Atutxa

Bengoetxea

Ilarraza

et al. 2019

PLoS ONE

Lately, discourse structure has received considerable attention because of the benefits it offers in several NLP tasks, such as opinion mining, summarization, question answering and text simplification. When automatically analyzing texts, discourse parsers typically perform two different tasks: i) identifying basic discourse units (text segmentation), and ii) linking discourse units by means of discourse relations, building structures such as trees or graphs. The resulting discourse structures are, in general terms, accurate for intra-sentence discourse relations, but they fail to capture the correct inter-sentence relations. Detecting the main discourse unit (the Central Unit) helps discourse analyzers (and also manual annotators) improve their results in rhetorical labeling. Bearing this in mind, we set out to build the first two steps of a discourse parser following a top-down strategy: i) finding discourse units, and ii) detecting the Central Unit. The final step, assigning rhetorical relations, remains to be addressed in the immediate future. In accordance with this strategy, our paper presents a tool consisting of a discourse segmenter and an automatic Central Unit detector.

Multilingual segmentation based on neural networks and pre-trained word embeddings

Iruskieta

Bengoetxea

Salazar

et al. 2019

The DISRPT 2019 workshop organized a shared task on identifying discourse segments across formalisms and languages. Elementary Discourse Units (EDUs) are quite similar across different theories, and segmentation is the very first stage of rhetorical annotation. Still, each annotation project has made decisions that affect not only the annotation of the relational discourse structure but also the segmentation stage. In this shared task, we employed pre-trained word embeddings and neural networks (BiLSTM+CRF) to perform the segmentation. We report F1 results for 6 languages: Basque (0.853), English (0.919), French (0.907), German (0.913), Portuguese (0.926) and Spanish (0.868 and 0.769). Finally, we also carried out an error analysis based on clause typology for Basque and Spanish, in order to understand the performance of the segmenter.
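To make the F1 figures above concrete, here is a minimal sketch of how a boundary-level F1 score for segmentation can be computed. It assumes that gold and predicted segmentations are given as sets of token positions where a segment starts; this representation, the function name `boundary_f1` and the toy data are illustrative assumptions, not the shared task's official scorer.

```python
def boundary_f1(gold, predicted):
    """F1 over segment-start positions (illustrative, not the official scorer)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: token indices at which a discourse segment begins.
gold_starts = {0, 5, 9, 14}
predicted_starts = {0, 5, 11, 14}
score = boundary_f1(gold_starts, predicted_starts)
print(f"F1 = {score:.3f}")
```

In this toy case three of the four predicted boundaries match the gold ones, so precision and recall are both 0.75.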

On WordNet Semantic Classes and Dependency Parsing

Bengoetxea

Agirre

Nivre

et al. 2014

This paper presents experiments with WordNet semantic classes to improve dependency parsing. We study the effect of semantic classes in three dependency parsers, using two types of constituency-to-dependency conversions of the English Penn Treebank. Overall, we can say that the improvements are small and not significant using automatic POS tags, contrary to previously published results using gold POS tags (Agirre et al., 2011). In addition, we explore parser combinations, showing that the semantically enhanced parsers yield a small significant gain only on the more semantically oriented LTH treebank conversion.

Dependentzia Unibertsalen eredura egokitutako euskarazko zuhaitz-bankua (The Basque treebank adapted to the Universal Dependencies model)

Aranzabe

Atutxa

Bengoetxea

et al. 2019

EKAIA

The goal of the Universal Dependencies project, which belongs to the field of language processing, is to adapt the dependency-based treebanks created for many languages to a single standard annotation scheme. This article presents the Basque treebank that has been automatically converted to that model. It also describes how the conversion was carried out and, on that basis, explains the similarities and differences between the original treebank and the treebank adapted to the Universal Dependencies model.

Euskararako analizatzaile sintaktiko estatistikoa hobetzeko teknikak (Techniques for improving the statistical syntactic parser for Basque)

Bengoetxea

Gojenola

2016

EKAIA

Abstract: This paper presents a set of experiments aimed at improving the results of statistical syntactic parsers for Basque. The work examines three techniques: i) tree transformations, ii) parser stacking, and iii) combining the outputs of several parser models. All results were obtained both with gold morphosyntactic tags taken directly from the treebank and with automatic morphosyntactic tags produced by the morphological analysis and disambiguation modules.

Keywords: dependency-based parsing, morphological analysis and disambiguation, combination of syntactic parsers.

Application of feature propagation to dependency parsing

Bengoetxea

Gojenola

2009

This paper presents a set of experiments on parsing the Basque Dependency Treebank. We have applied feature propagation to dependency parsing, experimenting with the propagation of several morphosyntactic feature values. In the experiments we have used the output of one parser to enrich the input of a second parser. Both parsers have been generated by MaltParser, a freely available data-driven dependency parser generator. The transformations, combined with the pseudoprojective graph transformation, obtain a LAS of 77.12%, improving on the best reported results for Basque.
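For readers unfamiliar with the metric, the labeled attachment score (LAS) quoted above is the percentage of tokens whose predicted head and dependency label both match the gold annotation. The sketch below is a minimal illustration under stated assumptions: the function name `attachment_scores`, the `(head, label)` tuple representation and the toy labels are hypothetical, and details such as whether punctuation is scored vary between evaluations.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages for token-aligned (head, label) pairs."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and the same length")
    total = len(gold)
    # UAS: only the head index has to match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS: both the head index and the dependency label have to match.
    full_hits = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * head_hits / total, 100.0 * full_hits / total


# Toy four-token sentence; head 0 stands for the artificial root.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}%, LAS = {las:.2f}%")
```

Here the third token has the right head but the wrong label, and the fourth has the wrong head, so the unlabeled score is higher than the labeled one.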

Vision Applications in the Fishing and Fish Product Industries

Arnarson

Bengoetxea

Pau

2017
