Itziar Aduriz scite author profile

Abstract. This article presents a robust syntactic analyser for Basque and the modules it contains. Each module is structured in analysis layers, and each layer takes the information provided by the previous layer as its input, creating a gradually deeper syntactic analysis in cascade. The analysis is carried out using the Constraint Grammar (CG) formalism. The article also describes how the parsing formats were standardised using XML.
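The cascade idea in this abstract, where each layer consumes the output of the previous one, can be illustrated with a minimal Python sketch. This is not Constraint Grammar and not the authors' analyser: the three layers, the toy lexicon, the tag names, and the English example sentence are all assumptions made purely for illustration.

```python
from functools import reduce
from typing import Callable, Dict, List

Token = Dict[str, str]
Layer = Callable[[List[Token]], List[Token]]


def tag_pos(tokens: List[Token]) -> List[Token]:
    """Hypothetical layer 1: attach a coarse part-of-speech tag from a toy lexicon."""
    lexicon = {"the": "DET", "dog": "NOUN", "barks": "VERB"}
    return [{**t, "pos": lexicon.get(t["form"], "UNK")} for t in tokens]


def mark_chunks(tokens: List[Token]) -> List[Token]:
    """Hypothetical layer 2: derive a chunk label from the 'pos' added by layer 1."""
    chunk_of = {"DET": "NP", "NOUN": "NP", "VERB": "VP"}
    return [{**t, "chunk": chunk_of.get(t["pos"], "O")} for t in tokens]


def assign_functions(tokens: List[Token]) -> List[Token]:
    """Hypothetical layer 3: derive a syntactic function from layers 1 and 2."""
    out = []
    for i, t in enumerate(tokens):
        verb_follows = any(u["chunk"] == "VP" for u in tokens[i + 1:])
        if t["pos"] == "NOUN" and t["chunk"] == "NP" and verb_follows:
            func = "SUBJ"
        elif t["chunk"] == "VP":
            func = "MAIN"
        else:
            func = "-"
        out.append({**t, "func": func})
    return out


def run_cascade(tokens: List[Token], layers: List[Layer]) -> List[Token]:
    """Feed each layer the output of the previous one."""
    return reduce(lambda acc, layer: layer(acc), layers, tokens)


LAYERS: List[Layer] = [tag_pos, mark_chunks, assign_functions]


def analyse(sentence: str) -> List[Token]:
    tokens = [{"form": w} for w in sentence.lower().split()]
    return run_cascade(tokens, LAYERS)
```

Calling `analyse("The dog barks")` returns one dictionary per token, each carrying the `pos`, `chunk`, and `func` keys added by successive layers. Reordering `LAYERS` would break the sketch, since later layers read keys written by earlier ones; that dependency is the point of a cascade.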

Coreferential Relations in Basque: The Annotation Process

Ceberio, Aduriz, Ilarraza et al. 2018. J Psycholinguist Res

In this paper we present the coreferential tagging of part of the EPEC Corpus of Basque. Although coreference is a pragmatic linguistic phenomenon highly dependent on the situational context, it shows language-specific patterns that vary according to the features of each language. Because Basque is not an Indo-European language, its grammar differs considerably from that of the languages spoken in surrounding areas. We explain these features and the decisions made in each case. After describing the criteria defined for coreferential tagging in Basque, we explain the annotation process. Our annotation is based on a morphologically and syntactically annotated corpus, which provides a manageable environment in which the specific structures that are part of a reference chain can be identified more easily. A part of the corpus was tagged by two annotators who marked up the same text independently, and by a third annotator who acted as judge, resolving cases of disagreement. This whole process has been automated as a result of previous studies carried out in this field. The automatic detection of mentions (Soraluze et al., in: Proceedings of Konvens, 2012) has provided us with a better working environment and made it possible to build a first significant corpus for later computational treatment of automatic coreference resolution.
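The two-annotators-plus-judge workflow described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' tooling: it assumes each mention carries a single chain label, and the function names, mention identifiers, and the `judge` callback are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Label = str
Judge = Callable[[str, Label, Label], Label]


def adjudicate(
    first: Dict[str, Label],
    second: Dict[str, Label],
    judge: Judge,
) -> Tuple[Dict[str, Label], List[str]]:
    """Merge two independent annotations; the judge decides only on disagreements."""
    if first.keys() != second.keys():
        raise ValueError("both annotators must label the same mentions")
    merged: Dict[str, Label] = {}
    disputed: List[str] = []
    for mention in first:
        a, b = first[mention], second[mention]
        if a == b:
            merged[mention] = a
        else:
            disputed.append(mention)
            merged[mention] = judge(mention, a, b)
    return merged, disputed


def observed_agreement(first: Dict[str, Label], second: Dict[str, Label]) -> float:
    """Fraction of mentions on which both annotators chose the same label."""
    if not first:
        raise ValueError("no mentions to compare")
    same = sum(first[m] == second[m] for m in first)
    return same / len(first)
```

For example, with `ann_a = {"m1": "c1", "m2": "c1", "m3": "c2"}` and `ann_b = {"m1": "c1", "m2": "c2", "m3": "c2"}`, only `m2` is passed to the judge. Raw observed agreement is used here for brevity; a chance-corrected measure would be the usual choice in a real study.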

Rule-Based Translation of Spanish Verb-Noun Combinations into Basque

Iñurrieta, Aduriz, Ilarraza et al. 2017

This paper presents a method to improve the translation of Verb-Noun Combinations (VNCs) in a rule-based Machine Translation (MT) system for Spanish-Basque. Linguistic information about a set of VNCs is gathered from the public database Konbitzul and integrated into the MT system, leading to an improvement in BLEU, NIST and TER scores, as well as to significantly better results according to human evaluators.

A spelling corrector for Basque based on morphology

Aduriz, Urkia, Alegria et al. 1997. Literary and Linguistic Computing

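Testu-corpusen informazio morfosintaktikoaren etiketatze automatikoa hizkuntz ezagutzan oinarrituz: zenbait arazo, hainbat erronka [Automatic tagging of morphosyntactic information in text corpora based on linguistic knowledge: some problems, several challenges]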

Aduriz, Arriola 2020

A word-grammar based morphological analyzer for agglutinative languages

Aduriz, Agirre, Aldezabal et al. 2000

Agglutinative languages present rich morphology and for some applications they need deep analysis at word level. The work here presented proposes a model for designing a full morphological analyzer. The model integrates the two-level formalism and a unification-based formalism. In contrast to other works, we propose to separate the treatment of sequential and non-sequential morphotactic constraints. Sequential constraints are applied in the segmentation phase, and non-sequential ones in the final feature-combination phase. Early application of sequential morphotactic constraints during the segmentation process makes feasible an efficient implementation of the full morphological analyzer. The result of this research has been the design and implementation of a full morphosyntactic analysis procedure for each word in unrestricted Basque texts.
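The separation this abstract describes, sequential constraints applied during segmentation and non-sequential ones in a final feature-combination step, can be sketched in Python under strong simplifying assumptions. The lexicon below uses invented morphemes (not Basque or any real language), the continuation classes and feature names are hypothetical, and the sketch omits the two-level phonological rules entirely; it is not the authors' implementation.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Features = Dict[str, str]

# Invented morphemes: form -> (morpheme class, features). Purely illustrative.
LEXICON: Dict[str, Tuple[str, Features]] = {
    "mi": ("PREFIX", {"mood": "real"}),
    "tal": ("STEM", {"cat": "V"}),
    "ko": ("SUFFIX", {"mood": "real", "tense": "past"}),
    "ne": ("SUFFIX", {"mood": "irreal"}),
}

# Sequential constraints: which morpheme class may follow which.
CONTINUATIONS: Dict[str, Set[str]] = {
    "START": {"PREFIX", "STEM"},
    "PREFIX": {"STEM"},
    "STEM": {"SUFFIX", "END"},
    "SUFFIX": {"END"},
}


def segment(word: str, state: str = "START") -> Iterator[List[str]]:
    """Phase 1: enumerate segmentations, pruning with sequential constraints."""
    if not word:
        if "END" in CONTINUATIONS[state]:
            yield []
        return
    for form, (cls, _) in LEXICON.items():
        if word.startswith(form) and cls in CONTINUATIONS[state]:
            for rest in segment(word[len(form):], cls):
                yield [form] + rest


def unify(morphemes: List[str]) -> Optional[Features]:
    """Phase 2: combine features; any clash, adjacent or not, rejects the analysis."""
    merged: Features = {}
    for m in morphemes:
        for key, value in LEXICON[m][1].items():
            if merged.setdefault(key, value) != value:
                return None
    return merged


def analyse(word: str) -> List[Tuple[List[str], Features]]:
    """Run both phases and keep only segmentations whose features unify."""
    results = []
    for seg in segment(word):
        feats = unify(seg)
        if feats is not None:
            results.append((seg, feats))
    return results
```

In this toy setup `analyse("mitalko")` yields one analysis, while `analyse("mitalne")` yields none: the segmentation passes the sequential check, but the `mood` values of the non-adjacent prefix and suffix clash in the unification step. A string such as `"komi"` is already rejected in phase 1, because a suffix cannot open a word, so no feature combination is attempted for it.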

Designing spelling correctors for inflected languages using lexical transducers

Aldezabal, Alegria, Ansa et al. 1999


Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for automatic processing

A Cascaded Syntactic Analyser for Basque
