2003
DOI: 10.1007/s10590-004-2480-9

Design, Implementation and Evaluation of an Inflectional Morphology Finite State Transducer for Irish

Abstract: Minority languages must endeavour to keep up with and avail of language technology advances if they are to prosper in the modern world. Finite state technology is mature, stable and robust. It is scalable and has been applied successfully in many areas of linguistic processing, notably in phonology, morphology and syntax. In this paper, the design, implementation and evaluation of a morphological analyser and generator for Irish using finite state transducers is described. In order to produce a high-quality li…

Citations: cited by 4 publications (6 citation statements)
References: 12 publications (1 reference statement)
“…Compared to other EU-official languages, Irish language technology is under-resourced, as highlighted by a recent study (Judge et al, 2012). In the area of morpho-syntactic processing, recent years have seen the development of a part-of-speech tagger (Uí Dhonnchadha and van Genabith, 2006), a morphological analyser (Uí Dhonnchadha et al, 2003), a shallow chunker (Uí Dhonnchadha, 2009), a dependency treebank (Lynn et al, 2012a; Lynn et al, 2012b) and statistical dependency parsing models for MaltParser (Nivre et al, 2006) and Mate parser (Bohnet, 2010) trained on this treebank (Lynn et al, 2013).…”
Section: Irish Language and Treebank (mentioning)
confidence: 99%
“…Considerable efforts have been made over the past decade to develop natural language processing resources for the Irish language (Uí Dhonnchadha et al, 2003; Uí Dhonnchadha and van Genabith, 2006; Uí Dhonnchadha, 2009; Lynn et al, 2012a; Lynn et al, 2012b; Lynn et al, 2013). One such resource is the Irish Dependency Treebank (Lynn et al, 2012a) which contains just over 1000 gold standard dependency parse trees.…”
Section: Introduction (mentioning)
confidence: 99%
“…(Mel'čuk, 1973). In the instantiated version of the pipeline presented in this paper, the input structured data is the WebNLG data (Aquilina et al, 2023), made of DBpedia triple sets, and we use the FORGe grammar-based generator to produce the intermediate representations (Mille et al, 2019) and the Irish NLP toolkit (Dhonnchadha et al, 2003) to produce the final representation: details about the dataset and tools are provided in Section 3.…”
Section: Modular Structure (mentioning)
confidence: 99%
“…• A tagset for Irish had been developed within the PAROLE project, by members of the NCI team (http://www.ite.ie/corpus/pos.htm)
• A pilot finite-state tokenizer and morphological transducer for Irish inflectional morphology had been developed (Uí Dhonnchadha, 2002; Uí Dhonnchadha, Nic Pháidín, & Van Genabith, 2003).
• We established that a constraint based tagger was available to us…”
Section: Irish Linguistic Tools (mentioning)
confidence: 99%
“…As newspaper and web texts in particular contain a high proportion of proper nouns, lists of names and places were also scanned and incorporated into the lexicon (Uí Dhonnchadha et al, 2003). Average recognition rates increased to 95% on unrestricted text.…”
Section: Tokenization and Morphological Analysis (mentioning)
confidence: 99%