Proceedings of the 16th Conference on Computational Linguistics - 1996
DOI: 10.3115/993268.993360

Formal description of multi-word lexemes with the finite-state formalism IDAREX

Abstract: Most multi-word lexemes (MWLs) allow certain types of variation. This has to be taken into account when describing them and recognizing them in texts. We propose describing their syntactic restrictions and idiosyncratic peculiarities with local grammar rules, which also make it possible to express, in a general way, regularities valid for a whole class of MWLs. The local grammars can be written conveniently and compactly as regular expressions in the formalism IDAREX, which uses a two-level mo…

Cited by 10 publications (7 citation statements)
References 4 publications

“…Rules can be as simple as direct matching, but can also use more sophisticated, context-sensitive constraints encoded as finite-state transducers for instance. Historically, rules based on finite-state transducers offered a simple generic framework to deal with variability, discontiguity, and ambiguity (Gross 1989; Breidt, Segond, and Valetto 1996). Identification methods for contiguous MWEs such as open compounds are generally based on dictionaries compiled into finite-state transducers.…”
Section: Rule-based Methods (mentioning)
confidence: 99%
“…Another approach comprises two processing stages: morphological analysis of simple words followed by a composition of regular rules to identify MWEs, as in Oflazer, Çetinoglu, and Say (2004) for Turkish. Breidt, Segond, and Valetto (1996) design regular rules that handle morphological variations and restrictions like the French idiom perdre ADV* :la :tête (lit. lose ADV* :the :head, 'to lose one's mind'), lexical and structural variations (birth date = date of birth).…”
Section: Rule-based Methods (mentioning)
confidence: 99%
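
The rule quoted above, perdre ADV* :la :tête, can be illustrated with a minimal Python sketch. It is not IDAREX and not the authors' implementation. It assumes, for illustration only, that an unmarked word matches any inflected form of a lemma, that a colon-prefixed word matches one fixed surface form, and that the text has already been tokenized and tagged. The token layout, the tag names (`V`, `ADV`, `DET`, `N`), and the helper names `encode` and `matches_idiom` are hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical token representation: (surface form, lemma, part-of-speech tag).
Token = Tuple[str, str, str]

ANY = r"[^|<>]*"  # matches any single field value

# Assumed reading of the quoted rule "perdre ADV* :la :tête".
PERDRE_LA_TETE = re.compile(
    rf"<{ANY}\|perdre\|V>"        # perdre: any inflected form of the lemma
    rf"(?:<{ANY}\|{ANY}\|ADV>)*"  # ADV*: zero or more intervening adverbs
    rf"<la\|{ANY}\|{ANY}>"        # :la   fixed surface form
    rf"<tête\|{ANY}\|{ANY}>"      # :tête fixed surface form
)


def encode(tokens: List[Token]) -> str:
    """Serialise tokens as '<surface|lemma|POS>' so one regex can scan them."""
    return "".join(f"<{s}|{lemma}|{pos}>" for s, lemma, pos in tokens)


def matches_idiom(tokens: List[Token]) -> bool:
    """Return True if the token sequence contains the idiom pattern."""
    return PERDRE_LA_TETE.search(encode(tokens)) is not None
```

Under these assumptions, a tagged version of "Il perd complètement la tête" matches, because the verb is matched by lemma and the adverb is absorbed by ADV*. The plural "Il perd les têtes" does not match, because the article and noun are tied to fixed surface forms.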
“…More elaborate are approaches based on finite-state-related formalisms. They usually indicate the morphological categories and features of individual MWE components, and offer rule-based combinatorial description of their variability patterns (Karttunen et al, 1992; Breidt et al, 1996; Oflazer et al, 2004; Silberztein, 2005; Krstev et al, 2010; Al-Haj et al, 2014; Lobzhanidze, 2017; Czerepowicka and Savary, 2018). They mostly cover continuous (e.g.…”
Section: Lexicons of MWEs (mentioning)
confidence: 99%
“…Any natural language processing (NLP) system needs to address the issue of handling multiword expressions, including Phrasal Verbs (PV) [Sag et al 2002; Breidt et al 1996]. This paper presents a proven approach to identifying English PVs based on pattern matching using a formalism called Expert Lexicon.…”
Section: Introduction (mentioning)
confidence: 99%