2013
DOI: 10.1145/2483969.2483970

Combining compound recognition and PCFG-LA parsing with word lattices and conditional random fields

Abstract: The integration of compounds in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly preidentified. This article evaluates two empirical strategies to incorporate such multiword units in a real PCFG-LA parsing context: (1) the use of a grammar including compound recognition, thanks to specialized annotation schemes for compounds; (2) the use of a state-of-the-art discriminative compound prerecognizer integrating endogenous and exogenous feat…

Cited by 13 publications (7 citation statements)
References 18 publications
“…In sequence tagging MWEI methods, such resources can be used as sources of lexical features (Schneider et al, 2014). In parsing-based approaches they may serve as a basis for word-lattice representation of an input sentence, in which the compositional vs. MWE interpretation of a word sequence is represented jointly (Constant et al, 2013). The impact of lexical resources on MWEI is explicitly addressed by Riedl and Biemann (2016).…”
Section: MWE Lexicons in MWE Identification
Citation type: mentioning
confidence: 99%
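
The word-lattice idea mentioned in this statement can be illustrated with a minimal sketch. It is not the implementation of Constant et al (2013): the function names `build_lattice` and `lattice_paths`, the toy lexicon, and the underscore-joined MWE labels are all assumptions made for illustration. The lattice is a directed acyclic graph over token positions in which each word contributes one arc and each lexicon match contributes an extra arc spanning several words, so both readings of the sequence are kept side by side.

```python
from typing import Dict, List, Set, Tuple

Lattice = Dict[int, List[Tuple[int, str]]]


def build_lattice(tokens: List[str], lexicon: Set[Tuple[str, ...]]) -> Lattice:
    """Build a lattice: edges[i] lists (j, label) arcs from position i to j.

    Every token gets a single-word arc (the compositional reading); every
    contiguous token sequence found in the (hypothetical) lexicon gets one
    additional arc covering the whole sequence (the MWE reading).
    """
    n = len(tokens)
    edges: Lattice = {i: [(i + 1, tokens[i])] for i in range(n)}
    for i in range(n):
        for j in range(i + 2, n + 1):
            if tuple(tokens[i:j]) in lexicon:
                edges[i].append((j, "_".join(tokens[i:j])))
    return edges


def lattice_paths(edges: Lattice, start: int, end: int) -> List[List[str]]:
    """Enumerate every segmentation of the sentence encoded by the lattice."""
    if start == end:
        return [[]]
    result = []
    for j, label in edges[start]:
        for rest in lattice_paths(edges, j, end):
            result.append([label] + rest)
    return result


tokens = ["a", "hot", "dog", "stand"]
lattice = build_lattice(tokens, {("hot", "dog")})
for path in lattice_paths(lattice, 0, len(tokens)):
    print(path)
```

A parser working over such a lattice would score the competing paths rather than enumerate them; the enumeration here only makes the two readings visible.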
“…For a more complete survey on phraseology discovery, the different proposed methods and their performances, we refer to Evert (2004); Pecina (2008); Manning and Schütze (1999); McKeown and Radev (1999); Baldwin and Kim (2010); Seretan (2011); Ramisch (2015). In addition to monolingual discovery, other tasks have also been investigated in computational linguistics, such as bilingual phraseology discovery (Ha et al, 2008; Morin and Daille, 2010; Weller and Heid, 2012; Rivera et al, 2013), automatic interpretation and disambiguation of multiword expressions (Fazly et al, 2009) and their integration into applications such as parsing (Constant et al, 2013) and machine translation (Carpuat and Diab, 2010). For further reading, we recommend the proceedings of the annual workshop on multiword expressions (Markantonatou et al, 2017), as well as journal special issues on the topic (Villavicencio et al, 2005; Rayson et al, 2010; Bond et al, 2013; Ramisch et al, 2013).…”
Section: Computational Phraseology Discovery
Citation type: mentioning
confidence: 99%
“…Rule-based matching, supervised classification, sequence tagging, and parsing are among the most popular models for MWE identification (Constant et al, 2017). Parsing-based methods take the (recursive) structure of language into account, trying to identify MWEs as a by-product of parsing (Constant et al, 2013), or jointly (Constant and Nivre, 2016). Sequence tagging models, on the other hand, consider only linear context, using models such as CRFs (Vincze et al, 2011; Shigeto et al, 2013; Riedl and Biemann, 2016) and averaged perceptron (Schneider et al, 2014) combined with some variant of begin-inside-outside (BIO) encoding (Ramshaw and Marcus, 1995).…”
Section: Related Work
Citation type: mentioning
confidence: 99%
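
The begin-inside-outside (BIO) encoding named in this statement can be sketched as follows. This is a minimal illustration of the plain three-tag scheme for contiguous expressions, not the variant used by any of the cited systems; the function name `bio_encode` and the half-open `(start, end)` span format are assumptions.

```python
from typing import List, Tuple


def bio_encode(tokens: List[str], mwe_spans: List[Tuple[int, int]]) -> List[str]:
    """Tag each token as B (begins an MWE), I (inside an MWE), or O (outside).

    mwe_spans holds half-open (start, end) token index pairs for contiguous,
    non-overlapping MWEs; this input format is an assumption of the sketch.
    """
    tags = ["O"] * len(tokens)
    for start, end in mwe_spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


tokens = ["He", "kicked", "the", "bucket", "yesterday"]
tags = bio_encode(tokens, [(1, 4)])
print(list(zip(tokens, tags)))
```

A sequence tagger such as a CRF or an averaged perceptron is then trained to predict one of these tags per token, which turns MWE identification into a per-token labelling problem over linear context.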
“…For many years, MWE identification was considered unrealistic, with most MWE research focusing on out-of-context MWE discovery (Ramisch et al, 2013). Indeed, the availability of MWE-annotated corpora was limited to some treebanks with partial annotations, often a by-product of syntax trees (Constant et al, 2013). This prevented the widespread development and evaluation of MWE identification systems, as compared to other tasks such as POS tagging and named entity recognition.…”
Section: Introduction
Citation type: mentioning
confidence: 99%