Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.232

Supertagging-based Parsing with Linear Context-free Rewriting Systems

Abstract: We present the first supertagging-based parser for linear context-free rewriting systems (LCFRS). It utilizes neural classifiers and outperforms previous LCFRS-based parsers in both accuracy and parsing speed by a wide margin. Our results keep up with the best (general) discontinuous parsers; in particular, the scores for discontinuous constituents establish a new state of the art. The heart of our approach is an efficient lexicalization procedure which induces a lexical LCFRS from any discontinuous treebank. We…

Cited by 5 publications (3 citation statements)
References 27 publications
“…Overall, our approach delivers competitive accuracies, outperforming recent task-specific discontinuous parsers (such as Ruprecht and Mörbitz (2021) on TIGER and DPTB) and excelling on DPTB (where we achieve the best F-score and discontinuous F-score to date). It can also be noticed that the sequence tagging strategy (enhanced with the attention mechanism provided by fully fine-tuning the language model BERT) by , also included in Table 3 for the continuous version, is clearly outperformed on continuous and discontinuous benchmarks by our sequence-to-sequence model, which uses non-fine-tuned word embeddings.…”
Section: Discontinuous Parsing
confidence: 78%
“…For instance, Stanojević and Steedman (2020) and Corro (2020) speed up decoding by not explicitly defining a set of rules and using a span-based scoring algorithm (Stern, Andreas and Klein, 2017a). Additionally, Ruprecht and Mörbitz (2021) present the first supertagging-based parser for LCFRS, which notably reduces parsing time.…”
Section: Discontinuous Constituent Parsing
confidence: 99%
“…Although continuous approaches ignore these linguistic phenomena by, for instance, removing them from the original treebank (a common practice in the Penn Treebank (Marcus et al., 1993)), there exist different algorithms that can handle discontinuous parsing. Currently, we can highlight (1) those based on Linear Context-Free Rewriting Systems (LCFRS) (Vijay-Shanker et al., 1987), which allow exact CKY-style parsing of discontinuous structures at a high computational cost (Gebhardt, 2020; Ruprecht and Mörbitz, 2021); (2) a variant of the former that, while still making use of LCFRS formalisms, increases parsing speed by implementing a span-based scoring algorithm (Stern et al., 2017) and not explicitly defining a set of rules (Stanojević and Steedman, 2020; Corro, 2020); (3) transition-based parsers that deal with discontinuities by adding a specific transition in charge of changing token order (Versley, 2014; Maier, 2015; Maier and Lichte, 2016; Stanojević and Alhama, 2017; Coavoux and Crabbé, 2017) or by designing new data structures that allow interactions between already-created non-adjacent subtrees; and, finally, (4) several approaches that reduce discontinuous constituent parsing to a simpler problem, converting it, for instance, into a non-projective dependency parsing task (Fernández-González and Martins, 2015; Fernández-González and Gómez-Rodríguez, 2020a) or into a sequence labelling problem (Vilares and Gómez-Rodríguez, 2020). In (4), we can also include the solutions proposed by Boyd (2007) and Versley (2016), which transform discontinuous treebanks into continuous variants where discontinuous constituents are encoded by creating additional constituent nodes and extending the original non-terminal label set (following a pseudo-projective technique (Nivre and Nilsson, 2005)); these are then processed by continuous parsing models, with discontinuities recovered in a postprocessing step.…”
Section: Introduction
confidence: 99%
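The survey above hinges on the notion of a discontinuous constituent: a node whose yield covers non-contiguous token positions, which is exactly what LCFRS rules manipulate as tuples of spans. As a rough illustration (a minimal sketch; the function names and the example yield are hypothetical, not taken from any cited parser), the snippet below splits a constituent's token positions into maximal contiguous blocks and computes its fan-out, the quantity that drives the high cost of exact CKY-style LCFRS parsing:

```python
def blocks(positions):
    """Split a set of token positions into maximal contiguous spans."""
    spans, start, prev = [], None, None
    for p in sorted(positions):
        if start is None:
            start = prev = p          # open the first block
        elif p == prev + 1:
            prev = p                  # extend the current block
        else:
            spans.append((start, prev))  # close block at a gap
            start = prev = p
    if start is not None:
        spans.append((start, prev))
    return spans

def fan_out(positions):
    """Number of contiguous blocks the constituent occupies (1 = continuous)."""
    return len(blocks(positions))

# A hypothetical VP whose yield is split around extraposed material:
vp = {0, 1, 4, 5}
print(blocks(vp))   # [(0, 1), (4, 5)]
print(fan_out(vp))  # 2
```

A continuous constituent has fan-out 1 and reduces to ordinary CFG spans; higher fan-outs multiply the number of span boundaries a chart parser must track per item, which is why the approaches in groups (2)-(4) above trade the exact LCFRS chart for faster approximations or reductions.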