Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.232

Supertagging-based Parsing with Linear Context-free Rewriting Systems

Abstract: We present the first supertagging-based parser for linear context-free rewriting systems (LCFRS). It utilizes neural classifiers and outperforms previous LCFRS-based parsers in both accuracy and parsing speed by a wide margin. Our results keep up with the best (general) discontinuous parsers; in particular, the scores for discontinuous constituents establish a new state of the art. The heart of our approach is an efficient lexicalization procedure which induces a lexical LCFRS from any discontinuous treebank. We…

Cited by 5 publications (3 citation statements)
References 27 publications
“…Overall, our approach delivers competitive accuracies, outperforming recent task-specific discontinuous parsers (such as Ruprecht and Mörbitz (2021) on TIGER and DPTB) and excelling on DPTB (where we achieve the best F-score and discontinuous F-score to date). It can also be noticed that the sequence tagging strategy (enhanced with the attention mechanism provided by fully fine-tuning the language model BERT) by , also included in Table 3 for the continuous version, is clearly outperformed on continuous and discontinuous benchmarks by our sequence-to-sequence model, which uses non-fine-tuned word embeddings.…”
Section: Discontinuous Parsing
confidence: 78%
“…For instance, Stanojević and Steedman (2020) and Corro (2020) speed up decoding by not explicitly defining a set of rules and using a span-based scoring algorithm (Stern, Andreas and Klein, 2017a). Additionally, Ruprecht and Mörbitz (2021) present the first supertagging-based parser for LCFRS, which notably reduces parsing time.…”
Section: Discontinuous Constituent Parsing
confidence: 99%
“…Although continuous approaches ignore these linguistic phenomena by, for instance, removing them from the original treebank (a common practice in the Penn Treebank (Marcus et al., 1993)), there exist different algorithms that can handle discontinuous parsing. Currently, we can highlight (1) those based on Linear Context-Free Rewriting Systems (LCFRS) (Vijay-Shanker et al., 1987), which allow exact CKY-style parsing of discontinuous structures at a high computational cost (Gebhardt, 2020; Ruprecht and Mörbitz, 2021); (2) a variant of the former that, while still making use of LCFRS formalisms, increases parsing speed by implementing a span-based scoring algorithm (Stern et al., 2017) and not explicitly defining a set of rules (Stanojević and Steedman, 2020; Corro, 2020); (3) transition-based parsers that deal with discontinuities by adding a specific transition in charge of changing token order (Versley, 2014; Maier, 2015; Maier and Lichte, 2016; Stanojević and Alhama, 2017; Coavoux and Crabbé, 2017) or by designing new data structures that allow interactions between already-created non-adjacent subtrees; and, finally, (4) several approaches that reduce discontinuous constituent parsing to a simpler problem, converting it, for instance, into a non-projective dependency parsing task (Fernández-González and Martins, 2015; Fernández-González and Gómez-Rodríguez, 2020a) or into a sequence labelling problem (Vilares and Gómez-Rodríguez, 2020). In (4), we can also include the solutions proposed by Boyd (2007) and Versley (2016), which transform discontinuous treebanks into continuous variants where discontinuous constituents are encoded by creating additional constituent nodes and extending the original non-terminal label set (following a pseudo-projective technique (Nivre and Nilsson, 2005)); these are then processed by continuous parsing models, with discontinuities recovered in a postprocessing step.…”
Section: Introduction
confidence: 99%
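The survey above hinges on the notion of a discontinuous constituent: a node whose yield covers non-contiguous token positions, which is exactly what LCFRS rules manipulate as tuples of spans. As a rough illustration (a minimal sketch; the function names and the example yield are hypothetical, not taken from any cited parser), the snippet below splits a constituent's token positions into maximal contiguous blocks and computes its fan-out, the quantity that drives the high cost of exact CKY-style LCFRS parsing:

```python
def blocks(positions):
    """Split a set of token positions into maximal contiguous spans."""
    spans, start, prev = [], None, None
    for p in sorted(positions):
        if start is None:
            start = prev = p          # open the first block
        elif p == prev + 1:
            prev = p                  # extend the current block
        else:
            spans.append((start, prev))  # close block at a gap
            start = prev = p
    if start is not None:
        spans.append((start, prev))
    return spans

def fan_out(positions):
    """Number of contiguous blocks the constituent occupies (1 = continuous)."""
    return len(blocks(positions))

# A hypothetical VP whose yield is split around extraposed material:
vp = {0, 1, 4, 5}
print(blocks(vp))   # [(0, 1), (4, 5)]
print(fan_out(vp))  # 2
```

A continuous constituent has fan-out 1 and reduces to ordinary CFG spans; higher fan-outs multiply the number of span boundaries a chart parser must track per item, which is why the approaches in groups (2)-(4) above trade the exact LCFRS chart for faster approximations or reductions.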