Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.392
Unsupervised Parsing with S-DIORA: Single Tree Encoding for Deep Inside-Outside Recursive Autoencoders

Abstract: The deep inside-outside recursive autoencoder (DIORA; Drozdov et al., 2019a) is a self-supervised neural model that learns to induce syntactic tree structures for input sentences without access to labeled training data. In this paper, we discover that while DIORA exhaustively encodes all possible binary trees of a sentence with a soft dynamic program, its vector averaging approach is locally greedy and cannot recover from errors when computing the highest-scoring parse tree in bottom-up chart parsing. To fix this…
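
A minimal sketch may help make the chart-parsing setup concrete. The code below runs a CKY-style bottom-up search over every binary bracketing of a sentence, which is the exhaustive encoding the abstract describes; the span scorer is a random stand-in, not DIORA's learned composition and compatibility functions:

    # Hedged sketch: exhaustive bottom-up chart parsing over all binary trees.
    # span_score is a hypothetical placeholder, NOT DIORA's learned model.
    import random

    def best_binary_tree(words, span_score):
        n = len(words)
        # chart[(i, j)] holds (best score, best tree) for the span words[i:j]
        chart = {(i, i + 1): (0.0, words[i]) for i in range(n)}
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                best = None
                for k in range(i + 1, j):  # try every split point of [i, j)
                    left_score, left_tree = chart[(i, k)]
                    right_score, right_tree = chart[(k, j)]
                    score = left_score + right_score + span_score(i, k, j)
                    if best is None or score > best[0]:
                        best = (score, (left_tree, right_tree))
                chart[(i, j)] = best
        return chart[(0, n)]

    random.seed(0)
    score, tree = best_binary_tree("the cat sat on the mat".split(),
                                   lambda i, k, j: random.random())
    print(tree)

Because the chart keeps only one (score, tree) pair per span, a model that composes each cell's vector greedily cannot revisit a low-level decision later, which is the failure mode the abstract attributes to DIORA's vector averaging.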

Cited by 28 publications (31 citation statements). References 50 publications.
“…We compared examples of trees inferred by our model with the corresponding ground-truth constituency trees (see Appendix), encountering reasonable structures that differ from the constituent structure posited by the manually defined gold trees. Experimental results of previous work (Drozdov et al., 2020; Kim et al., 2019a) also show significant variance across random seeds. Thus, we hypothesize that an isomorphy-focused F1 evaluation with respect to gold constituency trees is insufficient to evaluate how reasonable the induced structures are.…”
Section: Dependency Tree Compatibility
confidence: 79%
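For readers unfamiliar with the metric this statement questions, the following is a hedged sketch of unlabeled bracket F1 between an induced tree and a gold tree, with trees as nested tuples of words; conventions vary in practice (e.g., whether trivial single-word and whole-sentence spans are discarded):

    # Sketch of unlabeled bracket F1; the span conventions are an assumption.
    def spans(tree, start=0):
        if isinstance(tree, str):          # leaf: contributes no bracket
            return set(), 1
        out, width = set(), 0
        for child in tree:
            child_spans, w = spans(child, start + width)
            out |= child_spans
            width += w
        out.add((start, start + width))    # bracket covering this subtree
        return out, width

    def bracket_f1(pred_tree, gold_tree):
        pred, _ = spans(pred_tree)
        gold, _ = spans(gold_tree)
        overlap = len(pred & gold)
        prec, rec = overlap / len(pred), overlap / len(gold)
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    pred = ((("the", "cat"), "sat"), ("on", ("the", "mat")))
    gold = (("the", "cat"), ("sat", ("on", ("the", "mat"))))
    print(round(bracket_f1(pred, gold), 3))  # 0.8

The statement's point is that two trees can overlap poorly on spans, and hence score low F1, while both being linguistically defensible analyses.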
“…Analysis. In order to better understand why our model performs better when evaluated on word-piece-level gold trees, we compute the recall of constituents following Kim et al. (2019b) and Drozdov et al. (2020). Besides standard constituents, we also compare the recall of word-piece chunks and proper-noun chunks.…”
Section: Results
confidence: 99%
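The recall analysis referenced here can be pictured as follows; this is a hypothetical sketch of per-label constituent recall (the fraction of gold spans with each label that also appear in the unlabeled induced tree), not the exact evaluation code of Kim et al. (2019b) or Drozdov et al. (2020):

    # Sketch: recall of gold constituents, broken down by gold label.
    from collections import Counter

    def label_recall(pred_spans, gold_labeled_spans):
        hits, totals = Counter(), Counter()
        for (i, j), label in gold_labeled_spans:
            totals[label] += 1
            if (i, j) in pred_spans:
                hits[label] += 1
        return {label: hits[label] / totals[label] for label in totals}

    pred = {(0, 2), (0, 3), (3, 6), (0, 6)}                  # induced spans
    gold = [((0, 2), "NP"), ((3, 6), "PP"), ((2, 6), "VP")]  # gold spans
    print(label_recall(pred, gold))  # {'NP': 1.0, 'PP': 1.0, 'VP': 0.0}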
“…Grammar induction using neural networks: There is a recent resurgence of interest in unsupervised constituency parsing, mostly driven by neural-network-based methods (Shen et al., 2018a, 2019; Drozdov et al., 2019, 2020; Kim et al., 2019a,b; Jin et al., 2019; Zhu et al., 2020). These methods can be categorized into two major groups: those built on top of a generative grammar and those without a grammar component.…”
Section: Related Work
confidence: 99%
“…Unsupervised parsing (or grammar induction) trains syntax-dependent models to produce syntactic trees of natural language expressions without direct syntactic annotation (Klein and Manning, 2002; Bod, 2006; Ponvert et al., 2011; Pate and Johnson, 2016; Shen et al., 2018; Kim et al., 2019; Drozdov et al., 2020). Compared to them, our model learns both syntax and semantics jointly.…”
Section: Unsupervised Parsing
confidence: 99%