“…Past work has proposed incorporating other data structures specialized for hierarchical patterns into neural networks, including context-free grammars (Kim et al., 2019a,b; Kim, 2021), trees (Tai et al., 2015; Zhu et al., 2015; Kim et al., 2017; Choi et al., 2018; Havrylov et al., 2019; Corro and Titov, 2019; Xu et al., 2021), chart parsers (Le and Zuidema, 2015; Maillard et al., 2017; Drozdov et al., 2019; Maveli and Cohen, 2022), and transition-based parsers (Dyer et al., 2015; Bowman et al., 2016; Dyer et al., 2016; Shen et al., 2019a).…”