We can also conclude that language models with richer structural knowledge better encode intrinsic language patterns, which is consistent with prior studies (Kim et al., 2019b; Drozdov et al., 2019). We also compare constituency parsing against state-of-the-art structure-aware models, including 1) the recurrence-based models described in §2: PRPN (Shen et al., 2018a), On-LSTM (Shen et al., 2018b), URNNG (Kim et al., 2019b), DIORA (Drozdov et al., 2019), and PCFG (Kim et al., 2019a); and 2) Transformer-based methods: Tree+Trm, RvTrm (Ahmed et al., 2019), PI+TrmXL, and the BERT model initialized with pre-trained weights. As shown in Table 2, all the structure-aware models achieve better parsing results than the non-structured models.
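The unsupervised parsing literature cited here (e.g., PRPN, On-LSTM, DIORA) conventionally reports unlabeled bracketing F1 against gold treebank constituents. As a minimal sketch of that metric, assuming parses are represented as sets of (start, end) token-index spans (the function and variable names below are illustrative, not from the paper):

```python
def unlabeled_f1(pred_spans, gold_spans):
    """Unlabeled bracketing F1 between predicted and gold constituent spans.

    Spans are (start, end) token-index pairs; constituent labels are
    ignored, as is standard in unsupervised constituency parsing evaluation.
    """
    pred, gold = set(pred_spans), set(gold_spans)
    if not pred or not gold:
        return 0.0
    overlap = len(pred & gold)          # spans present in both parses
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a 6-token sentence, spans as half-open intervals.
gold = [(0, 6), (0, 2), (2, 6), (3, 6), (4, 6)]
pred = [(0, 6), (0, 2), (3, 6), (4, 6), (2, 4)]
print(round(unlabeled_f1(pred, gold), 3))  # 0.8 (4 of 5 spans match on each side)
```

Scores in Table 2 would then be sentence-level (or corpus-level) aggregates of this quantity, which is why models whose induced trees align more closely with gold bracketings score higher regardless of labels.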