Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.425

ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations

Abstract: The pre-training of text encoders normally processes text as a sequence of tokens corresponding to small text units, such as word pieces in English and characters in Chinese. It omits information carried by larger text granularity, and thus the encoders cannot easily adapt to certain combinations of characters. This leads to a loss of important semantic information, which is especially problematic for Chinese because the language does not have explicit word boundaries. In this paper, we propose ZEN, a BERT-based…
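A minimal, hypothetical sketch of the kind of n-gram enhancement the abstract describes: n-grams from a lexicon are matched in the input and their representations are combined with the characters they cover. The lexicon, the random embedding tables, and the simple additive fusion are all illustrative assumptions; ZEN itself learns n-gram representations with a dedicated encoder during pre-training, which this sketch does not reproduce.

    import numpy as np

    # Toy, randomly initialised embedding tables -- purely illustrative.
    rng = np.random.default_rng(0)
    dim = 8
    sentence = "提高人民生活水平"
    ngram_lexicon = ["人民", "生活", "水平", "生活水平"]  # assumed pre-extracted n-gram lexicon

    char_emb = {c: rng.normal(size=dim) for c in set(sentence)}
    ngram_emb = {g: rng.normal(size=dim) for g in ngram_lexicon}

    def match_ngrams(text, lexicon):
        # Return (start, end, ngram) spans for every lexicon n-gram occurring in the text.
        spans = []
        for g in lexicon:
            start = text.find(g)
            while start != -1:
                spans.append((start, start + len(g), g))
                start = text.find(g, start + 1)
        return spans

    def enhance(text):
        # Character embeddings, enhanced by adding the representation of every
        # n-gram that covers a given character position.
        reps = np.stack([char_emb[c] for c in text])
        for start, end, g in match_ngrams(text, ngram_lexicon):
            reps[start:end] += ngram_emb[g]
        return reps

    print(enhance(sentence).shape)  # (8, 8): one enhanced vector per character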

Cited by 80 publications (53 citation statements)
References 35 publications
“…Extra knowledge (e.g., pre-trained embeddings (Song et al., 2017; Song and Shi, 2018) and pre-trained models (Devlin et al., 2019; Diao et al., 2019)) can provide useful information and thus enhance model performance for many NLP tasks (Tian et al., 2020a,b,c). Specifically, memory and memory-augmented neural networks (Zeng et al., 2018; Santoro et al., 2018; Diao et al., 2020; Tian et al., 2020d) are another line of related research, which can be traced back to memory networks, which were proposed to leverage extra information for question answering; Sukhbaatar et al. (2015) then improved them with an end-to-end design so that the model can be trained with less supervision.…”
Section: Base+rm+mcln (mentioning)
confidence: 99%
“…In our main experiments, we use two types of embeddings for each language: ELMo (Peters et al., 2018) and BERT-cased large (Devlin et al., 2019) for English, and Tencent Embedding (Song et al., 2018b) and ZEN (Diao et al., 2019) for Chinese. In Table 5, we report the results (F1 scores) of our model with the best setting (i.e.…”
Section: Discussion (mentioning)
confidence: 99%
“…From the results, it is found that our model with AU and GA consistently outperforms the baseline models under different settings of embeddings. In our main experiments, we use ZEN (Diao et al., 2019) instead of BERT (Devlin et al., 2019) as the embedding to represent the input for Chinese. The reason is that ZEN achieves better performance than BERT, which is confirmed by Table 6, whose results (F1 scores) show the performance of our approach with the best settings (i.e.…”
Section: Discussion (mentioning)
confidence: 99%
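As context for the excerpts above, a brief sketch of how a pre-trained Chinese encoder is typically used as an input representation, here with the public bert-base-chinese checkpoint via the HuggingFace Transformers API. ZEN is distributed with its own codebase, so loading it would follow that release instead; the checkpoint name and the downstream use are illustrative assumptions, not the cited papers' exact setups.

    import torch
    from transformers import BertModel, BertTokenizer

    # Public Chinese BERT checkpoint; a stand-in for the ZEN/BERT encoders in the excerpts.
    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    encoder = BertModel.from_pretrained("bert-base-chinese")

    sentence = "提高人民生活水平"
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)

    # (1, sequence_length, 768): one contextual vector per token, consumed as
    # input features by a downstream model such as a sequence labeler.
    print(outputs.last_hidden_state.shape)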
“…In our experiments, we use BERT (Devlin et al., 2019) as the basic encoder for all three languages and use ZEN (Diao et al., 2019) and XLNet-large (Yang et al., 2019) for Chinese and English, respectively. For BERT, ZEN, and XLNet, we use the default hyper-parameter settings.…”
Section: Model Implementation (mentioning)
confidence: 99%
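The last excerpt pairs different pre-trained encoders with different languages and keeps the default hyper-parameters. Below is a hedged sketch of that per-language selection using the Transformers Auto classes and public checkpoints (bert-base-chinese, xlnet-large-cased); the dictionary and function names are assumptions for illustration, and ZEN would again be loaded from its own release rather than this API.

    from transformers import AutoModel, AutoTokenizer

    # Hypothetical per-language choice of pre-trained encoder, mirroring the excerpt:
    # BERT as the basic encoder, XLNet-large for English; ZEN comes from its own package.
    ENCODERS = {
        "zh": "bert-base-chinese",
        "en": "xlnet-large-cased",
    }

    def load_encoder(lang):
        # Default (hyper-)parameters: no config overrides are passed here.
        name = ENCODERS[lang]
        return AutoTokenizer.from_pretrained(name), AutoModel.from_pretrained(name)

    tok, enc = load_encoder("en")
    print(type(enc).__name__)  # XLNetModel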