2021
DOI: 10.48550/arxiv.2104.07204
Preprint

Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models

Abstract: Chinese pre-trained language models usually process text as a sequence of characters while ignoring coarser granularities, e.g., words. In this work, we propose a novel pre-training paradigm for Chinese, Lattice-BERT, which explicitly incorporates word representations along with characters and thus can model a sentence in a multi-granularity manner. Specifically, we construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers. We design a lattice positio…
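The abstract's core mechanism, characters and lexicon words entering the model side by side as lattice units, can be illustrated with a short sketch. The toy LEXICON, the build_lattice helper, and the example sentence below are illustrative assumptions, not the authors' implementation or tokenizer.

```python
# Illustrative sketch: build the multi-granularity "lattice" units the
# abstract describes -- every character is one unit, and every lexicon word
# matching a span of the sentence is an additional unit with its span.
# LEXICON, build_lattice, and the example sentence are assumptions here.

from typing import List, Tuple

LEXICON = {"研究", "研究生", "生命", "起源"}  # toy word vocabulary


def build_lattice(sentence: str, lexicon=LEXICON,
                  max_word_len: int = 4) -> List[Tuple[str, int, int]]:
    """Return lattice units as (text, start, end), with end exclusive."""
    units = [(ch, i, i + 1) for i, ch in enumerate(sentence)]  # character units
    for start in range(len(sentence)):                          # word units
        for end in range(start + 2, min(start + max_word_len, len(sentence)) + 1):
            if sentence[start:end] in lexicon:
                units.append((sentence[start:end], start, end))
    return units


if __name__ == "__main__":
    for unit in build_lattice("研究生命的起源"):
        print(unit)
    # Characters plus overlapping words (研究, 研究生, 生命, 起源) all become
    # input units; their spans would drive the lattice position encoding.
```

Each resulting unit carries the character span it covers; those spans are what a lattice position encoding would act on when all the units are fed jointly into the transformer.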

Cited by 4 publications (7 citation statements)
References 23 publications (16 reference statements)
“…Existing Chinese BERT models that incorporate word information can be divided into two categories. The first category uses word information in the pretraining stage but represents a text as a sequence of characters when the pretrained model is applied to downstream tasks (Cui et al., 2019a; Lai et al., 2021). The second category uses word information when the pretrained model is used in downstream tasks (Su, 2020; Guo et al., 2021).…”
Section: Related Work (mentioning)
confidence: 99%
“…• Lattice BERT: Lai et al. (2021) uses lexicons to enhance the character-level encodings (left side of the encoder in Figure 3(b)). It uses the parallel structure in the transformers to discriminate characters and additional lexicons.…”
Section: Appendix A, Word-level Chinese BERT Models (mentioning)
confidence: 99%
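As a rough illustration of how character units and additional lexicon-word units might sit side by side in one transformer input, here is a hedged PyTorch sketch; the span-sum position scheme, the LatticeInput module, and all ids and sizes are assumptions for illustration, not Lattice-BERT's actual lattice position attention.

```python
# Hedged sketch (not the paper's code): one transformer encoder consumes both
# character units and lexicon-word units, each carrying the character span it
# covers. The span-sum position scheme and all sizes below are assumptions.
import torch
import torch.nn as nn


class LatticeInput(nn.Module):
    def __init__(self, vocab_size: int, max_len: int, d_model: int = 128):
        super().__init__()
        self.unit_emb = nn.Embedding(vocab_size, d_model)
        self.start_emb = nn.Embedding(max_len, d_model)  # span start position
        self.end_emb = nn.Embedding(max_len, d_model)    # span end position

    def forward(self, unit_ids, starts, ends):
        # Summing unit, start, and end embeddings is one simple way to let
        # self-attention see where each unit sits in the lattice.
        return self.unit_emb(unit_ids) + self.start_emb(starts) + self.end_emb(ends)


embed = LatticeInput(vocab_size=1000, max_len=32)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True),
    num_layers=2,
)

# Three character units plus one word unit covering characters 0-1.
unit_ids = torch.tensor([[11, 12, 13, 57]])
starts = torch.tensor([[0, 1, 2, 0]])
ends = torch.tensor([[0, 1, 2, 1]])  # inclusive end index of each unit's span
out = encoder(embed(unit_ids, starts, ends))  # shape: (1, 4, 128)
```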
“…Language features are considered in more recent works. For example, AMBERT (Zhang and Li, 2020) and Lattice-BERT (Lai et al., 2021) both take word information into consideration. Chinese-BERT (Sun et al., 2021) utilizes the pinyin and glyphs of characters.…”
Section: Related Work (mentioning)
confidence: 99%