Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1017

Adversarial Transfer Learning for Chinese Named Entity Recognition with Self-Attention Mechanism

Abstract: Named entity recognition (NER) is an important task in natural language processing, which needs to determine entity boundaries and classify entities into pre-defined categories. For the Chinese NER task, only a very small amount of annotated data is available. The Chinese NER task and the Chinese word segmentation (CWS) task share many similar word boundaries, but each task also has its own specificities. However, existing methods for Chinese NER either do not exploit word boundary information from CWS or cannot filter the task-specific information of CWS.
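
The abstract describes an adversarial transfer learning framework in which a task discriminator pushes a shared encoder to keep only the boundary features that NER and CWS have in common. As a rough illustration of how such an adversarial objective is commonly wired up (the paper's exact min-max formulation may differ), here is a minimal PyTorch sketch using a gradient-reversal layer; all names (GradReverse, TaskDiscriminator, hidden_dim) are hypothetical, not the authors' code:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients in the
    backward pass, the usual idiom for adversarial feature alignment."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class TaskDiscriminator(nn.Module):
    """Predicts whether a shared sentence representation came from the NER
    corpus or the CWS corpus; the reversed gradients train the shared encoder
    to fool it, so task-specific cues are filtered out of the shared space."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, 2)  # 2 sources: NER vs. CWS

    def forward(self, shared_repr: torch.Tensor, lambd: float = 1.0):
        reversed_repr = GradReverse.apply(shared_repr, lambd)
        return self.classifier(reversed_repr)
```

In training, the discriminator's cross-entropy loss would be added to the NER and CWS task losses; the reversed gradients discourage the shared encoder from encoding features that give away which corpus a sentence came from.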

Cited by 170 publications (138 citation statements). References 24 publications.
“…Zhou et al. [27] formulate NER as a joint identification task to recognize entity-level features, which effectively improves performance. Cao et al. [2] also use information from CWS for NER. Zhang et al. [24] and Ding et al. [5] add additional features, and the latter achieves a 94.4% F1-score.…”
Section: Comparison With Previous Work
Mentioning confidence: 99%
“…In order to take advantage of both character-level semantic information and word-structure content, some models mix word embeddings with their corresponding character vectors and then feed the mixed representation into a neural network for NER [2,22,26]. The generic model mentioned above is shown in Fig.…”
Section: Introduction
Mentioning confidence: 99%
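
The excerpt above describes the common pattern of mixing word embeddings with character vectors before the encoder. Below is a minimal sketch of that concatenation step, assuming per-character inputs paired with the id of the word containing each character; the class name and dimensions are illustrative, not taken from the cited models:

```python
import torch
import torch.nn as nn

class MixedEmbedding(nn.Module):
    """Concatenate each character's embedding with the embedding of the word
    it belongs to, so every position carries both character-level semantics
    and word-structure content."""
    def __init__(self, n_chars: int, n_words: int,
                 char_dim: int = 100, word_dim: int = 50):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.word_emb = nn.Embedding(n_words, word_dim)

    def forward(self, char_ids: torch.Tensor, word_ids: torch.Tensor) -> torch.Tensor:
        # char_ids, word_ids: (batch, seq_len); word_ids[b][j] is the id of
        # the word containing character j. Output: (batch, seq_len, char_dim + word_dim)
        return torch.cat([self.char_emb(char_ids), self.word_emb(word_ids)], dim=-1)
```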
“…Existing state-of-the-art systems include Peng and Dredze (2016), He and Sun (2017b), Cao et al. (2018) and Zhang and Yang (2018), which leverage rich external data like cross-domain data, semi-supervised data, and lexicons, or jointly train NER and Chinese Word Segmentation (CWS). In the first block of Table 2, we report the performance of the latest models.…”
Section: Weibo Dataset
Mentioning confidence: 99%
“…This method allows the model to dynamically decide which source of information to use for each word, and therefore outperforms the concatenation method used in previous work. More recently, Tan et al. (2018b) and Cao et al. (2018) employ self-attention to directly capture the global dependencies of the inputs for NER tasks and demonstrate the effectiveness of self-attention in Chinese NER.…”
Section: Attention Mechanism
Mentioning confidence: 99%
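
The excerpt above credits self-attention with directly capturing global dependencies of the inputs. Here is a stripped-down sketch of scaled dot-product self-attention that uses the hidden states themselves as queries, keys, and values; real implementations, including the multi-head variants these papers build on, add learned Q/K/V projections:

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product self-attention over a sequence of hidden states.
    x: (batch, seq_len, d). Every position attends to every other position,
    which is what lets self-attention capture global dependencies."""
    d = x.size(-1)
    scores = torch.matmul(x, x.transpose(-2, -1)) / math.sqrt(d)  # (batch, L, L)
    weights = F.softmax(scores, dim=-1)   # attention distribution per position
    return torch.matmul(weights, x)       # weighted sum of value vectors
```

Because the attention weights connect every position to every other position in a single step, long-range dependencies do not have to pass through the sequential bottleneck of a recurrent encoder.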
“…The use of the attention mechanism by [20] achieved great performance in machine translation, setting off a new wave in the NLP field. The utilization of the attention mechanism by [21] reached a 90.64% F1-score on the SIGHAN data set.…”
Section: Introduction
Mentioning confidence: 97%