The general language model BERT, pre-trained on the cross-domain corpora BookCorpus and Wikipedia, achieves excellent performance on a range of natural language processing tasks by fine-tuning on downstream tasks. However, it still lacks the task-specific and domain-related knowledge needed to further improve performance, and more detailed fine-tuning strategy analyses are necessary. To address these problems, a BERT-based text classification model, BERT4TC, is proposed that constructs an auxiliary sentence to turn the classification task into a binary sentence-pair task, aiming to address the limited-training-data and task-awareness problems. The architecture and implementation details of BERT4TC are presented, along with a post-training approach for addressing the domain challenge of BERT. Finally, extensive experiments are conducted on seven widely studied public datasets to analyze fine-tuning strategies from the perspectives of learning rate, sequence length, and hidden state vector selection. BERT4TC models with different auxiliary sentences and post-training objectives are then compared and analyzed in depth. The experimental results show that BERT4TC with a suitable auxiliary sentence significantly outperforms both typical feature-based and fine-tuning methods, and achieves new state-of-the-art performance on multi-class classification datasets. For binary sentiment classification datasets, BERT4TC post-trained with a suitable domain-related corpus also achieves better results than the original BERT model. INDEX TERMS Natural language processing, text classification, bidirectional encoder representations from transformers, neural networks, language model.
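The auxiliary-sentence idea described in the abstract can be sketched as follows: a k-class example is expanded into k binary sentence-pair examples, one per candidate label. This is a minimal illustration of the general technique; the function name and auxiliary-sentence template are assumptions, not the paper's actual wording.

```python
def build_sentence_pairs(text, labels, gold_label):
    """Expand one k-class example into k binary sentence-pair examples.

    Each pair couples the original text with a hypothetical auxiliary
    sentence naming one candidate label; the binary target is 1 only
    for the gold label.
    """
    pairs = []
    for label in labels:
        auxiliary = f"The category of this text is {label}."
        pairs.append((text, auxiliary, 1 if label == gold_label else 0))
    return pairs

# Illustrative 3-class example -> 3 sentence-pair examples
pairs = build_sentence_pairs(
    "The match ended in a dramatic overtime win.",
    labels=["sports", "politics", "technology"],
    gold_label="sports",
)
```

Each resulting pair would then be fed to BERT as a standard sentence-pair input for binary classification.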
Matching an appropriate response with its multi-turn context is a crucial challenge in retrieval-based chatbots. Current studies construct multiple representations of the context and response to facilitate response selection, but they use these representations in isolation and ignore the relationships among them. To address these problems, we propose a hierarchical aggregation network of multi-representation (HAMR) to fully leverage the abundant representations and enhance valuable information. First, we employ bidirectional recurrent neural networks (BiRNNs) to extract syntactic and semantic representations of sentences and use a self-aggregation mechanism to combine these representations. Second, we design a matching aggregation mechanism for fusing the different kinds of matching information between each utterance in the context and the response, which are generated by an attention mechanism. Treating the candidate response as part of the context, we integrate all of these representations in chronological order and then accumulate the vectors to calculate the final matching degree. An extensive empirical study on two multi-turn response selection datasets indicates that our proposed model achieves new state-of-the-art results.
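The attention-based matching between an utterance and a candidate response mentioned above can be sketched in numpy as a cross-attention step: each utterance token attends over all response tokens. This is a generic sketch of attention matching under assumed dot-product scoring, not HAMR's exact formulation.

```python
import numpy as np

def attention_match(utterance, response):
    """Cross-attention matching (assumed dot-product scoring).

    Each row of `utterance` (m, d) attends over the rows of
    `response` (n, d); returns the attended response vectors (m, d).
    """
    scores = utterance @ response.T                       # (m, n) similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # row-wise softmax
    return weights @ response                             # (m, d) attended reps

# Toy example: 4 utterance tokens, 6 response tokens, dimension 8
rng = np.random.default_rng(0)
utt = rng.standard_normal((4, 8))
resp = rng.standard_normal((6, 8))
matched = attention_match(utt, resp)
```

In a full model, `matched` would be fused with the original utterance representation before the chronological aggregation step.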
Existing feature-based neural approaches for aspect-based sentiment analysis (ABSA) try to improve their performance with pre-trained word embeddings and by modeling the relations between the text sequence and the aspect (or category), and thus depend heavily on the quality of the word embeddings and on task-specific architectures. Although recently pre-trained language models such as BERT and XLNet have achieved state-of-the-art performance on a variety of natural language processing (NLP) tasks, they are still subject to the aspect-specific, local feature-aware, and task-agnostic challenges. To address these challenges, this paper proposes an XLNet and capsule network based model, XLNetCN, for ABSA. XLNetCN first constructs an auxiliary sentence to model the sequence-aspect relation and generate global aspect-specific representations, which enhances aspect-awareness and ensures the full pre-training of XLNet is exploited to improve task-agnostic capability. XLNetCN then employs a capsule network with the dynamic routing algorithm to extract the local and spatial hierarchical relations of the text sequence and yield local feature representations, which are merged with the global aspect-related representations for downstream classification via a softmax classifier. Experimental results show that XLNetCN significantly outperforms classical BERT, XLNet, and traditional feature-based approaches on the two SemEval 2014 benchmark datasets, Laptop and Restaurant, and achieves new state-of-the-art results. INDEX TERMS Aspect-based sentiment analysis, natural language processing, text analysis, deep learning, neural network.
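The dynamic routing step of the capsule network mentioned above can be sketched in numpy. This follows the standard routing-by-agreement procedure with the usual squashing non-linearity; tensor shapes and iteration count are illustrative, not XLNetCN's actual configuration.

```python
import numpy as np

def squash(v, axis=-1):
    """Capsule squashing: keeps direction, maps vector norm into [0, 1)."""
    norm_sq = np.sum(v * v, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + 1e-9)

def dynamic_routing(u_hat, iterations=3):
    """Standard routing-by-agreement over predictions u_hat.

    u_hat: (num_in, num_out, dim) prediction vectors from input capsules.
    Returns output capsule vectors of shape (num_out, dim).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                        # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coeffs
        s = (c[..., None] * u_hat).sum(axis=0)             # (num_out, dim)
        v = squash(s)
        b += np.einsum('iod,od->io', u_hat, v)             # agreement update
    return v

# Toy example: 10 input capsules routed to 3 output capsules of dim 4
rng = np.random.default_rng(1)
u_hat = rng.standard_normal((10, 3, 4))
v = dynamic_routing(u_hat)
```

The squash function guarantees each output capsule's norm stays below 1, which is what lets the norm be read as an existence probability.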
Distantly supervised relation classification aims to identify the relationship between two given entities and plays an essential part in natural language processing (NLP). Although distant supervision can generate labeled data automatically, it faces the problem of noisy data due to wrong labeling. The attention mechanism is one of the most popular methods for reducing the influence of mislabeled data. However, most existing methods treat all relations as independent classes, regardless of the correlations among them. In general, the definitions of relations contain rich semantic information that can improve the performance of a model, especially when classifying long-tail relations that lack training data. Based on this idea, we propose a novel neural network architecture with an attention mechanism. First, we use a bidirectional GRU to encode relation definitions as the context representations of relations. Then, we use a merge attention mechanism to make full use of the hidden states produced by the GRU. To help the model make full use of the context of the entities, we also introduce semantic weights, calculated from the length of the shortest path between the entities and each word in the dependency tree. We conduct experiments on the widely used New York Times relation extraction corpus, and the results demonstrate that our model outperforms most state-of-the-art models.
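The semantic-weight idea above (words closer to an entity in the dependency tree matter more) can be sketched with a BFS over the tree. The inverse-distance weighting formula `1 / (1 + d)` is a hypothetical choice for illustration; the paper's actual weighting function may differ.

```python
from collections import deque

def shortest_path_lengths(edges, source, n):
    """BFS distances from `source` in an undirected graph with n nodes.

    `edges` is a list of (u, v) pairs, e.g. dependency-tree arcs with
    direction ignored. Unreachable nodes get distance None.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = [None] * n
    dist[source] = 0
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dist[v] is None:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def semantic_weights(edges, entity_idx, n):
    """Assumed weighting: words closer to the entity get larger weights."""
    dist = shortest_path_lengths(edges, entity_idx, n)
    return [1.0 / (1 + d) if d is not None else 0.0 for d in dist]

# Toy sentence "John loves Mary": loves(1) is the root,
# with dependents John(0) and Mary(2); entity is John(0).
w = semantic_weights([(1, 0), (1, 2)], entity_idx=0, n=3)
```

These per-word weights would then scale the word representations before they enter the attention layer.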
As a typical unsupervised learning method, the TextRank algorithm performs well for large-scale text mining, especially for automatic summarization and keyword extraction. However, TextRank considers only the similarities between sentences during automatic summarization and neglects information about text structure and context. To overcome these shortcomings, the authors propose an improved, highly scalable method called iTextRank. When building the TextRank graph, the new method computes sentence similarities and adjusts the weights of nodes by considering statistical and linguistic features, such as similarities to the title, paragraph structure, special sentences, and sentence positions and lengths. The authors' analysis shows that the time complexity of iTextRank is comparable to that of TextRank. More importantly, two experiments show that iTextRank achieves higher accuracy and a lower recall rate than TextRank, and that it is as effective as several popular online automatic summarization systems.
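The baseline TextRank computation that iTextRank extends can be sketched as power iteration of PageRank over a sentence-similarity matrix. This is a generic sketch of the standard algorithm; the damping factor and iteration count are the conventional defaults, and the similarity matrix here is a toy input.

```python
import numpy as np

def textrank(similarity, damping=0.85, iters=50):
    """PageRank power iteration over a sentence-similarity matrix.

    similarity: (n, n) symmetric non-negative matrix of pairwise
    sentence similarities (zero diagonal). Returns importance scores.
    """
    n = similarity.shape[0]
    # Normalize rows so each sentence distributes its score to neighbours
    row_sums = similarity.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    transition = similarity / row_sums
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        scores = (1 - damping) / n + damping * transition.T @ scores
    return scores

# Toy 3-sentence document with hand-picked pairwise similarities
sim = np.array([[0.0, 0.5, 0.2],
                [0.5, 0.0, 0.4],
                [0.2, 0.4, 0.0]])
scores = textrank(sim)
```

iTextRank, as described above, would additionally bias these node weights using title similarity, paragraph structure, and sentence position and length.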