Furu Wei scite author profile

Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification

In this paper, we present a method that learns word embeddings for Twitter sentiment classification. Most existing algorithms for learning continuous word representations model only the syntactic context of words and ignore the sentiment of the text. This is problematic for sentiment analysis because such algorithms usually map words with similar syntactic context but opposite sentiment polarity, such as good and bad, to neighboring word vectors. We address this issue by learning sentiment-specific word embedding (SSWE), which encodes sentiment information in the continuous representation of words. Specifically, we develop three neural networks that incorporate supervision from the sentiment polarity of text (e.g., sentences or tweets) in their loss functions. To obtain large-scale training corpora, we learn the sentiment-specific word embedding from massive distant-supervised tweets collected using positive and negative emoticons. Experiments applying SSWE to a benchmark Twitter sentiment classification dataset from SemEval 2013 show that (1) the SSWE feature performs comparably with hand-crafted features in the top-performing system; and (2) performance improves further when SSWE is concatenated with the existing feature set.
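As a rough illustration of how sentiment polarity can enter a loss function, the sketch below mixes a context hinge loss with a sentiment hinge loss. The function names, the scalar scores, and the `alpha` weighting are assumptions made for illustration; the abstract does not give the exact form of the three networks' losses.

```python
def hinge(margin_term: float) -> float:
    """Standard hinge with margin 1: max(0, 1 - margin_term)."""
    return max(0.0, 1.0 - margin_term)


def combined_loss(ctx_true: float, ctx_corrupt: float,
                  sent_true: float, sent_corrupt: float,
                  polarity: int, alpha: float = 0.5) -> float:
    """Weighted sum of a context loss and a sentiment loss (hypothetical form).

    ctx_*    : context scores for the true and the corrupted n-gram
    sent_*   : sentiment scores for the true and the corrupted n-gram
    polarity : +1 for a positive tweet, -1 for a negative tweet
    alpha    : assumed mixing weight between the two losses
    """
    # The true n-gram should outscore the corrupted one on the context score.
    loss_context = hinge(ctx_true - ctx_corrupt)
    # The sentiment score gap should point in the direction of the gold polarity.
    loss_sentiment = hinge(polarity * (sent_true - sent_corrupt))
    return alpha * loss_context + (1.0 - alpha) * loss_sentiment
```

With `alpha = 1.0` the sketch reduces to a purely context-based objective, which is the setting the abstract describes as ignoring sentiment.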


BEiT: BERT Pre-Training of Image Transformers

Bao, Dong, Wei. 2021. Preprint.

227

588


We introduce a self-supervised vision representation model, BEiT, which stands for Bidirectional Encoder representation from Image Transformers. Following BERT (Devlin et al., 2019), developed in the natural language processing area, we propose a masked image modeling task to pretrain vision Transformers. Specifically, each image has two views in our pre-training, i.e., image patches (such as 16 × 16 pixels) and visual tokens (i.e., discrete tokens). We first "tokenize" the original image into visual tokens. Then we randomly mask some image patches and feed them into the backbone Transformer. The pre-training objective is to recover the original visual tokens based on the corrupted image patches. After pre-training BEiT, we directly fine-tune the model parameters on downstream tasks by appending task layers upon the pretrained encoder. Experimental results on image classification and semantic segmentation show that our model achieves competitive results with previous pre-training methods. For example, base-size BEiT achieves 83.2% top-1 accuracy on ImageNet-1K, significantly outperforming from-scratch DeiT training (81.8%; Touvron et al., 2020) with the same setup. Moreover, large-size BEiT obtains 86.3% using only ImageNet-1K, even outperforming ViT-L with supervised pre-training on ImageNet-22K (85.2%; Dosovitskiy et al., 2020). The code and pretrained models are available at https://aka.ms/beit.
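The masked image modeling objective described above can be sketched in a few lines. This is a minimal sketch, not the released implementation: the helper names, the 0.4 mask ratio, and the tiny token vocabulary used in the usage lines are assumptions, and the Transformer itself is replaced by a list of precomputed logits.

```python
import math
import random


def patchify(height: int, width: int, patch: int = 16) -> int:
    """Number of non-overlapping patch x patch patches in an image."""
    return (height // patch) * (width // patch)


def sample_mask(num_patches: int, ratio: float, rng: random.Random) -> set:
    """Choose a random subset of patch positions to mask (ratio is assumed)."""
    k = int(num_patches * ratio)
    return set(rng.sample(range(num_patches), k))


def masked_token_loss(logits, visual_tokens, masked) -> float:
    """Mean cross-entropy over masked positions only.

    logits[i] holds scores over the visual-token vocabulary for patch i;
    visual_tokens[i] is the tokenizer's discrete token id for patch i.
    """
    total = 0.0
    for i in masked:
        top = max(logits[i])
        # Numerically stable log-sum-exp over the vocabulary.
        log_z = top + math.log(sum(math.exp(s - top) for s in logits[i]))
        total += log_z - logits[i][visual_tokens[i]]
    return total / len(masked)


# Usage: a 224 x 224 image, 16 x 16 patches, uniform logits over 8 tokens.
num_patches = patchify(224, 224)
masked = sample_mask(num_patches, 0.4, random.Random(0))
logits = [[0.0] * 8 for _ in range(num_patches)]
tokens = [0] * num_patches
loss = masked_token_loss(logits, tokens, masked)
```

Only masked positions contribute to the loss, which mirrors the stated objective of recovering the original visual tokens from the corrupted patches.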


Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification

Dong, Wei, Tan, et al. 2014

798

520


We propose the Adaptive Recursive Neural Network (AdaRNN) for target-dependent Twitter sentiment classification. AdaRNN adaptively propagates the sentiments of words to the target depending on the context and the syntactic relationships between them. It consists of multiple composition functions, and we model the adaptive sentiment propagation as a distribution over these composition functions. Experimental studies show that AdaRNN improves over the baseline methods. Furthermore, we introduce a manually annotated dataset for target-dependent Twitter sentiment analysis.
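To make the idea of a distribution over composition functions concrete, here is a small sketch that mixes several composition functions with softmax weights before applying a nonlinearity. The names `adaptive_compose` and `selector_scores`, the elementwise composition functions in the usage lines, and the choice of `tanh` are assumptions for illustration; in the paper the weights would be produced by the model from the child vectors and context.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_compose(left, right, compositions, selector_scores):
    """Combine two child vectors with a softmax-weighted mix of compositions.

    compositions    : list of functions g(left, right) -> vector
    selector_scores : one unnormalized score per composition function
    """
    weights = softmax(selector_scores)
    mixed = [0.0] * len(left)
    for weight, compose in zip(weights, compositions):
        out = compose(left, right)
        for d, value in enumerate(out):
            mixed[d] += weight * value
    # Nonlinearity applied to the mixture (tanh is an assumed choice).
    return [math.tanh(x) for x in mixed]


# Usage with two toy composition functions.
def add(u, v):
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


parent = adaptive_compose([0.5, -1.0], [0.2, 0.3], [add, sub], [0.0, 0.0])
```

With equal selector scores the two toy functions are averaged; skewing the scores lets one composition dominate, which is the adaptive behavior the abstract refers to.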


Gated Self-Matching Networks for Reading Comprehension and Question Answering

et al. 2017


In this paper, we present gated self-matching networks for reading-comprehension-style question answering, which aims to answer questions from a given passage. We first match the question and passage with gated attention-based recurrent networks to obtain the question-aware passage representation. Then we propose a self-matching attention mechanism to refine the representation by matching the passage against itself, which effectively encodes information from the whole passage. Finally, we employ pointer networks to locate the positions of answers in the passage. We conduct extensive experiments on the SQuAD dataset. The single model achieves 71.3% exact match on the hidden test set, while the ensemble model further boosts the result to 75.9%. At the time of submission of the paper, our model holds first place on the SQuAD leaderboard for both the single and ensemble models.
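The self-matching step can be pictured with a deliberately simplified sketch: each passage position attends over the whole passage, and the result is gated before being passed on. Dot-product attention and a scalar sigmoid gate are stand-ins chosen for brevity, not the learned attention and gate of the paper, and the function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_match(passage):
    """Match a passage against itself.

    passage: list of equal-length vectors, one per passage word
    returns: one gated [word; context] vector per passage word
    """
    matched = []
    for v in passage:
        # Attention of this word over every word in the passage.
        weights = softmax([dot(v, u) for u in passage])
        context = [sum(w * u[d] for w, u in zip(weights, passage))
                   for d in range(len(v))]
        # Scalar sigmoid gate (assumption; the paper's gate is learned).
        gate = 1.0 / (1.0 + math.exp(-dot(v, context)))
        matched.append([gate * x for x in v + context])
    return matched


# Usage: three 2-dimensional passage vectors.
refined = self_match([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each output vector has twice the input dimension because the word vector and its passage-wide context are concatenated before gating.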


LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Cui

et al. 2020

368

301


Automatic information extraction from identity documents is a fundamental task in digital processes such as onboarding, requesting products, and identity validation, among others. The information extraction process consists of identifying, locating, classifying, and recognizing the text of the key fields that an identity document contains. In the case of identity documents, key fields include names, last names, document number, and dates, among others. The information extraction problem has traditionally been solved using rule-based algorithms and classic OCR engines. In the last few years there have been implementations based on machine learning models, using NLP (natural language processing) and CV (computer vision) to solve the problem in a more flexible and efficient way (Subramani et al., 2020). This work proposes to solve the problem of information extraction with an object detection approach. An object detection model based on transformers (Carion et al., 2020) was implemented, trained, and evaluated. A solution with above 95% accuracy in detecting key fields on identification documents was achieved.


Question Answering over Freebase with Multi-Column Convolutional Neural Networks

et al. 2015


Answering natural language questions over a knowledge base is an important and challenging task. Most existing systems rely on hand-crafted features and rules to conduct question understanding and/or answer ranking. In this paper, we introduce multi-column convolutional neural networks (MCCNNs) to understand questions from three different aspects (namely, answer path, answer context, and answer type) and learn their distributed representations. Meanwhile, we jointly learn low-dimensional embeddings of entities and relations in the knowledge base. Question-answer pairs are used to train the model to rank candidate answers. We also leverage question paraphrases to train the column networks in a multi-task learning manner. We use FREEBASE as the knowledge base and conduct extensive experiments on the WEBQUESTIONS dataset. Experimental results show that our method achieves better or comparable performance compared with baseline systems. In addition, we develop a method to compute the salience scores of question words in different column networks. The results help us intuitively understand what MCCNNs learn.
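One way to picture ranking candidate answers from three aspects is to sum one similarity per aspect and train with a margin ranking loss. The sketch below assumes dot-product similarity, precomputed vectors in place of the column networks, and a margin of 0.5; all names and values are illustrative rather than taken from the paper.

```python
ASPECTS = ("path", "context", "type")


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def score(question_views, answer_views):
    """Sum of per-aspect similarities between a question and a candidate answer.

    Both arguments map each aspect name to a vector; in a real model the
    question vectors would come from the three column networks.
    """
    return sum(dot(question_views[a], answer_views[a]) for a in ASPECTS)


def ranking_loss(correct_score, wrong_score, margin=0.5):
    """Margin ranking loss: the correct answer should win by at least `margin`."""
    return max(0.0, margin - correct_score + wrong_score)


# Usage with toy 2-dimensional vectors.
question = {"path": [1.0, 0.0], "context": [0.0, 1.0], "type": [1.0, 1.0]}
good = {"path": [1.0, 0.0], "context": [0.0, 1.0], "type": [0.5, 0.5]}
bad = {"path": [0.0, 1.0], "context": [1.0, 0.0], "type": [0.0, 0.0]}
loss = ranking_loss(score(question, good), score(question, bad))
```

The loss is zero once the correct answer outscores the wrong one by the margin, so training pressure only comes from candidate pairs that are still ranked too closely or in the wrong order.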


Neural Question Generation from Text: A Preliminary Study

et al. 2018


Automatic question generation aims to generate questions from a text passage such that the generated questions can be answered by certain sub-spans of the given passage. Traditional methods mainly use rigid heuristic rules to transform a sentence into related questions. In this work, we propose to apply the neural encoder-decoder model to generate meaningful and diverse questions from natural language sentences. The encoder reads the input text and the answer position to produce an answer-aware input representation, which is fed to the decoder to generate an answer-focused question. We conduct a preliminary study on neural question generation from text with the SQuAD dataset, and the experimental results show that our method can produce fluent and diverse questions.
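A simple way to expose the answer position to an encoder is to tag each input token as beginning, inside, or outside the answer span. The sketch below assumes a B/I/O scheme and a half-open token span; the function name and the example sentence are hypothetical, and the abstract itself does not say how the position is encoded.

```python
def answer_position_tags(tokens, answer_start, answer_end):
    """Tag tokens relative to the answer span [answer_start, answer_end).

    Returns "B" for the first answer token, "I" for the remaining answer
    tokens, and "O" for every token outside the answer.
    """
    tags = []
    for i in range(len(tokens)):
        if i == answer_start:
            tags.append("B")
        elif answer_start < i < answer_end:
            tags.append("I")
        else:
            tags.append("O")
    return tags


# Usage: the answer "the SQuAD dataset" covers tokens 5, 6, and 7.
tokens = "the model was trained on the SQuAD dataset".split()
tags = answer_position_tags(tokens, 5, 8)
```

Each tag would typically be embedded and concatenated with the corresponding word embedding, giving the encoder an answer-aware view of the sentence.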


