In this article, we focus on the problem of social event extraction from Twitter, in which event detection, i.e., identifying which messages truly mention events of interest, is an indispensable step, since most Twitter messages (tweets) are not related to any real-world event. Existing approaches often use pipelined architectures that rely on hand-crafted features derived with off-the-shelf natural language processing (NLP) tools; such pipelines can propagate errors from the upstream component (event detection) to the downstream one (element extraction) and fail to leverage the interdependencies between the two. To overcome these limitations, we propose a deep neural network-based framework to Jointly Detect and Extract Events from Twitter (JDEET), which learns and performs detection and extraction simultaneously through a joint loss function, a bidirectional long short-term memory (LSTM)-based common representation layer, and a control gate. A conditional random field (CRF) layer is further employed to capture the strong dependencies among output labels. Experimental results on a real-world Twitter dataset show that the proposed approach considerably outperforms state-of-the-art methods.

INDEX TERMS Event extraction, social events, Twitter, joint models, control gate, deep neural networks.
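The two joint-learning devices named in the abstract can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the weight `lam`, the function names, and the toy numbers are all assumptions; the actual model combines BiLSTM and CRF layers.

```python
# Hypothetical sketch of JDEET's two core ideas as described in the abstract:
# (1) a joint loss summing the detection and extraction losses, so both
#     tasks are trained simultaneously, and
# (2) a control gate that scales the shared representation fed to the
#     extraction branch by the detection confidence, suppressing extraction
#     signals for tweets judged unlikely to mention an event.
# All names and the trade-off weight `lam` are illustrative assumptions.

def joint_loss(detection_loss, extraction_loss, lam=1.0):
    """Combine the two task losses into a single training objective."""
    return detection_loss + lam * extraction_loss

def control_gate(shared_repr, event_prob):
    """Scale each dimension of the shared representation by the
    detection probability (a simple multiplicative gate)."""
    return [event_prob * h for h in shared_repr]

# Toy usage: a tweet the detector scores as unlikely to mention an event.
gated = control_gate([0.5, -0.2, 0.8], event_prob=0.1)
total = joint_loss(0.3, 0.7, lam=0.5)
```

Because the extraction branch sees a gated representation and both losses share one objective, errors no longer flow only one way as in a pipeline; both components influence each other during training.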
Social unrest events are common occurrences in modern society and need to be handled proactively. An effective approach is to continuously assess the risk of upcoming social unrest events and predict their likelihood. Our previous work built a hidden Markov model (HMM)-based framework to predict indicators associated with country instability, but it had two shortcomings: it omitted event participants' interactions and learned state residence times only implicitly. Motivated by these limitations, we propose a new prediction framework that uses frequent subgraph patterns and hidden semi-Markov models (HSMMs). A feature called BoEAG (Bag-of-Event-Association-subGraph) is constructed based on frequent subgraph mining and the bag-of-words model. The new framework leverages large-scale historical events captured from GDELT (Global Data on Events, Location, and Tone) to characterize the transitions among the evolutionary stages of social unrest events, uncovering the underlying event-development mechanisms and formulating social unrest event prediction as a sequence classification problem based on Bayes decision. Experimental results on data from five major countries in Southeast Asia demonstrate the effectiveness of the new method, which outperforms the traditional HMM by 5.3% to 16.8% and logistic regression by 11.2% to 43.6%.
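A minimal sketch of the two ideas named in this abstract, under stated assumptions: a BoEAG vector that counts occurrences of mined frequent subgraph patterns, bag-of-words style, and the Bayes decision rule that assigns a sequence to the class whose model scores it highest. The pattern strings and the log-likelihood numbers are made-up placeholders; in the paper the likelihoods would come from trained HSMMs.

```python
# Illustrative sketch (not the paper's code) of BoEAG feature construction
# and Bayes-decision classification. Pattern names and scores are invented.

def boeag_vector(observed_patterns, frequent_patterns):
    """Count how often each mined frequent subgraph pattern occurs,
    analogous to a bag-of-words term-count vector."""
    return [observed_patterns.count(p) for p in frequent_patterns]

def bayes_decide(log_likelihoods, log_priors):
    """argmax over classes c of log P(O | lambda_c) + log P(c)."""
    scores = {c: log_likelihoods[c] + log_priors[c] for c in log_likelihoods}
    return max(scores, key=scores.get)

# Hypothetical frequent subgraph "vocabulary" and one observed event graph.
vocab = ["protest->arrest", "strike->negotiation", "riot->curfew"]
features = boeag_vector(
    ["protest->arrest", "riot->curfew", "protest->arrest"], vocab)

# Placeholder per-class HSMM log-likelihoods and class log-priors.
label = bayes_decide(
    log_likelihoods={"unrest": -12.4, "calm": -15.1},
    log_priors={"unrest": -1.0, "calm": -0.4})
```

Working in log space keeps the decision rule numerically stable, since HSMM sequence likelihoods underflow quickly as sequences grow.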
Large-scale pre-trained language models such as BERT have brought much better performance to text classification. However, their large sizes can make fine-tuning and inference prohibitively slow. To alleviate this, various compression methods have been proposed; however, most of these methods consider only reducing inference time, often ignoring significant increases in training time, and are thus even more resource-consuming. In this article, we focus on lottery ticket extraction for the BERT architecture. Inspired by the observation that representations at lower layers are often more useful for text classification, we propose identifying the winning ticket of BERT for binary text classification through adaptive truncation, i.e., a process that drops the top-k layers of the pre-trained model based on simple, fast computations. In this way, the costs of compression, fine-tuning, and inference can all be vastly reduced. We present experiments on eight mainstream binary text classification datasets covering different input styles (i.e., single-text and text-pair) as well as different typical tasks (e.g., sentiment analysis, acceptability judgement, textual entailment, semantic similarity analysis, and natural language inference). Compared with strong baselines, our method saves 78.1% of time and 31.7% of memory on average, and up to 86.7% and 48% respectively in extreme cases. It also achieves good performance, often outperforming the original language model.
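The truncation step itself can be sketched simply. This is a toy under stated assumptions: the 12-layer stack is represented by a list of names (a real model would hold encoder modules), and how k is chosen via "simple, fast computations" is the paper's contribution and is not reproduced here.

```python
# A minimal sketch of the adaptive-truncation idea from the abstract:
# keep only the bottom layers of a pre-trained encoder, since lower-layer
# representations are often sufficient for binary text classification.
# Layer names and the choice k=8 are illustrative assumptions.

def truncate_top_k(layers, k):
    """Drop the top-k layers, keeping the lower ones."""
    if not 0 <= k < len(layers):
        raise ValueError("k must leave at least one layer")
    return layers[:len(layers) - k]

# Pretend each layer is just a name; a real model would hold modules here.
bert_layers = [f"encoder.layer.{i}" for i in range(12)]
kept = truncate_top_k(bert_layers, k=8)   # keep the bottom 4 layers
```

Because the dropped layers are removed before fine-tuning, the savings apply to training as well as inference, unlike compression methods that only shrink the deployed model.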