Learning in non-stationary environments is not an easy task and requires a distinctive approach. The learning model must be able not only to learn continuously, but also to acquire new concepts and forget old ones. Additionally, given the significant importance that social networks have gained as information networks, there is an ever-growing interest in extracting complex information for trend detection, service promotion, or market sensing. The dynamic nature of these networks tends to limit the performance of traditional static learning models, so dynamic learning strategies must be put forward. In this paper we present a learning strategy for handling drift in the occurrence of concepts in Twitter. We propose three different models: a time-window model, an ensemble-based model and an incremental model. Since little is known about the types of drift that can occur in Twitter, we simulate different types of drift by artificially timestamping real Twitter messages in order to evaluate and validate our strategy. Results so far are encouraging regarding learning in the presence of drift, along with classifying messages in Twitter streams.
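To make the time-window idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes messages arrive in batches, one batch per time step, and that hashtags serve as class labels. The class name `TimeWindowClassifier`, the `window_size` parameter, and the simple word-vote scoring are all hypothetical choices made for illustration. Only the most recent batches are kept, so concepts seen in older batches are forgotten as new ones arrive.

```python
from collections import Counter, defaultdict, deque


class TimeWindowClassifier:
    """Toy sliding-window text classifier (hypothetical, for illustration only)."""

    def __init__(self, window_size=2):
        # Keep only the most recent `window_size` batches; older ones are dropped.
        self.window = deque(maxlen=window_size)

    def learn_batch(self, batch):
        """Add one time step of data: a list of (text, label) pairs."""
        self.window.append(list(batch))

    def predict(self, text):
        """Return the label with the most word votes in the window, or None."""
        # Rebuild word/label co-occurrence counts from the current window only.
        counts = defaultdict(Counter)
        for batch in self.window:
            for doc, label in batch:
                for word in doc.lower().split():
                    counts[word][label] += 1
        # Each word of the query votes for the labels it co-occurred with.
        votes = Counter()
        for word in text.lower().split():
            votes.update(counts[word])
        return votes.most_common(1)[0][0] if votes else None


clf = TimeWindowClassifier(window_size=2)
clf.learn_batch([("goal scored in the match", "#sports")])
clf.learn_batch([("new phone released", "#tech")])
before_drift = clf.predict("goal")   # "#sports" is still inside the window

clf.learn_batch([("goal of the new startup", "#business")])
after_drift = clf.predict("goal")    # the "#sports" batch has been forgotten
```

In this toy run the word "goal" is first associated with `#sports` and, once the oldest batch leaves the window, with `#business`. This is the kind of change in concept occurrence that a window-based model is meant to follow.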
Abstract. Given the widespread use of social networks, research efforts to retrieve information from social network communications using tagging have increased. In particular, in the Twitter social network, hashtags are widely used to define a shared context for events or topics. While this is a common practice, the hashtags freely introduced by users easily become biased. In this paper, we propose to deal with this bias by defining semantic meta-hashtags, obtained by clustering similar messages, to improve classification. First, we use the user-defined hashtags as the Twitter message class labels. Then, we apply the meta-hashtag approach to boost the performance of the message classification. The meta-hashtag approach is tested on a Twitter-based dataset constructed by requesting public tweets from the Twitter API. The experimental results, obtained by comparing a baseline model based on user-defined hashtags with the clustered meta-hashtag approach, show that the overall classification is improved. It is concluded that incorporating semantics in the meta-hashtag model can have an impact on different applications, e.g. recommendation systems, event detection or crowdsourcing.
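The meta-hashtag idea can be illustrated with a small sketch. The abstract does not specify the clustering algorithm or the similarity measure, so everything below is an assumption: each hashtag is given a bag-of-words profile built from its messages, profiles are compared with cosine similarity, and a greedy pass merges a hashtag into the first earlier hashtag whose profile is similar enough. The function names and the `threshold` value are hypothetical.

```python
import math
from collections import Counter


def hashtag_profiles(tweets):
    """Aggregate a bag-of-words profile per hashtag from (text, hashtag) pairs."""
    profiles = {}
    for text, tag in tweets:
        profiles.setdefault(tag, Counter()).update(text.lower().split())
    return profiles


def cosine(a, b):
    """Cosine similarity between two word-count profiles."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def meta_hashtags(tweets, threshold=0.5):
    """Map every hashtag to a meta-hashtag (assumed greedy clustering)."""
    profiles = hashtag_profiles(tweets)
    representatives = []  # hashtags that currently name a cluster
    mapping = {}
    for tag, profile in profiles.items():
        for rep in representatives:
            if cosine(profile, profiles[rep]) >= threshold:
                mapping[tag] = rep  # merge into an existing cluster
                break
        else:
            representatives.append(tag)  # start a new cluster
            mapping[tag] = tag
    return mapping


tweets = [
    ("great goal in the match", "#football"),
    ("what a goal in the match", "#soccer"),
    ("new phone camera review", "#tech"),
]
mapping = meta_hashtags(tweets)
```

Here `#soccer` is folded into `#football` because their messages share most words, while `#tech` stays on its own. A classifier would then be trained on the mapped labels instead of the raw user-defined hashtags.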