Community detection is a fundamental task in social network analysis. In this paper, we first develop an endorsement-filtered user connectivity network by utilizing Heider's structural balance theory and certain Twitter triad patterns. Next, we develop three Nonnegative Matrix Factorization frameworks to investigate the contributions of different types of user connectivity and content information to community detection. We show that user content and endorsement-filtered connectivity information are complementary to each other in clustering politically motivated users into pure political communities. Among all content categories, word usage is the strongest indicator of users' political orientation. Incorporating a user-word matrix and a word-similarity regularizer provides the missing link in connectivity-only methods, which suffer from detecting an artificially large number of clusters in Twitter networks.
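As a rough illustration of the factorization idea in the abstract above, the following sketch runs plain Nonnegative Matrix Factorization with the standard multiplicative update rules on a toy user-word count matrix. It is not the paper's framework: the connectivity terms and the word-similarity regularizer are left out, and the matrix `x`, the rank, and all function names are hypothetical choices made for this example.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frob_error(x, w, h):
    """Squared Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar))

def nmf(x, rank, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative X (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random initialization (assumption of this sketch).
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[hv * nv / (dv + eps) for hv, nv, dv in zip(hr, nr, dr)]
             for hr, nr, dr in zip(h, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[wv * nv / (dv + eps) for wv, nv, dv in zip(wr, nr, dr)]
             for wr, nr, dr in zip(w, num, den)]
    return w, h

# Hypothetical toy data: rows are users, columns are word counts.
x = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]
w, h = nmf(x, rank=2)

# Assign each user to the latent factor with the largest weight in W.
labels = [max(range(len(row)), key=row.__getitem__) for row in w]
print(labels, round(frob_error(x, w, h), 3))
```

Reading cluster labels off the largest entry in each row of `W` is one common convention. A regularized variant would add penalty terms to the objective and change the update rules accordingly.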
In recent years, using cell phone log data to model human mobility patterns has become an active research area. This is a challenging data mining problem due to the huge size and the non-uniformity of the log data, which introduces several granularity levels for the specification of temporal and spatial dimensions. This paper focuses on predicting the location of the next activity of mobile phone users. There are several versions of this problem. In this work, we concentrate on the following three: predicting the location and the time of the next user activity, predicting the location of the user's next activity when the user's location changes, and predicting both the location and the time of the activity when the user's location changes. We have developed sequential pattern mining based techniques for these three problems and validated them with real data obtained from one of the largest mobile phone operators in Turkey. Our results are very encouraging, since we were able to obtain quite high accuracy with small prediction sets.
How is popularity gained online? Is being successful strictly related to rapidly becoming viral on an online platform, or is it possible to acquire popularity in a steady and disciplined fashion? What other temporal characteristics can unveil the popularity of online content? To answer these questions, we leverage a multi-faceted temporal analysis of the evolution of popular online content. Here, we present dipm-SC: a multi-dimensional shape-based time-series clustering algorithm with a heuristic to find the optimal number of clusters. First, we validate the accuracy of our algorithm on synthetic datasets generated from benchmark time series models. Second, we show that dipm-SC can uncover meaningful clusters of popularity behaviors in a real-world Twitter dataset. By clustering the multi-dimensional time series of content popularity coupled with other domain-specific dimensions, we uncover two main patterns of popularity: bursty and steady temporal behaviors. Moreover, we find that the way popularity is gained over time has no significant impact on the final cumulative popularity.
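The abstract above does not spell out how dipm-SC compares series, so the following is only a generic illustration of what "shape-based" comparison usually means: each series is z-normalized before distances are computed, so two series with the same shape but different scale count as identical. The function names and the toy cumulative-popularity series are assumptions of this sketch, and only a single dimension is handled.

```python
from math import dist
from statistics import mean, pstdev

def znorm(series):
    """Rescale a series to zero mean and unit variance."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        # A constant series has no shape; map it to all zeros.
        return [0.0] * len(series)
    return [(v - mu) / sigma for v in series]

def shape_distance(a, b):
    """Euclidean distance between the z-normalized versions of a and b."""
    return dist(znorm(a), znorm(b))

# Hypothetical cumulative popularity curves over six time steps.
steady = [1, 2, 3, 4, 5, 6]
steady_big = [10, 20, 30, 40, 50, 60]   # same shape, ten times the scale
bursty = [0, 0, 1, 40, 58, 60]          # most growth in a short window

print(shape_distance(steady, steady_big))
print(shape_distance(steady, bursty))
```

With this distance, the two steady curves are treated as (numerically) the same shape despite very different final totals, while the bursty curve sits farther away. Any clustering algorithm that accepts a custom distance could then group series by shape.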
Predicting the location of people from their mobile phone logs is becoming an attractive research area. This problem is very challenging for two main reasons: the log data is very large, and there is a variety of granularity levels for specifying both location and time. Especially at a low granularity level, it becomes much more complicated to define common user behaviour patterns. In this work, rather than determining the next location of a person, we focus on predicting the location of a person when it changes. We employ a two-phase method, which first clusters the data to obtain a higher granularity level and then extracts frequent sequential patterns corresponding to location changes. We have validated our results with real data obtained from one of the largest mobile phone operators in Turkey. Our results are very encouraging: we obtained very high accuracy in predicting the change of location of mobile phone users.
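The two-phase idea described above can be sketched roughly as follows. This is a minimal illustration, not the authors' method: a hand-written `cell_to_region` mapping stands in for the output of the clustering phase, only length-2 patterns are mined, and all names, traces, and the support threshold are hypothetical.

```python
from collections import Counter, defaultdict

def coarsen(trace, cell_to_region):
    """Phase 1 stand-in: map fine-grained cell IDs to coarser region labels."""
    return [cell_to_region[cell] for cell in trace]

def location_changes(regions):
    """Collapse consecutive repeats so only location changes remain."""
    changes = []
    for region in regions:
        if not changes or changes[-1] != region:
            changes.append(region)
    return changes

def mine_transitions(sequences, min_support=2):
    """Phase 2: count patterns A -> B and keep those seen at least min_support times."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    rules = defaultdict(Counter)
    for (src, dst), n in counts.items():
        if n >= min_support:
            rules[src][dst] = n
    return rules

def predict_next(rules, current, k=1):
    """Return up to k most frequent destinations after leaving `current`."""
    if current not in rules:
        return []
    return [dst for dst, _ in rules[current].most_common(k)]

# Hypothetical mapping from cell towers to regions (stand-in for clustering).
cell_to_region = {"c1": "home", "c2": "home", "c3": "work", "c4": "work", "c5": "mall"}

# Hypothetical per-user traces of cell IDs, ordered by time.
traces = [
    ["c1", "c2", "c3", "c4", "c5"],
    ["c2", "c1", "c4", "c3", "c1"],
    ["c1", "c3", "c5", "c2"],
]

sequences = [location_changes(coarsen(t, cell_to_region)) for t in traces]
rules = mine_transitions(sequences, min_support=2)

print(predict_next(rules, "home"))  # ['work']
print(predict_next(rules, "work"))  # ['mall']
print(predict_next(rules, "mall"))  # [] (no pattern meets the support threshold)
```

Raising `k` returns a larger prediction set, which trades precision for a higher chance that the true next location is included.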
Web 2.0 helps to expand the range and depth of conversation on many issues and facilitates the formation of online communities. Online communities draw various individuals together based on their common opinions on a core set of issues. Most existing community detection methods merely focus on discovering communities without providing any insight regarding the collective opinions of community members and the motives behind the formation of communities. Several efforts have been made to tackle this problem by presenting a set of keywords as a community profile. However, they neglect the positions of community members towards these keywords, which play an important role in understanding communities in the highly polarized atmosphere of social media. To this end, we present a sentiment-driven community profiling and detection framework which aims to provide community profiles presenting the positive and negative collective opinions of community members separately. Our framework first extracts key expressions in users' messages as representatives of issues and then identifies users' positive/negative attitudes towards these key expressions. Next, it uncovers a low-dimensional latent space in order to cluster users according to their opinions and social interactions (i.e., retweets). We demonstrate the effectiveness of our framework through quantitative and qualitative evaluations.