Activity-based Twitter sampling for content-based and user-centric prediction models

Aghababaei, Somayyeh; Makrehchi, Masoud

doi:10.1186/s13673-016-0084-z

Cited by 8 publications

(4 citation statements)

References 28 publications


“…Tweepy is a Python library for accessing the standard real-time streaming Twitter API, which allows to freely retrieve tweets that match a given query. If the query is too broad that it includes over 1% of the total number of tweets posted at that time worldwide, the query's response is sampled (Aghababaei and Makrehchi, 2017; Morstatter et al., 2014). The way in which Twitter samples the data is unpublished.…”

Section: Data Collection (mentioning)

confidence: 99%

Regional Differences in Information Privacy Concerns After the Facebook-Cambridge Analytica Data Scandal

González-Pizarro, Figueroa, López, et al. 2022

Comput Supported Coop Work


While there is increasing global attention to data privacy, most of their current theoretical understanding is based on research conducted in a few countries. Prior work argues that people's cultural backgrounds might shape their privacy concerns; thus, we could expect people from different world regions to conceptualize them in diverse ways. We collected and analyzed a large-scale dataset of tweets about the #CambridgeAnalytica scandal in Spanish and English to start exploring this hypothesis. We employed word embeddings and qualitative analysis to identify which information privacy concerns are present and characterize language and regional differences in emphasis on these concerns. Our results suggest that related concepts, such as regulations, can be added to current information privacy frameworks. We also observe a greater emphasis on data collection in English than in Spanish. Additionally, data from North America exhibits a narrower focus on awareness compared to other regions under study. Our results call for more diverse sources of data and nuanced analysis of data privacy concerns around the globe.
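The Data Collection statement quoted above says that a streaming query matching more than 1% of all tweets gets a sampled response. The arithmetic behind that threshold can be sketched as follows. This is a minimal illustration, not Twitter's actual mechanism, which the statement notes is unpublished: the function name `expected_coverage`, the `cap` parameter, and the assumption that delivery is capped at exactly 1% of total volume are all hypothetical.

```python
def expected_coverage(matching_tweets: int, total_tweets: int, cap: float = 0.01) -> float:
    """Estimate the fraction of matching tweets a streaming query would deliver.

    Assumption (not confirmed by the source): delivery is capped at
    `cap` * `total_tweets`, so any query matching more than that is
    downsampled to the cap. How the sample is chosen is not modeled.
    """
    if matching_tweets < 0 or total_tweets < 0:
        raise ValueError("tweet counts must be non-negative")
    if matching_tweets == 0:
        # Nothing matches, so nothing can be dropped.
        return 1.0
    limit = cap * total_tweets  # most tweets the stream would deliver
    return min(1.0, limit / matching_tweets)


if __name__ == "__main__":
    # A narrow query (0.5% of traffic) stays under the cap.
    print(expected_coverage(50, 10_000))
    # A broad query (4% of traffic) exceeds the cap and is sampled.
    print(expected_coverage(400, 10_000))
```

Under these assumptions, a query matching 4% of all traffic would see roughly a quarter of its matching tweets, which is why broad queries need a sampling-aware collection design.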


“…Feature-based approaches [2, 3, 11–15] make the connection between the prediction and various types of hand-crafted features that are extracted from the information cascade, including the structural features of the social network, content features, temporal features, and user features. To predict the popularity of news articles in Yahoo News, Arapakis et al. [16] used 10 different features that they extracted from the content of the news articles as well as external sources.…”

Section: Cascade Prediction (mentioning)

confidence: 99%

“…Having obtained the attended whole structure embedding ṡ and temporal embedding ḣ, we can feed these two embeddings into the inter-gate mechanism to effectively combine these two factors. The proposed inter-gate mechanism can capture the different…”

Section: Inter-gate Mechanism (mentioning)

confidence: 99%

Information cascades prediction with attention neural network

Liu, Bao, Zhang, et al. 2020

Hum. Cent. Comput. Inf. Sci.


Online social networks are very popular among people, and they are changing the way people communicate, work, and play, mostly for the better. One of the things that fascinates us most about social network sites is the resharing mechanism that has the potential to spread information to millions of users in a matter of few hours or days. For instance, a user can share the content (e.g., videos on YouTube, tweets on Twitter, and photos on Flickr) with her set of friends, who subsequently can potentially reshare the content, resulting in the development of a cascade of resharing. Such information cascades play a significant role in almost every social network phenomenon, which include, but are not limited to, the diffusion of innovation, persuasion campaigns, and spreading rumors. Information cascade prediction is to infer some key properties of information cascades, such as their sizes and shapes, which indicate the extent to which the information can reach in the social network. This prediction task can be valuable, and it can be applied in an array of areas, such as content recommender systems and monitoring the consensus opinion. However,


“…First, collecting data on all possible Twitter accounts (or any other social media accounts) poses prohibitive storage and bandwidth costs, even absent limitations imposed by the service provider; thus, selection of accounts (including the special case in which a census of a focal account set is attempted) is inevitable. Depending on the method employed (see Aghababaei and Makrehchi 2017), and the manner in which it is implemented, all accounts of interest may not be identified at the same time. This may lead to missingness during the initial observation period.…”

Section: Introduction (mentioning)

confidence: 99%

Practical Methods for Imputing Follower Count Dynamics

Gibson, Sutton, Vos, et al. 2020

Sociological Methods & Research


Microblogging sites have become important data sources for studying network dynamics and information transmission. Both areas of study, however, require accurate counts of indegree, or follower counts; unfortunately, collection of complete time series on follower counts can be limited by application programming interface constraints, system failures, or temporal constraints. In addition, there is almost always a time difference between the point at which follower counts are queried and the time a user posts a tweet. Here, we consider the use of three classes of simple, easily implemented methods for follower imputation: polynomial functions, splines, and generalized linear models. We evaluate the performance of each method via a case study of accounts from 236 health organizations during the 2014 Ebola outbreak. For accurate interpolation and extrapolation, we find that negative binomial regression, modeled separately for each account, using time as an interval variable, accurately recovers missing values while retaining narrow prediction intervals.
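The abstract above notes that follower counts are usually queried at a different time than a tweet is posted, and lists splines among the simple imputation methods considered. A minimal sketch of the simplest spline, piecewise-linear interpolation between the two nearest observations, could look like this. It is not the paper's recommended model (per-account negative binomial regression); the function name `impute_followers` and the choice to clamp to the nearest observation outside the observed range are assumptions made for illustration.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def impute_followers(observations: Sequence[Tuple[float, float]], t: float) -> float:
    """Estimate a follower count at time `t` by linear interpolation.

    `observations` is a sequence of (timestamp, follower_count) pairs
    sorted by timestamp. Outside the observed range the nearest
    observation is returned (an assumption; no extrapolation model).
    """
    if not observations:
        raise ValueError("need at least one observation")
    times = [ts for ts, _ in observations]
    if t <= times[0]:
        return float(observations[0][1])
    if t >= times[-1]:
        return float(observations[-1][1])
    # Index of the first observation at or after t; i >= 1 here.
    i = bisect_left(times, t)
    t0, y0 = observations[i - 1]
    t1, y1 = observations[i]
    # Straight line between the bracketing observations.
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


if __name__ == "__main__":
    # Hypothetical follower counts queried at hours 0, 10, and 20.
    queried = [(0, 100), (10, 200), (20, 260)]
    # A tweet posted at hour 5 falls between the first two queries.
    print(impute_followers(queried, 5))
```

Linear interpolation can return fractional counts and ignores the count nature of the data, which is one reason a count model such as the negative binomial regression named in the abstract may be preferable.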
