A Biased Review of Biases in Twitter Studies on Political Collective Action

Cihon, Peter; Yasseri, Taha

doi:10.3389/fphy.2016.00034

Cited by 35 publications

(30 citation statements)

References 36 publications

“…Our research has followed a theory‐driven approach, aiming at aspects of Twitter that are relevant for society as a whole and that have been inferred from both qualitative and computational models. This allowed us take a step further from descriptive and data‐driven analyses without theoretical context, framing our findings in a wider scientific perspective beyond the computational sciences (Cihon & Yasseri, ). We learned that popularity and reputation are not always motivating, that popularity is not as heterogeneous as was thought to be, and that popularity and reputation are both relevant when studying social influence.…”

Section: Discussion (mentioning)

confidence: 99%

Understanding Popularity, Reputation, and Social Influence in the Twitter Society

García, Mavrodiev, Casati et al. 2017

Policy & Internet

The pervasive presence of online media in our society has transferred a significant part of political deliberation to online forums and social networking sites. This article examines popularity, reputation, and social influence on Twitter using large-scale digital traces from 2009 and 2016. We process network information on more than 40 million users, calculating new global measures of reputation that build on the D-core decomposition and the bow-tie structure of the Twitter follower network. We integrate our measurements of popularity, reputation, and social influence to evaluate what keeps users active, what makes them more popular, and what determines their influence. We find that there is a range of values in which the risk of a user becoming inactive grows with popularity and reputation. Popularity in Twitter resembles a proportional growth process that is faster in its strongly connected component, and that can be accelerated by reputation when users are already popular. We find that social influence on Twitter is mainly related to popularity rather than reputation, but that this growth of influence with popularity is sublinear. The explanatory and predictive power of our method shows that global network metrics are better predictors of inactivity and social influence, calling for analyses that go beyond local metrics like the number of followers.

“…We collected Tweets with the hashtag #Charlottesville and the follower lists for 13 media organizations using Twitter's API and the Python package tweepy. Public data accessibility through Twitter's API has greatly facilitated research studies on Twitter data, but such data have important limitations [5,13], including potential biases due to Twitter's proprietary API sampling scheme [13]. For example, Morstatter et al [31] illustrated that the API can produce artifacts in topical tweet volume, potentially resulting in misleading changes in the number of tweets on a given topic over time.…”

Section: Data Collection (mentioning)

confidence: 99%

“…It is common to analyze them individually as retweet (e.g., see [8,9]), follower (e.g., see [10]), mention (e.g., see [8]) networks, and others. An extensive literature is concerned with Twitter network data, and the myriad topics that have been studied using them include political protest and social movements [11][12][13][14][15][16][17], epidemiological surveillance and monitoring of health behaviors [18][19][20][21][22][23][24], contagion and online content propagation [25,26], identification of extremist groups [27], ideological polarization [8,28,29], and much more. Indeed, the combination of significance for public discourse, data accessibility, and amenability to network analysis is appealing.…”

Section: Introduction (mentioning)

confidence: 99%

Online reactions to the 2017 ‘Unite the right’ rally in Charlottesville: measuring polarization in Twitter networks using media followership

et al. 2020

“…Several studies discuss working, compositions and possible biases of data [47,48] and a "reverse-engineered" model has been developed for the Sample API, which indicates that the sampling is based on a millisecond time window and that the timestamp at which the Tweet arrived at Twitter's servers is coded into the Tweet's ID [42,43]. Although it has been shown that Twitter creates nonrepresentative samples with non-transparent and highly fluctuating sample rates of the overall Twitter activity [49], this has had no effect on its popularity amongst researchers [50]. It was suggested in the past that Sample API data can be used to estimate the quality of Streaming API data [51].…”

Section: Related Work (mentioning)

confidence: 99%

Tampering with Twitter’s Sample API

Mayer

2018

EPJ Data Sci.

Social media data is widely analyzed in computational social science. Twitter, one of the largest social media platforms, is used for research, journalism, business, and government to analyze human behavior at scale. Twitter offers data via three different Application Programming Interfaces (APIs). One of which, Twitter's Sample API, provides a freely available 1% and a costly 10% sample of all Tweets. These data are supposedly random samples of all platform activity. However, we demonstrate that, due to the nature of Twitter's sampling mechanism, it is possible to deliberately influence these samples, the extent and content of any topic, and consequently to manipulate the analyses of researchers, journalists, as well as market and political analysts trusting these data sources. Our analysis also reveals that technical artifacts can accidentally skew Twitter's samples. Samples should therefore not be regarded as random. Our findings illustrate the critical limitations and general issues of big data sampling, especially in the context of proprietary data and undisclosed details about data handling.
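
The citation statement above notes that a Tweet's arrival timestamp is coded into its ID and that Sample API selection depends on a millisecond time window. A minimal sketch of how such a timestamp could be recovered is given here, assuming the publicly documented Snowflake ID layout (a millisecond timestamp stored above the low 22 bits, offset from a custom epoch). The function names and the window bounds passed to `in_window` are hypothetical illustrations; the actual window used by the Sample API is not stated in the text above.

```python
from datetime import datetime, timezone

# Assumption: Snowflake ID layout, in use for Tweet IDs since late 2010.
TWITTER_EPOCH_MS = 1288834974657  # custom epoch, in milliseconds since 1970
TIMESTAMP_SHIFT = 22              # low 22 bits hold machine and sequence numbers


def tweet_id_to_ms(tweet_id: int) -> int:
    """Return the Unix timestamp in milliseconds encoded in a Snowflake ID."""
    return (tweet_id >> TIMESTAMP_SHIFT) + TWITTER_EPOCH_MS


def tweet_id_to_datetime(tweet_id: int) -> datetime:
    """Return the encoded timestamp as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(tweet_id_to_ms(tweet_id) / 1000, tz=timezone.utc)


def ms_offset(tweet_id: int) -> int:
    """Return the millisecond-within-second part (0..999) of the timestamp."""
    return tweet_id_to_ms(tweet_id) % 1000


def in_window(tweet_id: int, start_ms: int, end_ms: int) -> bool:
    """Check whether the millisecond offset falls in [start_ms, end_ms].

    The bounds are caller-supplied; they are illustrative, not taken
    from the source text.
    """
    return start_ms <= ms_offset(tweet_id) <= end_ms


# Synthetic ID built for illustration only (not a real Tweet).
synthetic_ms = 1500000000123
synthetic_id = ((synthetic_ms - TWITTER_EPOCH_MS) << TIMESTAMP_SHIFT) | 5
print(tweet_id_to_datetime(synthetic_id).isoformat())
print(ms_offset(synthetic_id), in_window(synthetic_id, 100, 200))
```

Grouping collected Tweets by `ms_offset` is one simple way to inspect whether a sample clusters in a narrow millisecond range, which is the kind of regularity the sampling concern refers to. IDs issued before the Snowflake scheme was introduced do not carry a timestamp and would need to be excluded.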
