A Google Trends spatial clustering approach for a worldwide Twitter user geolocation

Zola, Paola; Ragno, Costantino; Cortez, Paulo

doi:10.1016/j.ipm.2020.102312

Cited by 27 publications

(11 citation statements)

References 32 publications

(56 reference statements)

“…As for practical implications, we recommend performing the spatial characterization as per city basis, this is mostly due to how the density based clusters vary according to the observed region and the computational power available. Moreover, in similar cases to Twitter, where none or only a small portion of messages carry the location context, incorporating other solutions to predict location context from messages lacking this information will significantly increase the observation surface [18]. Furthermore, practitioners might want to tune or extend upon our classification results, particularly for the classes with lower F1-scores.…”

Section: A Research Discussion (mentioning)

confidence: 99%

“…We considered this to be of interest because marketing campaigns can be costly and may suffer from low user response compared to the original investment. Moreover, there might be other reasons to estimate the optimal location for a particular activity [17], such as opening a new store [6], [7], or estimating the interest for something in particular locations to validate assumptions [18]. Our work focuses on how to characterize geographic areas in relation to a selected set of product categories.…”

Section: Research Objective and Contribution Overview (mentioning)

confidence: 99%

“…In particular, the lack of dynamic and geo tagged opinion data. However, in the case of Twitter, other researches have proposed approaches to infer the locations of messages when lacking this information [18], [53]. Furthermore, to the best of our knowledge, there were few public geo tagged free text datasets that we could exploit.…”

Section: B Limitations (mentioning)

confidence: 99%

Geo-Spatial Market Segmentation & Characterization Exploiting User Generated Text Through Transformers & Density-Based Clustering

et al. 2021

“…The most used algorithms are distance-based solutions such as K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), or Ordering Points To Identify the Clustering Structure (OPTICS), and probabilistic ones like Probabilistic Latent Semantic Indexing (PLSI) [20]. The work of Zola et al [40] is a good example of how some of these clustering techniques are currently used in the spatial data context to identify patterns in text collections. They estimate Twitter user location based on their tweets using Google Trends frequencies of tweet nouns and clustering to identify the most probable location.…”

Section: Related Work (mentioning)

confidence: 99%

Approaches for the Clustering of Geographic Metadata and the Automatic Detection of Quasi-Spatial Dataset Series

Lacasta, López-Pellicer, Zarazaga‐Soria, et al. 2022. IJGI

The discrete representation of resources in geospatial catalogues affects their information retrieval performance. The performance could be improved by using automatically generated clusters of related resources, which we name quasi-spatial dataset series. This work evaluates whether a clustering process can create quasi-spatial dataset series using only textual information from metadata elements. We assess the combination of different kinds of text cleaning approaches, word and sentence-embeddings representations (Word2Vec, GloVe, FastText, ELMo, Sentence BERT, and Universal Sentence Encoder), and clustering techniques (K-Means, DBSCAN, OPTICS, and agglomerative clustering) for the task. The results demonstrate that combining word-embeddings representations with an agglomerative-based clustering creates better quasi-spatial dataset series than the other approaches. In addition, we have found that the ELMo representation with agglomerative clustering produces good results without any preprocessing step for text cleaning.

“…Several studies estimate tweet-location reliability above 90% in the United Kingdom or the United States, 85.8% in Spain or 83.95% in the Philippines, with a worldwide average of 77.84% and 88.15% in Europe (van der Veen et al, 2015), or a mean location error of 256 kilometers (Holbrook et al, 2016). These percentages can be improved by using various complementary techniques, on which there is a varied literature and which were not applied in this work (Zola, Ragno and Cortez, 2020).…”

Section: Methodology (unclassified)

Polarización en Twitter durante la crisis de la COVID-19: Caso Aislado y Periodista Digital

García, Márquez, Gascón. 2021. RCom

The declaration of the State of Alarm in Spain in March 2020 brought with it a period of great news intensity in traditional and digital media. The extraordinary nature of the measure, which granted exceptional powers to the Executive to confront the Covid-19 pandemic, gave rise to a tremendously polarized scenario. In this context, several websites known for spreading disinformation campaigns, and even for promoting ideas sympathetic to the far right, were especially active on social networks, promoting the spread of ideological content in order to attract traffic for later monetization through advertising. This work tracks the activity around two such websites on Twitter, Caso Aislado and Periodista Digital, with the aim of shedding light on their role in the climate of political polarization. Over more than two months, more than 100,000 tweets were captured, stored, and studied using the R software and various algorithms to elucidate the social activity, the possible existence of bots or automated profiles, the nature of the content posted, and the emotional charge associated with it. The study finds intense organized activity around both websites, driven by a high percentage of apparently automated accounts and the support of influencer profiles acting as high-powered rebroadcasters. Although there are differences specific to each outlet, it is possible to discern intentional coordination through campaigns that combine content, the use of support accounts, and automation.
