2017
DOI: 10.3390/e19120686

Do We Really Need to Catch Them All? A New User-Guided Social Media Crawling Method

Abstract: With the growing use of popular social media services like Facebook and Twitter, it is challenging to collect all content from the networks without access to the core infrastructure or paying for it. Thus, if all content cannot be collected, one must consider which data are of most importance. In this work we present a novel User-guided Social Media Crawling method (USMC) that is able to collect data from social media, utilizing the wisdom of the crowd to decide the order in which user generated content…

Cited by 6 publications (2 citation statements)
References 27 publications
“…In this way, the crawling process could crawl most of the newly produced content with limited resources (and taking into account the access restrictions of the SM source). In [25], a user-guided Social Media crawling method was proposed. The goal was not to crawl the entire SM platform (or extract the full set of users) but instead to obtain a sample of posts or submissions that are statistically representative of the entire dataset.…”
Section: A Real-time Crawling Of Social Media (mentioning)
confidence: 99%
“…al. [6] and is publicly available at Harvard Dataverse [5]. The data from these pages were parsed and for each post the corresponding likes and comments were extracted.…”
Section: Dataset and Network Model (mentioning)
confidence: 99%