2021
DOI: 10.1371/journal.pone.0253461

Upscaling human activity data: A statistical ecology approach

Abstract: Big data require new techniques to handle the information they come with. Here we consider four datasets (email communication, Twitter posts, Wikipedia articles and Gutenberg books) and propose a novel statistical framework to predict global statistics from random samples. More precisely, we infer the number of senders, hashtags and words of the whole dataset and how their abundances (i.e. the popularity of a hashtag) change through scales from a small sample of sent emails per sender, posts per hashtag and wo…

citations
Cited by 1 publication
(1 citation statement)
references
References 47 publications
“…In community ecology, species abundance distribution (SAD), the distribution of individuals within a species in a given community, has been a cornerstone of research (11,14), which can potentially shed light on the study of plankton microbial communities and their structure. Studying SADs not only allows for a characterization of ecological communities but also provides critical insights that enable species number estimations at larger scales beyond direct measurement, accomplished by inferring SAD distribution forms (15-19). Notably, the functional form of SADs has shown consistency across diverse ecosystems.…”
Section: Introduction
Citation type: mentioning
confidence: 99%