Researchers use community-detection algorithms to reveal large-scale organization in biological and social networks, but community detection is useful only if the communities are significant and not a result of noisy data. To assess the statistical significance of the network communities, or the robustness of the detected structure, one approach is to perturb the network structure by removing links and measure how much the communities change. However, perturbing sparse networks is challenging because they are inherently sensitive; they shatter easily if links are removed. Here we propose a simple method to perturb sparse networks and assess the significance of their communities. We generate resampled networks by adding extra links based on local information, then we aggregate the information from multiple resampled networks to find a coarse-grained description of significant clusters. In addition to testing our method on benchmark networks, we use our method on the sparse network of the European Court of Justice (ECJ) case law, to detect significant and insignificant areas of law. We use our significance analysis to draw a map of the ECJ case law network that reveals the relations between the areas of law.
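The resampling step described above, adding extra links based on local information, can be sketched as follows. The abstract does not state the exact rule, so this minimal sketch assumes triangle closure: a new link is placed between two neighbors of a randomly chosen node. The function name `resample_by_adding_links` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def resample_by_adding_links(edges, n_extra, seed=None):
    """Return the edge set plus up to n_extra triangle-closing links.

    Assumption: "local information" is modelled as triangle closure,
    which is an illustrative choice, not the paper's stated rule.
    """
    rng = random.Random(seed)

    # Build an undirected adjacency map from the original edges.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    resampled = {frozenset(e) for e in edges}

    # Only nodes with at least two neighbors can close a triangle.
    centres = sorted(n for n, nb in neighbors.items() if len(nb) >= 2)

    added = 0
    attempts = 0
    # Cap the attempts so the loop ends even when few new links are possible.
    while centres and added < n_extra and attempts < 100 * n_extra:
        attempts += 1
        centre = rng.choice(centres)
        a, b = rng.sample(sorted(neighbors[centre]), 2)
        link = frozenset((a, b))
        if link not in resampled:
            resampled.add(link)
            added += 1
    return resampled


# Usage: generate several resampled versions of a small path network.
path = [(0, 1), (1, 2), (2, 3)]
samples = [resample_by_adding_links(path, 1, seed=s) for s in range(5)]
```

Because links are only added, never removed, every resampled network keeps the original links, so a sparse network cannot shatter under this kind of perturbation. Communities would then be detected in each resampled network and compared across samples.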
Community detection helps us simplify the complex configuration of networks, but communities are reliable only if they are statistically significant. To detect statistically significant communities, a common approach is to resample the original network and analyze the communities. But resampling assumes independence between samples, while the components of a network are inherently dependent. Therefore, we must understand how breaking dependencies between resampled components affects the results of the significance analysis. Here we use scientific communication as a model system to analyze this effect. Our dataset includes citations among articles published in journals in the years 1984–2010. We compare parametric resampling of citations with non-parametric article resampling. While citation resampling breaks link dependencies, article resampling maintains such dependencies. We find that citation resampling underestimates the variance of link weights. Moreover, this underestimation explains most of the differences in the significance analysis of ranking and clustering. Therefore, when only link weights are available and article resampling is not an option, we suggest a simple parametric resampling scheme that generates link-weight variances close to the link-weight variances of article resampling. Nevertheless, when we highlight and summarize important structural changes in science, the more dependencies we can maintain in the resampling scheme, the earlier we can predict structural change.
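The contrast between the two resampling schemes can be illustrated with a small sketch. It assumes that each article is represented as a list of (citing journal, cited journal) pairs, and it uses multinomial draws as one possible parametric scheme for citations. Both the data layout and the multinomial choice are assumptions made for illustration; the abstract does not specify them.

```python
import random
from collections import Counter


def article_resample(articles, rng):
    """Non-parametric bootstrap: draw whole articles with replacement.

    All citations of a drawn article stay together, so dependencies
    between the links of one article are maintained.
    """
    weights = Counter()
    for article in rng.choices(articles, k=len(articles)):
        weights.update(article)
    return weights


def citation_resample(link_weights, rng):
    """Parametric (here: multinomial) resampling of individual citations.

    Each citation is drawn independently in proportion to the observed
    link weights, which breaks dependencies between links.
    """
    links = list(link_weights)
    total = sum(link_weights.values())
    probs = [link_weights[link] for link in links]
    return Counter(rng.choices(links, weights=probs, k=total))


# Toy data (hypothetical): each article cites one journal pair three times.
articles = [[("A", "B")] * 3, [("A", "C")] * 3, [("B", "C")] * 3, [("A", "B")] * 3]
observed = Counter(c for article in articles for c in article)

rng = random.Random(0)
by_article = article_resample(articles, rng)
by_citation = citation_resample(observed, rng)
```

In the article scheme, link weights can only change in steps of a whole article's citations, while in the citation scheme each citation moves on its own. Repeating both schemes many times and comparing the per-link variance is one way to examine the underestimation discussed above.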
To better understand the inner workings of information spreading, network researchers often use simple models to capture the spreading dynamics. But most models only highlight the effect of local interactions on the global spreading of a single information wave, and ignore the effects of interactions between multiple waves. Here we take into account the effect of multiple interacting waves by using an agent-based model in which the interaction between information waves is based on their novelty. We analyzed the global effects of such interactions and found that information that actually reaches nodes reaches them faster. This effect is caused by selection between information waves: slow waves die out and only fast waves survive. As a result, and in contrast to models with non-interacting information dynamics, the access to information decays with the distance from the source. Moreover, when we analyzed the model on various synthetic and real spatial road networks, we found that the decay rate also depends on the path redundancy and the effective dimension of the system. In general, the decay of the information wave frequency as a function of distance from the source follows a power law distribution with an exponent between -0.2 for a two-dimensional system with high path redundancy and -0.5 for a tree-like system with no path redundancy. We found that the real spatial networks provide an infrastructure for information spreading that lies in between these two extremes. Finally, to better understand the mechanics behind the scaling results, we provide analytical calculations of the scaling for a one-dimensional system.
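As a rough illustration of how a decay exponent like those quoted above might be estimated from simulation output, the sketch below fits a straight line to log-frequency against log-distance. Ordinary least squares on log-log data is a simple stand-in and is not claimed to be the estimation procedure used in the study; the function name and the input values are hypothetical.

```python
import math


def fit_power_law_exponent(distances, frequencies):
    """Estimate alpha in frequency ~ distance**alpha by log-log least squares."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the regression line is covariance(x, y) / variance(x).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic check: data generated with exponent -0.5 (the tree-like extreme).
distances = [1, 2, 4, 8, 16]
frequencies = [d ** -0.5 for d in distances]
alpha = fit_power_law_exponent(distances, frequencies)
```

On noise-free synthetic data the fitted slope recovers the generating exponent up to floating-point error. On real or simulated data, the fit would also depend on the range of distances included.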
In social systems, people communicate with each other and form groups based on their interests. The pattern of interactions, the network, and the ideas that flow on the network naturally evolve together. Researchers use simple models to capture the feedback between changing network patterns and ideas on the network, but little is understood about the role of past events in the feedback process. Here we introduce a simple agent-based model to study the coupling between people's ideas and social networks, and to better understand the role of history in dynamic social networks. We measure how well information about ideas can be recovered from information about network structure and, conversely, how well information about network structure can be recovered from information about ideas. We find that it is in general easier to recover ideas from the network structure than vice versa.