In the last 15 years, statistical physics has provided a very successful framework for modelling complex networks. On the theoretical side, this approach has brought novel insights into a variety of physical phenomena, such as self-organisation, scale invariance, emergence of mixed distributions and ensemble non-equivalence, that display unconventional features on heterogeneous networks. At the same time, thanks to their deep connection with information theory, statistical physics and the principle of maximum entropy have led to the definition of null models for networks that reproduce some features of real-world systems but are otherwise as random as possible. We review here the statistical physics approach and the various null models for complex networks, focusing in particular on the analytic frameworks that reproduce local network features. We then show how these models have been used to detect statistically significant and predictive structural patterns in real-world networks, as well as to reconstruct the network structure when information is incomplete. We further survey the statistical physics models that reproduce more complex, semi-local network features using Markov chain Monte Carlo sampling, as well as models of generalised network structures such as multiplex networks, interacting networks and simplicial complexes.
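As a minimal illustration of how such maximum-entropy null models constrain local features, the sketch below implements the binary (undirected) configuration model, in which each node i carries a Lagrange multiplier x_i and nodes i and j are linked with probability p_ij = x_i x_j / (1 + x_i x_j). The multiplier values here are hypothetical placeholders; in applications they are fitted so that each node's expected degree matches the observed one.

```python
import itertools
import random

def link_probability(x_i, x_j):
    # Maximum-entropy connection probability of the binary configuration model
    return x_i * x_j / (1.0 + x_i * x_j)

def expected_degrees(x):
    # Expected degree of each node in the ensemble: <k_i> = sum_{j != i} p_ij
    n = len(x)
    k = [0.0] * n
    for i, j in itertools.combinations(range(n), 2):
        p = link_probability(x[i], x[j])
        k[i] += p
        k[j] += p
    return k

def sample_graph(x, seed=0):
    # Draw one network from the ensemble: each link is an independent Bernoulli trial
    rng = random.Random(seed)
    return [(i, j) for i, j in itertools.combinations(range(len(x)), 2)
            if rng.random() < link_probability(x[i], x[j])]

x = [0.2, 0.5, 1.0, 2.0]   # hypothetical fitted multipliers, one per node
k = expected_degrees(x)
edges = sample_graph(x)
```

Averaging any topological property over many such samples gives its null-model expectation, against which empirical values can be compared.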
We show that to explain the growth of the citation network by preferential attachment (PA), one has to accept that individual nodes exhibit heterogeneous fitness values that decay with time. While previous PA-based models assumed either heterogeneity or decay in isolation, we propose a simple, analytically treatable model that combines these two factors. Depending on the input assumptions, the resulting degree distribution shows an exponential, log-normal or power-law decay, which makes the model an apt candidate for modeling a wide range of real systems.

Over the years, models with preferential attachment (PA) were independently proposed to explain the distribution of the number of species in a genus [1], the power-law distribution of the number of citations received by scientific papers [2], and the number of links pointing to World Wide Web (WWW) pages [3]. A theoretical description of this class of processes and the observation that they generally lead to power-law distributions are due to Simon [4]. Notably, the application of PA to WWW data by Barabási and Albert helped to initiate the lively field of complex networks [5]. Their network model, which stands at the center of attention of this work, has been much studied and generalized to include effects such as the presence of purely random connections [6], nonlinear dependence on the degree [7], node fitness [8], and others ([9], Chap. 8). Despite its success in providing a common roof for many theoretical models and empirical data sets, preferential attachment still takes little account of the temporal effects of network growth. For example, it predicts a strong relation between a node's age and its degree. While such a first-mover advantage [10] plays a fundamental role in the emergence of scale-free topologies in the model, it is a rather unrealistic feature for several real systems (e.g., it is entirely absent in the WWW [11], and significant deviations are found in citation data [10,12]).
This motivates us to study a model of a growing network where a broad degree distribution does not result from a strong time bias in the system. To this end we assign a fitness to each node and assume that this fitness decays with time; we refer to it as relevance henceforth. Instead of simply classifying the vertices as active or inactive, as done in [13,14], we use real data to investigate the distribution of relevance and its decay, and build a model where decaying and heterogeneous relevance are combined. Models with decaying fitness values ("aging") were shown to produce narrow degree distributions (except for very slow decay) [15], while widely distributed fitness values were shown to produce extremely broad distributions or even a condensation phenomenon where a single node attracts a macroscopic fraction of all links [16]. We show that when these two effects act together, they produce various classes of behavior, many of which are compatible with structures observed in real data sets. Before specifying a model and attempting to solve it, we turn to data to provide sup...
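A toy simulation can illustrate how heterogeneity and decay combine: each new node attaches to an existing node i with probability proportional to k_i * eta_i * exp(-(t - t_i)/tau), i.e. degree times a decaying relevance term. The exponential decay form, the exponential fitness distribution and the parameter values below are illustrative assumptions of this sketch, not necessarily the paper's choices (which are treated analytically).

```python
import math
import random

def grow_network(n_nodes, tau=20.0, seed=0):
    """Grow a network where attractiveness = degree * fitness * temporal decay.

    eta_i: heterogeneous fitness (drawn here from an exponential distribution),
    exp(-(t - t_i)/tau): relevance decay since the node's birth time t_i.
    """
    rng = random.Random(seed)
    degree = [1]                  # seed node; degree 1 avoids zero attachment weight
    eta = [rng.expovariate(1.0)]
    birth = [0]
    for t in range(1, n_nodes):
        weights = [degree[i] * eta[i] * math.exp(-(t - birth[i]) / tau)
                   for i in range(len(degree))]
        target = rng.choices(range(len(degree)), weights=weights)[0]
        degree[target] += 1       # existing node receives the new link
        degree.append(1)          # newcomer arrives with one outgoing link
        eta.append(rng.expovariate(1.0))
        birth.append(t)
    return degree

degrees = grow_network(500)
```

Varying tau and the fitness distribution moves the resulting degree distribution between the narrow, log-normal-like and broad regimes discussed in the text.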
We address a fundamental problem that is systematically encountered when modeling real-world complex systems of societal relevance: the limitedness of the available information. In the case of economic and financial networks, privacy issues severely limit the information that can be accessed and, as a consequence, the possibility of correctly estimating the resilience of these systems to events such as financial shocks, crises and cascade failures. Here we present an innovative method to reconstruct the structure of such partially accessible systems, based on the knowledge of intrinsic node-specific properties and of the number of connections of only a limited subset of nodes. This information is used to calibrate an inference procedure based on fundamental concepts from statistical physics, which generates ensembles of directed weighted networks intended to represent the real system, so that the real network properties can be estimated as their average values within the ensemble. We test the method on both synthetic and empirical networks, focusing on the properties commonly used to measure systemic risk. The method shows a remarkable robustness with respect to the limitedness of the available information, thus representing a valuable tool for gaining insight into privacy-protected economic and financial systems.
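The calibration step of such a procedure can be sketched as follows, under the standard fitness ansatz in which the connection probability takes the maximum-entropy form p_ij = z x_i x_j / (1 + z x_i x_j), with x_i an intrinsic node property and z a single free parameter tuned so that the expected degrees of the observed subset match their known values. Function and variable names are ours, and the log-space bisection is one possible solver, not the paper's prescription.

```python
import itertools

def p_link(z, x_i, x_j):
    # Fitness-induced connection probability (maximum-entropy functional form)
    return z * x_i * x_j / (1.0 + z * x_i * x_j)

def calibrate(fitness, observed, lo=1e-10, hi=1e10, n_iter=200):
    """Tune z so that the expected degrees of the observed nodes match,
    in aggregate, their known degrees.

    fitness:  intrinsic node-specific properties (e.g. total exposures)
    observed: dict mapping a node index to its known degree
    """
    n = len(fitness)
    def expected_degree_sum(z):
        return sum(p_link(z, fitness[i], fitness[j])
                   for i in observed for j in range(n) if j != i)
    target = sum(observed.values())
    for _ in range(n_iter):       # bisection in log space on a monotone function
        mid = (lo * hi) ** 0.5
        if expected_degree_sum(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

fitness = [10.0, 5.0, 3.0, 2.0, 1.0, 0.5]   # hypothetical node properties
z = calibrate(fitness, observed={0: 4, 1: 3})
```

Once z is fixed, the full ensemble of networks is generated by drawing each link independently with probability p_ij, and any systemic property is estimated as its ensemble average.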
Common asset holdings by financial institutions (portfolio overlaps) are nowadays regarded as an important channel for financial contagion, with the potential to trigger fire sales and severe losses at the systemic level. We propose a method to assess the statistical significance of the overlap between heterogeneously diversified portfolios, which we use to build a validated network of financial institutions in which links indicate potential contagion channels. The method is implemented on a historical database of institutional holdings ranging from 1999 to the end of 2013, but can be applied to any bipartite network. We find that the proportion of validated links (i.e. of significant overlaps) increased steadily before the 2007–2008 financial crisis and reached a maximum when the crisis occurred. We argue that the nature of this measure implies that systemic risk from fire-sale liquidation was maximal at that time. After a sharp drop in 2008, systemic risk resumed its growth in 2009, with a notable acceleration in 2013. We finally show that market trends tend to be amplified in the portfolios identified by the algorithm, providing an informative signal about the institutions that are about to suffer (enjoy) the most significant losses (gains).
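The core of such a validation is a one-sided test of the observed overlap against a null model of random diversification. A minimal sketch, assuming a plain hypergeometric null (two portfolios of d_i and d_j assets drawn independently from n_assets in total; the paper's null model may additionally constrain asset popularity):

```python
from math import comb

def overlap_pvalue(n_assets, d_i, d_j, overlap):
    """P-value of observing at least `overlap` common assets between two
    portfolios of sizes d_i and d_j, under random independent diversification.

    P(X >= overlap) for X ~ Hypergeometric(n_assets, d_i, d_j).
    """
    total = comb(n_assets, d_j)
    p = 0.0
    for x in range(overlap, min(d_i, d_j) + 1):
        # comb() returns 0 automatically for infeasible terms
        p += comb(d_i, x) * comb(n_assets - d_i, d_j - x) / total
    return p

p = overlap_pvalue(100, 10, 10, 5)   # 5 shared assets out of 100: highly atypical
```

A link between two institutions is then kept in the validated network only if its p-value survives a chosen significance threshold, typically with a multiple-hypothesis correction across all pairs.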
The study of social, economic and biological systems is often (when not always) limited by partial information about the structure of the underlying networks. An example of paramount importance is provided by financial systems: information on the interconnections between financial institutions is privacy-protected, dramatically reducing the possibility of correctly estimating crucial systemic properties such as the resilience to the propagation of shocks. The need to compensate for the scarcity of data, while optimally employing the available information, has led to the birth of a research field known as network reconstruction. Since the latter has benefited from the contributions of researchers working in disciplines as different as mathematics, physics and economics, the results achieved so far are still scattered across heterogeneous publications. Most importantly, a systematic comparison of the network reconstruction methods proposed up to now is currently missing. This review aims at providing a unifying framework in which to present all these studies, mainly focusing on their application to economic and financial networks.

The problem of missing information. After an initial activity aimed at determining the structure of real-world networks by measuring standard topological quantities, a more theoretical activity started, aiming both at defining new quantities and at devising proper models to explain the observations [45,44,52,53,54]. Given the complexity that can arise even from a simple mathematical model based upon graphs, researchers have recently focused on the development of a topological theory: loosely speaking, topological quantities are employed to define statistical models, rather than reproduced from microscopic dynamical rules [55,56,57,58,59]. Unfortunately, when moving to the validation of such models a common problem arises: very often, the available data on the real network are incomplete, imprecise, or both.
This problem is particularly evident in the case of economic and financial networks, where data collection suffers from partial accounting and disclosure restrictions. To illustrate the importance of this issue, consider a bipartite financial network whose node sets represent investors and the investments they make. Although knowledge of the whole network structure could help regulators take immediate countermeasures to stop the propagation of financial distress, this information is seldom available (knowledge of the whole network of investments would pose immense privacy problems), thus hindering the possibility of providing a realistic estimate of the extent of the contagion. As confirmed by the analysis of the various papers reported in this review, the incompleteness of network instances seems to be unavoidable [60,61]: since the resilience of financial networks cannot be estimated without knowing the structural details of national and cross-country interbank networks, inf...
We use citation data of scientific articles produced by individual nations in different scientific domains to determine the structure and efficiency of national research systems. We characterize the scientific fitness of each nation—that is, the competitiveness of its research system—and the complexity of each scientific domain by means of a non-linear iterative algorithm able to quantitatively assess the advantage of scientific diversification. We find that technologically leading nations, beyond having the largest production of scientific papers and the largest number of citations, do not specialize in a few scientific domains. Rather, they diversify their research systems as much as possible. On the other hand, less developed nations are competitive only in scientific domains where many other nations are also present. Diversification thus represents the key element that correlates with scientific and technological competitiveness. A remarkable implication of this structure of scientific competition is that the scientific domains playing the role of "markers" of national scientific competitiveness are not necessarily those with the highest technological requirements, but rather those addressing the most "sophisticated" needs of society.
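Non-linear iterative algorithms of this kind typically alternate two coupled updates on the binary nation-domain matrix M: a nation's fitness grows with the complexity of the domains it covers, while a domain's complexity is suppressed by the presence of low-fitness nations. The sketch below follows the standard fitness-complexity update rules; the toy matrix, iteration count and mean-one normalisation are our illustrative choices.

```python
def fitness_complexity(M, n_iter=50):
    """Iterate the fitness-complexity map on a binary matrix M[c][p]
    (nation c is competitive in domain p).

    F_c <- sum_p M[c][p] * Q_p            (diversification rewards fitness)
    Q_p <- 1 / sum_c (M[c][p] / F_c)      (ubiquity among weak nations lowers complexity)
    """
    n_c, n_p = len(M), len(M[0])
    F = [1.0] * n_c
    Q = [1.0] * n_p
    for _ in range(n_iter):
        F_new = [sum(M[c][p] * Q[p] for p in range(n_p)) for c in range(n_c)]
        Q_new = [1.0 / sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        # normalise to unit mean at each step so the fixed point is well defined
        mF = sum(F_new) / n_c
        mQ = sum(Q_new) / n_p
        F = [f / mF for f in F_new]
        Q = [q / mQ for q in Q_new]
    return F, Q

# Toy nested matrix: nation 0 is fully diversified, nation 2 is active
# only in the most ubiquitous domain.
M = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0]]
F, Q = fitness_complexity(M)
```

On this toy input the diversified nation ends up with the highest fitness, and the domain held only by that nation with the highest complexity, reflecting the mechanism described in the abstract.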
We show that the space in which scientific, technological and economic activities interact with each other can be mathematically shaped using techniques from the statistical physics of networks. We build a holistic view of the innovation system as the tri-layered network of interactions among these many activities (scientific publication, patenting, and industrial production in different sectors), also taking into account possible time delays. Within this construction we can identify which capabilities and prerequisites are needed to be competitive in a given activity, and even measure how much time is needed to transform, for instance, technological know-how into economic wealth and scientific innovation, enabling predictions with a very long time horizon. We find empirical evidence that, at the aggregate scale, technology is the best predictor of industrial and scientific production over the upcoming decades.
A problem typically encountered when studying complex systems is the limitedness of the information available on their topology, which hinders our understanding of their structure and of the dynamical processes taking place on them. A paramount example is provided by financial networks, whose data are privacy-protected: banks publicly disclose only their aggregate exposure towards other banks, keeping individual exposures towards each single bank secret. Yet, the estimation of systemic risk strongly depends on the detailed structure of the interbank network. The resulting challenge is that of using aggregate information to statistically reconstruct a network and correctly predict its higher-order properties. Standard approaches either generate unrealistically dense networks, or fail to reproduce the observed topology by assigning homogeneous link weights. Here, we develop a reconstruction method, based on statistical-mechanics concepts, that makes use of the empirical link density in a highly nontrivial way. Technically, our approach consists in the preliminary estimation of node degrees from empirical node strengths and link density, followed by a maximum-entropy inference based on a combination of empirical strengths and estimated degrees. Our method is successfully tested on the international trade network and the interbank money market, and represents a valuable tool for gaining insight into privacy-protected or partially accessible systems.

Reconstructing the statistical properties of a network when only partial information is available represents a key open problem in the statistical physics of complex systems [1,2]. Yet, addressing this issue can lead to many concrete applications. A paramount example is provided by financial networks, where nodes represent financial institutions and links stand for the various types of financial ties, such as loans or derivative contracts.
These ties result in dependencies among institutions and constitute the ground for the propagation of financial distress across the network. However, due to confidentiality issues, the information that regulators are able to collect on mutual exposures is very limited [3], hindering the analysis of the system's resilience to the spreading of financial distress, which depends on the structure of the whole network [4,5]. Typically, the analysis of systemic risk has been pursued by trying to estimate the unknown link weights of the network via a maximum-homogeneity principle [6][7][8], looking for the adjacency matrix that has minimal distance from the uniform matrix while satisfying the imposed constraints (e.g., the budgets of individual banks). These approaches are also known as dense reconstruction methods, as they assume that the network is fully connected, a hypothesis that represents their strongest limitation. In fact, not only do empirical networks show a very heterogeneous distribution of connectivity, but such a dense reconstruction also leads to an underestimation of systemic risk [2,8]. More refined methods such as sparse reconst...
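The two-step scheme described in the abstract above (degrees estimated from strengths and density, then maximum-entropy inference) can be sketched in a simplified undirected form. We assume the fitness ansatz p_ij = z s_i s_j / (1 + z s_i s_j) with strengths s_i playing the role of fitnesses, calibrate z against the known number of links, and assign each potential link the weight s_i s_j / (S p_ij) conditional on its existence, which approximately preserves expected strengths. Symbols and the bisection solver are ours; this is a sketch of the idea, not the paper's exact directed formulation.

```python
import itertools

def p_link(z, s_i, s_j):
    # Connection probability of the fitness ansatz, with strengths as fitnesses
    return z * s_i * s_j / (1.0 + z * s_i * s_j)

def calibrate_density(strengths, n_links, lo=1e-12, hi=1e12, n_iter=200):
    # Bisect in log space for the z that reproduces the empirical link density
    def expected_links(z):
        return sum(p_link(z, si, sj)
                   for si, sj in itertools.combinations(strengths, 2))
    for _ in range(n_iter):
        mid = (lo * hi) ** 0.5
        lo, hi = (mid, hi) if expected_links(mid) < n_links else (lo, mid)
    return (lo * hi) ** 0.5

def reconstruct(strengths, n_links):
    """Step (i): estimate degrees from strengths and density via the
    calibrated ansatz. Step (ii): conditional on a link existing, give it
    weight s_i s_j / (S * p_ij), so that expected strengths are
    approximately matched (degree-corrected gravity form)."""
    z = calibrate_density(strengths, n_links)
    S = sum(strengths)
    n = len(strengths)
    est_degrees = [sum(p_link(z, strengths[i], strengths[j])
                       for j in range(n) if j != i) for i in range(n)]
    weights = {(i, j): strengths[i] * strengths[j]
                       / (S * p_link(z, strengths[i], strengths[j]))
               for i, j in itertools.combinations(range(n), 2)}
    return z, est_degrees, weights

z, k_est, w = reconstruct([5.0, 3.0, 2.0, 1.0, 0.5], n_links=4)
```

By construction the estimated degrees sum to twice the imposed number of links, so the reconstructed ensemble reproduces the empirical density rather than the unrealistically dense topology of homogeneity-based methods.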