Estimating the Shannon entropy of a discrete distribution from which we have only observed a small sample is challenging. Estimating other information-theoretic metrics, such as the Kullback-Leibler divergence between two sparsely sampled discrete distributions, is even harder. Existing approaches to address these problems have shortcomings: they are biased, heuristic, work only for some distributions, and/or cannot be applied to all information-theoretic metrics. Here, we propose a fast, semi-analytical estimator for sparsely sampled distributions that is efficient, precise, and general. Its derivation is grounded in probabilistic considerations and uses a hierarchical Bayesian approach to extract as much information as possible from the few observations available. Our approach provides estimates of the Shannon entropy with precision at least comparable to the state of the art, and most often better. It can also be used to obtain accurate estimates of any other information-theoretic metric, including the notoriously challenging Kullback-Leibler divergence. Here, again, our approach performs consistently better than existing estimators.
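To see why sparse sampling makes entropy estimation hard, consider the naive maximum-likelihood ("plug-in") estimator. This is not the estimator proposed in the abstract above, only a baseline. The sketch below uses an assumed toy setting (a uniform distribution over K = 1000 states, with hypothetical helper names) and compares the plug-in estimate with the true entropy for a small and a large sample. With n samples the plug-in estimate can never exceed log(n), so it necessarily underestimates the entropy when n is much smaller than K.

```python
import math
import random
from collections import Counter


def plugin_entropy(samples):
    """Maximum-likelihood ("plug-in") entropy estimate, in nats."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def mean_plugin_entropy(k, n, trials, seed=0):
    """Average plug-in estimate over repeated samples of size n
    drawn from a uniform distribution over k states (toy assumption)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.randrange(k) for _ in range(n)]
        total += plugin_entropy(samples)
    return total / trials


K = 1000                      # number of states (illustrative choice)
true_entropy = math.log(K)    # entropy of the uniform distribution, in nats

sparse = mean_plugin_entropy(K, n=100, trials=200)    # n << K
dense = mean_plugin_entropy(K, n=50_000, trials=2)    # n >> K

print(f"true entropy:        {true_entropy:.3f} nats")
print(f"plug-in, n = 100:    {sparse:.3f} nats")
print(f"plug-in, n = 50000:  {dense:.3f} nats")
```

In the sparse case the estimate is bounded by log(100), well below log(1000), whereas the large-sample estimate is close to the true value. Bias of this kind is what estimators designed for sparsely sampled distributions aim to remove.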
Terrestrial Transportation Infrastructures (TTIs) are shaped by both socio-political and geographical factors, and hence encode crucial information about how resources and power are distributed through a territory. Therefore, analysing the structure of pathway, railway or road networks allows us to gain a better understanding of the political and social organization of the communities that created and maintained them. Network science can provide extremely useful tools to address this issue quantitatively. Here, focussing on passenger transport, we propose a methodology to shed light on the processes and forces that moulded transportation infrastructures into their current configuration, without having to rely on any additional information besides the topology of the network and the distribution of the population. Our approach is based on a simple mechanistic model that implements a wide spectrum of decision-making mechanisms (representing different power distributions) which could have driven the growth of a TTI. Thus, by adjusting a few model parameters, it is possible to generate several synthetic transportation networks and to compare them with one another and with the empirical system under study. An illustrative case study (the railway system in Catalonia, a region in Spain) is also provided to showcase the application of the proposed methodology. Our preliminary results highlight the potential of our approach, thus calling for further research.
Laws and legal decision-making regulate how societies function. Therefore, they evolve and adapt to new social paradigms and reflect changes in culture and social norms, and are a good proxy for the evolution of socially sensitive issues. Here, we use an information-theoretic methodology to quantitatively track trends and shifts in the evolution of large corpora of judicial decisions, and thus to detect periods in which disruptive topics arise. When applied to a large database containing the full text of over 100,000 judicial decisions from Spanish courts, we are able to identify an abrupt change in housing-related decisions around 2016. Because our information-theoretic approach pinpoints the specific content that drives change, we are also able to interpret the results in terms of the role played by legislative changes, landmark decisions, and the influence of social movements.
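The abstract above does not name the specific information-theoretic measure used. As one common illustration of how such an approach can both quantify a shift between two periods and pinpoint the content driving it, the sketch below computes the Jensen-Shannon divergence between the word-frequency distributions of two token lists and decomposes it into per-word contributions. The function name and the toy token lists are hypothetical and are not taken from the paper or its data.

```python
import math
from collections import Counter


def jsd_contributions(tokens_a, tokens_b):
    """Per-word contributions to the Jensen-Shannon divergence (in bits)
    between the word-frequency distributions of two token lists."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    contrib = {}
    for word in set(counts_a) | set(counts_b):
        p = counts_a[word] / n_a
        q = counts_b[word] / n_b
        m = 0.5 * (p + q)          # mixture probability of the word
        term = 0.0
        if p > 0:
            term += 0.5 * p * math.log2(p / m)
        if q > 0:
            term += 0.5 * q * math.log2(q / m)
        contrib[word] = term       # each term is non-negative
    return contrib


# Toy token lists standing in for two periods of a corpus (made up).
before = "contract payment contract appeal payment".split()
after = "eviction mortgage contract eviction appeal eviction".split()

contrib = jsd_contributions(before, after)
total = sum(contrib.values())
top_word = max(contrib, key=contrib.get)

print(f"JSD = {total:.3f} bits; largest contribution from '{top_word}'")
```

Because the divergence is a sum of non-negative per-word terms, ranking the terms shows which words account for most of the change between the two periods.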
The structure and evolution of Terrestrial Transportation Infrastructures (TTIs) are shaped by both socio-political and geographical factors, and hence encode crucial information about how resources and power are distributed through a territory. Therefore, analysing pathway, railway or road networks allows us to gain a better understanding of the political and social organization of the communities that created and maintained them. Network science can provide extremely useful tools to address this issue quantitatively. Here, focusing on passenger transport, we first propose a methodology for mapping a TTI into a formal network object able to capture both the spatial distribution of the population and the connections provided by the means of transport considered. Second, we present a simple mechanistic model that implements a wide spectrum of decision-making mechanisms which could have driven the creation of links (connections). Thus, by adjusting a few parameters, it is possible to generate, for any empirical system, a synthetic counterpart such that the differences between the two are minimized. By means of such an inverse engineering approach, we are able to shed some light on the processes and forces that moulded transportation infrastructures into their current configuration, without having to rely on any additional information besides the topology of the network and the distribution of the population. An illustrative example is also provided to showcase the applications of the proposed methodology and to discuss how our conclusions fit with previously acquired knowledge (and literature) on the topic.