Literature is a form of expression whose temporal structure, both in content and style, provides a historical record of the evolution of culture. In this work we take on a quantitative analysis of literary style and conduct the first large-scale temporal stylometric study of literature, using the vast holdings of the Project Gutenberg Digital Library corpus. Analyzing the similarity structure of feature vectors derived from content-free word usage, we find temporal stylistic localization among authors, nonhomogeneous decay rates of stylistic influence, and an accelerating rate of decay of influence among modern authors. Within a given time period we also find evidence for stylistic coherence with a given literary topic, such that writers in different fields adopt different literary styles. This study gives quantitative support to the notion of a literary "style of a time," with a strong trend toward increasingly contemporaneous stylistic influence.

Keywords: cultural evolution | stylometry | culture | complexity | big data

Written works, or literature, provide one of the great bodies of cultural artifacts. The analysis of literature typically involves the aggregation of information on several levels, ranging from words to sentences and even larger-scale properties of temporal narratives such as structure, plot, and the use of irony and metaphor (1-3). Quantitative methods have long been applied to literature, most notably in the analysis of style, which can be traced back to a comment by the mathematician Augustus De Morgan regarding the attribution of the Pauline epistles (4) and the late nineteenth-century work of the historian of philosophy Wincenty Lutosławski, who brought basic statistical ideas of word usage to the problem of dating the dialogues of Plato (5). It was Lutosławski who coined the word "stylometry" to describe such an approach to investigating questions of literary style.
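The content-free word-usage features mentioned above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function-word list, tokenization, and similarity measure here are assumptions chosen for brevity.

```python
from collections import Counter
import math

# Hypothetical list of content-free ("function") words; the study's actual
# feature set is much larger and is not reproduced here.
FUNCTION_WORDS = ["the", "of", "and", "a", "to", "in", "that", "it", "is", "was"]

def style_vector(text):
    """Relative frequency of each function word in a text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Comparing such vectors across authors and publication dates is what allows similarity structure to be examined as a function of time.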
Since then, a wide range of statistical techniques have been developed toward this end (6), generally with the goal of settling questions of author attribution (see, e.g., refs. 6-11). Stylometric studies have also been pursued in the study of visual art (12, 13) and music [both in composition (14-16) and performance (17)], and are part of a growing body of work in the quantitative analysis of cultural artifacts (18). In this paper we report our findings from the first large-scale stylometric analysis of literature. The goal of this work is not author attribution (the authorship of all the works is well known) but rather to articulate, in a quantitative fashion, large-scale temporal trends in literary (i.e., writing) style. A study of this type has until now been impossible to undertake, but the advent of mass digitization has created dramatic new opportunities for scholarly studies in literature, as well as in other disciplines (19). Our literature sample is obtained from the Project Gutenberg Digital Library (http://www.gutenberg.org/wiki/Gutenberg:About). Project Gutenberg consists of more than 30,000 public domain texts.
Many real-world networks tend to be very dense. Particular examples of interest arise in the construction of networks that represent pairwise similarities between objects. In these cases, the networks under consideration are weighted, generally with positive weights between any two nodes. Visualization and analysis of such networks, especially when the number of nodes is large, can pose significant challenges, which are often met by reducing the edge set. Any effective “sparsification” must retain and reflect the important structure in the network. A common method is to simply apply a hard threshold, keeping only those edges whose weight exceeds some predetermined value. A more principled approach is to extract the multiscale “backbone” of a network by retaining statistically significant edges through hypothesis testing against a specific null model, or by appropriately transforming the original weight matrix before applying some sort of threshold. Unfortunately, approaches such as these can fail to capture multiscale structure in which there can be small but locally statistically significant similarities between nodes. In this paper, we introduce a new method for backbone extraction that does not rely on any particular null model but instead uses the empirical distribution of similarity weights to determine, and then retain, statistically significant edges. We show that our method adapts to the heterogeneity of local edge-weight distributions in several paradigmatic real-world networks, and in doing so retains their multiscale structure at relatively insignificant additional computational cost. We anticipate that this simple approach will be of great use in the analysis of massive, highly connected weighted networks.
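The idea of a locally adaptive, empirical-distribution backbone can be sketched as below. This is a simplified stand-in for the published method (whose exact significance test is not reproduced here): an edge is kept when its weight exceeds the empirical (1 − alpha) quantile of the incident weights at either endpoint, so a globally small weight can survive if it is locally significant.

```python
# A minimal sketch of locally adaptive backbone extraction, assuming the
# graph is stored as a symmetric dict-of-dicts: weights[i][j] = w_ij > 0.
def empirical_backbone(weights, alpha=0.1):
    """Keep edge (i, j) if its weight meets the empirical (1 - alpha)
    quantile of the incident edge weights at node i or node j."""
    def local_threshold(node):
        w = sorted(weights[node].values())
        k = min(len(w) - 1, int((1 - alpha) * len(w)))
        return w[k]

    thresholds = {n: local_threshold(n) for n in weights}
    backbone = set()
    for i in weights:
        for j, wij in weights[i].items():
            if wij >= thresholds[i] or wij >= thresholds[j]:
                backbone.add(frozenset((i, j)))  # undirected edge
    return backbone
```

In a toy graph where nodes c and d share a weight of 0.5 amid 1.0-weight edges elsewhere, a global threshold of 0.9 would discard c–d, while this local rule retains it, which is the multiscale behavior described above.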
The World Trade Web (WTW) is a weighted network whose nodes correspond to countries with edge weights reflecting the value of imports and/or exports between countries. In this paper we introduce to this macroeconomic system the notion of extinction analysis, a technique often used in the analysis of ecosystems, for the purposes of investigating the robustness of this network. In particular, we subject the WTW to a principled set of in silico "knockout experiments," akin to those carried out in the investigation of food webs, but suitably adapted to this macroeconomic network. Broadly, our experiments show that over time the WTW moves to a "robust yet fragile" configuration where it is robust to random failures but fragile under targeted attack. This change in stability is highly correlated with the connectance (edge density) of the network. Moreover, there is evidence of a sharp change in the structure of the network in the 1960s and 1970s, where most measures of robustness rapidly increase before resuming a declining trend. We interpret these results in the context of the post-World War II move towards globalization. Globalization coincides with the sharp increase in robustness but also with a rise in those measures (e.g., connectance and trade imbalances) that correlate with decreases in robustness. The peak of robustness is reached after the onset of globalization policy but before the negative impacts are substantial. These analyses depend on a simple model of dynamics that rebalances the trade flow upon network perturbation, the most dramatic of which is node deletion. More subtle and textured forms of perturbation lead to the definition of other measures of node importance as well as vulnerability. We anticipate that experiments and measures like these can play an important role in the evaluation of the stability of economic systems.
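The basic shape of a knockout experiment can be sketched as below. This is a bare-bones illustration only: it deletes nodes and tracks surviving trade value, and deliberately omits the paper's flow-rebalancing dynamics; the data structure and country codes are assumptions.

```python
# Toy knockout experiment on a trade network stored as a dict mapping
# country -> {partner: trade value}. The paper's rebalancing model is
# NOT implemented; we only measure trade value surviving node deletion.
def total_trade(network, removed):
    """Sum of all edge values not touching a removed node."""
    return sum(w for i in network if i not in removed
                 for j, w in network[i].items() if j not in removed)

def knockout_curve(network, order):
    """Fraction of baseline trade value surviving as nodes are
    deleted one by one in the given order."""
    base = total_trade(network, set())
    removed, curve = set(), []
    for node in order:
        removed.add(node)
        curve.append(total_trade(network, removed) / base)
    return curve
```

Running the curve with a random deletion order versus a targeted order (largest traders first) and comparing how quickly trade value collapses is the sense in which "robust yet fragile" configurations can be probed.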
Determining the impact of non-pharmaceutical interventions on transmission of the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is paramount for the design and deployment of effective public health policies. Incorporating Apple Maps mobility data into an epidemiological model of daily deaths and hospitalizations allowed us to estimate an explicit relationship between human mobility and transmission in the United States. We find that reduced mobility explains a large decrease in the effective reproductive number (R_E) attained by April 1st, and we further identify state-to-state variation.
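The authors' actual model is not reproduced here; as a purely illustrative assumption, one can imagine a toy discrete-time SIR in which the transmission rate scales linearly with a mobility index m(t), so that R_E(t) = R0 · m(t) · S(t). The sketch below shows only how mobility data could modulate the effective reproductive number in such a setup.

```python
# Illustrative toy model (NOT the authors' model): discrete-time SIR with
# transmission rate beta(t) = R0 * gamma * m(t), where m(t) is a mobility
# index (1.0 = baseline mobility). Fractions s, i are of the population.
def simulate_Re(R0, gamma, mobility_series, s0=0.999, i0=0.001):
    """Return the effective reproductive number R_E(t) over time."""
    s, i = s0, i0
    Re_series = []
    for m in mobility_series:
        beta = R0 * gamma * m
        Re_series.append(beta / gamma * s)  # R_E = (beta/gamma) * S
        new_inf = beta * s * i
        s -= new_inf
        i += new_inf - gamma * i
    return Re_series
```

Under this assumption, cutting the mobility index lowers R_E proportionally, which is the qualitative relationship the abstract describes between reduced mobility and the drop in R_E.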