The jury trial is a critical point where the state and its citizens come together to define the limits of acceptable behavior. Here we present a large-scale quantitative analysis of trial transcripts from the Old Bailey that reveals a major transition in the nature of this defining moment. By coarse-graining the spoken-word testimony into synonym sets and dividing the trials based on indictment, we demonstrate the emergence of semantically distinct violent and nonviolent trial genres. We show that although in the late 18th century the semantic content of trials for violent offenses is functionally indistinguishable from that for nonviolent ones, a long-term, secular trend drives the system toward increasingly clear distinctions between violent and nonviolent acts. We separate this process into the shifting patterns that drive it, determine the relative effects of bureaucratic change and broader cultural shifts, and identify the synonym sets most responsible for the eventual genre distinguishability. This work provides a new window onto the cultural and institutional changes that accompany the monopolization of violence by the state, described in qualitative historical analysis as the civilizing process. Part of a formal theory of cultural development designed to explain the emergence of the modern Western state, this civilizing process is taken to include a wide variety of forms of interpersonal relationships ranging from the rise of the concept of politeness to the relationships between classes.
The core claim of the theory is that the state effectively monopolized the use of violence over the course of the 16th to 20th centuries, becoming an important actor in both the control of the cultures that encouraged violence, and in the direct policing and control of violence itself. The bureaucracies that characterize this shift undertook information gathering on an unprecedented scale, designed in part to inform later decision making, and the digitization of these records makes it newly possible to study the civilizing process in a quantitative fashion. The data here come from the detailed records of the Central Criminal Court, or Old Bailey, in London (11, 12). The Old Bailey has heard trials for serious crimes in London and the surrounding counties since the 16th century, and forms one of the longest-running bureaucracies in the modern Western world. We analyze the 112,485 trial records, encompassing more than 20 million (semantic) words of testimony recorded between 1760 and 1913, a period during which trial reports were at their most comprehensive. We focus on the lexical semantics of spoken testimony: the meaning-laden words used by speakers that can be grouped as synonyms at different thresholds of similarity. Our methods allow us to study the explicitly named semantic structures of these texts over more than two orders of magnitude in resolution, from the word-stem level (2.6 × 10^4 categories) to a synonym set level, with 1,040 categories, to a highly coarse-grained representation with only 116 categories. We report...
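The coarse-graining step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy word counts, the three-category `synsets` mapping, and the use of the Jensen-Shannon divergence as the measure of genre distinguishability (the text above does not name the measure used). It shows word counts being summed into synonym-set counts, and it checks a general property of this divergence: merging categories can only lower it or leave it unchanged.

```python
import math
from collections import Counter


def coarse_grain(counts, mapping):
    """Sum word counts into category counts; unmapped words keep their own label."""
    out = Counter()
    for word, n in counts.items():
        out[mapping.get(word, word)] += n
    return out


def normalize(counts, support):
    """Turn counts into a probability vector over a fixed, shared support."""
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in support]


def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jsd(counts_a, counts_b):
    """Jensen-Shannon divergence (bits, equal weights) between two count tables."""
    support = sorted(set(counts_a) | set(counts_b))
    p = normalize(counts_a, support)
    q = normalize(counts_b, support)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


# Hypothetical toy data: word-stem counts for two trial genres.
violent = Counter({"stab": 6, "knife": 5, "strike": 4, "purse": 1, "shilling": 1})
nonviolent = Counter({"stab": 1, "knife": 1, "strike": 1, "purse": 7, "shilling": 8})

# Hypothetical synonym-set mapping (not the categories used in the study).
synsets = {
    "stab": "ATTACK",
    "strike": "ATTACK",
    "knife": "WEAPON",
    "purse": "PROPERTY",
    "shilling": "PROPERTY",
}

fine = jsd(violent, nonviolent)
coarse = jsd(coarse_grain(violent, synsets), coarse_grain(nonviolent, synsets))
print(f"word level: {fine:.3f} bits, synonym-set level: {coarse:.3f} bits")
```

Comparing `fine` and `coarse` indicates how much of the distinction between the two toy genres survives at the lower resolution.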
We characterize the statistical bootstrap for the estimation of information-theoretic quantities from data, with particular reference to its use in the study of large-scale social phenomena. Our methods allow one to preserve, approximately, the underlying axiomatic relationships of information theory (in particular, consistency under arbitrary coarse-graining) that motivate use of these quantities in the first place, while providing reliability comparable to the state of the art for Bayesian estimators. We show how information-theoretic quantities allow for rigorous empirical study of the decision-making capacities of rational agents, and the time-asymmetric flows of information in distributed systems. We provide illustrative examples by reference to ongoing collaborative work on the semantic structure of the British Criminal Court system and the conflict dynamics of the contemporary Afghanistan insurgency.
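As a rough illustration of the general idea, the sketch below applies a textbook nonparametric bootstrap to the plug-in (maximum-likelihood) entropy estimate. It is a generic sketch under stated assumptions, not the specific procedure characterized in the abstract above: the function names, the number of replicates, and the standard bias correction `2 * plugin - mean(replicates)` are all choices made here for illustration.

```python
import math
import random
from collections import Counter


def plugin_entropy(samples):
    """Plug-in Shannon entropy (bits) from the empirical frequencies of samples."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


def bootstrap_entropy(samples, n_boot=500, seed=0):
    """Resample with replacement to estimate the bias and spread of plugin_entropy.

    Returns the raw plug-in estimate, a bias-corrected estimate using the
    standard bootstrap correction, and the bootstrap standard error.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    h_hat = plugin_entropy(samples)
    reps = [
        plugin_entropy(rng.choices(samples, k=len(samples)))
        for _ in range(n_boot)
    ]
    mean = sum(reps) / n_boot
    var = sum((r - mean) ** 2 for r in reps) / (n_boot - 1)
    return {
        "plugin": h_hat,
        "bias_corrected": 2 * h_hat - mean,
        "std_err": math.sqrt(var),
    }


# Hypothetical observations: 45 draws over three categories.
observations = list("aabbbcccc" * 5)
result = bootstrap_entropy(observations)
print(result)
```

The plug-in estimate tends to underestimate entropy for small samples, which is why the bootstrap replicates are used here both for an error bar and for a bias correction.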
This research proposes and evaluates a linguistically motivated approach to extracting temporal structure from text. Pairs of events in a verb-clause construction were considered, where the first event is a verb and the second event is the head of a clausal argument to that verb. All pairs of events in the TimeBank that participated in verb-clause constructions were selected and annotated with the labels BEFORE, OVERLAP and AFTER. The resulting corpus of 895 event-event temporal relations was then used to train a machine learning model. Using a combination of event-level features like tense and aspect with syntax-level features like the paths through the syntactic tree, support vector machine (SVM) models were trained that could identify new temporal relations with 89.2% accuracy. High-accuracy models like these are a first step towards automatic extraction of temporal structure from text.