This study examines the effect of political change on the use of written Scots during the eighteenth century. In particular, it compares a cross-section of texts from literate Scottish society with works by certain politically active authors who identified strongly as pro- or anti-Union following the creation of the British state in 1707. The proportion of Scots to English lexemes in their writing is explored using conditional inference trees and random forests in a small, purpose-built corpus. Use of Scots is shown to differ between the two groups, with specific extralinguistic factors encouraging or suppressing the presence of written Scots. Frequency of Scots is also found to be influenced by the political ideology of the politicised authors. These results are linked to the Scottish political scene during the eighteenth century, as well as to general processes of change over time.
This paper presents the new facilities provided in defoe, a parallel toolbox for querying a wealth of digitised newspapers and books at scale. defoe has been extended to work with further Natural Language Processing (NLP) tools such as the Edinburgh Geoparser, to store the preprocessed text in several storage facilities, and to support different types of queries and analyses. We have also extended the collection of XML schemas supported by defoe, increasing the versatility of the tool for the analysis of digital historical textual data at scale. Finally, we have conducted several studies in which we worked with humanities and social science researchers who posed complex and interesting questions to large-scale digital collections. Results show that defoe allows researchers to conduct their studies and obtain results faster, while all the large-scale text mining complexity is automatically handled by defoe.
This article explores the anglicisation of the Scots language between the sixteenth and eighteenth centuries, focusing on the variation between the orthographic clusters <quh-> and <wh-> found in relative and interrogative clause markers. Using modern statistical techniques, we provide the most comprehensive empirical analysis of this variation so far in the Helsinki Corpus of Older Scots (Meurman-Solin 1995). By combining the techniques of Variability-Based Neighbour Clustering (Gries & Hilpert 2008, 2010, 2012) with mixed-effects logistic regression modelling (Baayen et al. 2008), we uncover a trajectory of change different from that previously reported for this feature (Meurman-Solin 1993, 1997). We argue that by using modern methods of data reduction and statistical modelling, we can present a picture of language change in Scots that is more fine-grained than that offered by previous studies, which use only descriptive statistics.