Corpus linguistics is 'the study of language based on examples of real life language use' (McEnery and Wilson 1996: 1), with the examples collected, stored and analysed as a corpus (pl. corpora). Corpora can run into millions or even billions of words, and therefore require specialised software to analyse them quantitatively and qualitatively. Corpus linguistics is a set of methods and procedures that can be applied to the range of texts and contexts that forensic linguists may be interested in examining.
Since the advent of modern-day corpus linguistics, many fields have benefitted from its ability to identify patterns in text, add evidence to support qualitative analyses and explore large datasets in ways not previously possible. However, uptake in forensic linguistics has been relatively slow. This is likely due to a number of factors, not least that the types of data forensic linguists work with are rarely abundant. Whether they are courtroom or police interview transcripts, or evidential texts such as text messages, emails, letters or threats, the data (at least in most parts of the world) are scarce, and many researchers spend years sourcing and collecting precious datasets, often after developing close working relationships with organisations or individuals who have access to data that are otherwise in short supply.
Some of the earliest and most seminal work in forensic linguistics is corpus-based. In the work which coined the term 'forensic linguistics', Svartvik (1968) used a corpus approach to analyse a set of disputed witness statements in a murder case. Similarly, in his analysis of the Derek Bentley statement, a watershed case for forensic linguistics, Coulthard (1994) used specialised corpora of ordinary witness statements and police statements, along with the much larger spoken element of the COBUILD corpus, to question the authorship of Bentley's disputed statement.
With a few notable exceptions, including early adopters of corpus techniques such as Kredens (2002) in authorship analysis and Cotterill (2003) and Heffer (2005) in courtroom discourse analysis, there was relatively little corpus linguistic work in forensic linguistics in the twenty years following Coulthard (1994). The second decade of the twenty-first century, however, has seen a healthy increase in the amount of corpus-based