2017
DOI: 10.1093/llc/fqx023

Understanding and explaining Delta measures for authorship attribution

Cited by 124 publications (52 citation statements)
References 12 publications
“…The resulting clusters of individual languages are displayed in a dendrogram. We used an agglomerative clustering method very popular in authorship attribution and stylometric studies [25] based on the Delta measure [26] and Ward linkage method [27].…”
Section: Language Clustering (citation type: mentioning)
confidence: 99%
“…Evert et al [26] point out how Burrows' Delta is a distance measure, i.e., "it describes the distance between one text and a group of texts". Hence, the smaller the Delta score, the smaller the distance between the texts, and the more similar these texts are stylistically.…”
Section: Syntactic Divergence (citation type: mentioning)
confidence: 99%
“…Hence, the smaller the Delta score, the smaller the distance between the texts, and the more similar these texts are stylistically. We will now briefly explain how Burrows' Delta is calculated, but, for a more extensive explanation and formulae, refer to the work of Burrows himself [12] and Evert et al [26]. In short, in order to compare one text (text a) to a corpus consisting of various other texts (texts b and c), the n most frequent words from that corpus are collected.…”
Section: Syntactic Divergence (citation type: mentioning)
confidence: 99%
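
The calculation outlined in the statement above can be sketched in a few lines of Python. The quoted passage breaks off after the first step (collecting the n most frequent words), so the remaining steps below are an assumption based on the commonly cited formulation of Burrows' Delta: relative frequencies are converted to z-scores across the corpus, and Delta is the mean absolute difference between z-scores. The function names, the whitespace tokenisation, and the use of the population standard deviation are illustrative choices, not taken from the cited papers.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in one tokenised text."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in vocab]


def burrows_delta(target, corpus, n=100):
    """Return {name: Delta} between `target` and every text in `corpus`.

    `target` is a list of tokens; `corpus` maps a text name to its tokens.
    Both names are hypothetical, chosen for this sketch.
    """
    # Step named in the quoted passage: the n most frequent corpus words.
    pooled = Counter()
    for tokens in corpus.values():
        pooled.update(tokens)
    vocab = [word for word, _ in pooled.most_common(n)]

    # Assumed remaining steps: z-score each word, then average |differences|.
    profiles = {name: relative_freqs(t, vocab) for name, t in corpus.items()}
    target_profile = relative_freqs(target, vocab)

    diffs = {name: [] for name in corpus}
    for i in range(len(vocab)):
        column = [profile[i] for profile in profiles.values()]
        mu, sigma = mean(column), pstdev(column)  # population std: an assumption
        if sigma == 0:
            continue  # word has identical frequency everywhere; no contrast
        z_target = (target_profile[i] - mu) / sigma
        for name, profile in profiles.items():
            diffs[name].append(abs(z_target - (profile[i] - mu) / sigma))
    return {name: mean(d) if d else 0.0 for name, d in diffs.items()}


# Toy data: text a compared against a corpus of texts b and c.
corpus = {
    "b": "the cat and the dog and the bird".split(),
    "c": "the sun of the sky of the sea of the day".split(),
}
text_a = "the cat and the bird and the fox".split()
scores = burrows_delta(text_a, corpus, n=3)
print(min(scores, key=scores.get))  # name of the stylistically closest text
```

As the statement notes, a smaller score means a smaller distance, so the text with the minimum Delta is the closest stylistic match. Real studies use far longer texts and a much larger n than this toy example.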
“…Third, different distance-based measures have been suggested. As well-known strategies, one can mention Burrows' Delta (Burrows, 2002; Evert et al, 2017) using the top m most frequent word-tokens (with m = 40 to 1,000), the Kullback-Leibler divergence (Zhao & Zobel, 2007) using a predefined set of 363 English words, or Labbé's method (Labbé, 2014) based on the entire vocabulary and opting for a variant of the Tanimoto distance, an approach found effective for Authorship Attribution (AA; Kocher & Savoy, 2017b).…”
Section: State of the Art (citation type: mentioning)
confidence: 99%