2015
DOI: 10.1515/9783110363814

Cluster Analysis for Corpus Linguistics

Cited by 49 publications (30 citation statements)
References 0 publications
“…This raises the question of how the researcher can know in advance how many groups are optimal. The obvious solution is to base the initial decision on a priori knowledge (Moisl, ). As described above, previous analyses of the Dutch data tentatively identified three groups: a large ‘instrumental’ group and two smaller ‘anglophile’ and ‘anti‐English’ groups (author).…”
Section: Methods (mentioning)
confidence: 99%
“…The complex set of individual frequency differences is transformed into a compact measure of similarity between the samples. There are many possible mathematical transformations that can be used as distance measures (Moisl, 2014); stylometry uses a dozen or so of them.…”
Section: Multidimensional Methods (mentioning)
confidence: 99%
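
The idea in the statement above, reducing many per-word frequency differences to one number per pair of samples, can be sketched as follows. This is a minimal illustration, not the method of the cited work: the word list, the two toy samples, and the helper names are all hypothetical, and Manhattan and Euclidean distance are used only as two familiar examples of distance measures.

```python
from collections import Counter
import math

def relative_frequencies(tokens, vocabulary):
    """Return the relative frequency of each vocabulary word in a token list."""
    if not tokens:
        raise ValueError("sample must contain at least one token")
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[word] / total for word in vocabulary]

def manhattan(p, q):
    """Sum of absolute per-word frequency differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def euclidean(p, q):
    """Square root of the summed squared per-word frequency differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical word list and toy samples, chosen only for illustration.
vocabulary = ["the", "of", "and"]
sample_a = "the cat and the dog".split()
sample_b = "the end of the road".split()

profile_a = relative_frequencies(sample_a, vocabulary)
profile_b = relative_frequencies(sample_b, vocabulary)

print(manhattan(profile_a, profile_b))
print(euclidean(profile_a, profile_b))
```

Each sample becomes a vector of relative frequencies, and either function collapses the vector of differences into a single distance. Which measure suits a given corpus is a separate question that this sketch does not address.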
“…Depending on the amount of curvature, the difference between the two measures can be significant, and can therefore significantly affect analysis based on it. Given the difficulty of determining the presence of non-linearity in high-dimensional data and given also the implications of non-linearity for the present analysis cannot be ignored, and because hierarchical methods are linear, the additional method or methods used must be non-linear to take account of non-linearity and accommodate the possibility that the DFW, Dbigram, and Dtrigram contain significant non-linearities [33]. (2) It is recognized that a single class of methods cannot safely be relied on [34,54], and that at least one additional method or class of methods must be used to corroborate the results from hierarchical analysis.…”
Section: Cluster Analysis Methods (mentioning)
confidence: 99%
“…Cluster analysis aims to detect and graphically to reveal structures or patterns in the distribution of data items, variables, or texts, in n-dimensional space, where n is the number of variables used to describe an author's style. There is a large number of cluster analysis methods and a large literature associated with each [33,34]. Apart from a few attempts using hierarchical cluster analysis methods and principal components analysis with authorship attribution [32, 35–39], to the best of my knowledge, until recently, little work has been done using cluster analytical methods with authorship attribution problems.…”
Section: Stylometry (mentioning)
confidence: 99%