2007
DOI: 10.1177/0265532207080767

vocd: A theoretical and empirical evaluation

Abstract: A reliable index of lexical diversity (LD) has remained stubbornly elusive for over 60 years. Meanwhile, researchers in fields as varied as stylistics, neuropathology, language acquisition, and even forensics continue to use flawed LD indices, often ignorant that their results are questionable and in some cases potentially dangerous. Recently, an LD measurement instrument known as vocd has become the virtual tool of the LD trade. In this paper, we report both theoretical and empirical evidence that calls into…

Citations: cited by 209 publications (131 citation statements)
References: 32 publications (39 reference statements)

“…It is defined as the number of unique phrases appearing in titles of a statistically large unit quota of literature (thousands of articles). It is similar to the lexical diversity used in computational linguistics to study richness of verbal expression (McKee et al, 2000; McCarthy and Jarvis, 2007; Koizumi, 2012), but uses phrases (concepts) instead of individual words. Bodies of literature that have more diverse concepts in titles will have a higher fraction of unique phrases and could be considered to cover larger cognitive extents.…”
Section: Cognitive Extent (mentioning)
confidence: 99%
“…But McCarthy and Jarvis (2007) demonstrated that the vocd-D value is actually based on probabilities of word occurrence and that D serves no purposes except to convert the LD value into a new scale. More specifically, McCarthy and Jarvis demonstrated that vocd-D is merely a complex way of approximating the hypergeometric distribution, and to demonstrate this, they described an index that we refer to here as HD-D.…”
Section: HD-D (mentioning)
confidence: 99%
“…For this reason, it is not surprising that McCarthy and Jarvis (2007) found correlations of r = .971 between vocd-D and HD-D (i.e., sums of probabilities) for sample sizes from 35 to 50 (i.e., the sizes of samples that vocd-D uses in its random-sampling procedures). The correlation would have been perfect had it not been for the slight imprecision in vocd-D's output brought about by its reliance strung together.…”
Section: HD-D (mentioning)
confidence: 99%
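
The two HD-D statements above describe the index as a sum of hypergeometric probabilities of word occurrence. A minimal Python sketch of that idea follows. It is an illustration, not the cited authors' code: the function name `hdd`, the default sample size of 42 (a value inside the 35 to 50 range quoted above), and the division by the sample size to put the result on a TTR-like scale are all assumptions made here.

```python
from collections import Counter
from math import comb


def hdd(tokens, sample_size=42):
    """Sketch of an HD-D style index (assumed formulation, see lead-in).

    For each word type, compute the hypergeometric probability that at
    least one of its tokens appears in a random sample of `sample_size`
    tokens drawn without replacement, then sum those probabilities.
    """
    n = len(tokens)
    if n < sample_size:
        raise ValueError("text is shorter than the sample size")

    total_samples = comb(n, sample_size)
    score = 0.0
    for freq in Counter(tokens).values():
        # P(type absent) = C(n - freq, s) / C(n, s); comb() returns 0
        # when fewer than s tokens remain, so the type is then certain
        # to appear in the sample.
        p_absent = comb(n - freq, sample_size) / total_samples
        # Dividing by the sample size is an assumed scaling choice.
        score += (1.0 - p_absent) / sample_size
    return score
```

Under these assumptions, a text in which every token is a different word scores 1.0, and a text that repeats a single word scores 1 / sample_size.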
“…We compare different operationalisations of the AG with a well-known measure of lexical diversity, D (Malvern et al, 2004), which represents the single parameter of a mathematical function that models the falling TTR curve (see also Jarvis, 2002 and McCarthy and Jarvis, 2007 for an appraisal of this measure). The different measures can give us an indication to what extent the students from the three groups differ from each other in the quantity and/or in the quality of the vocabulary they use.…”
Section: Measuring Lexical Sophistication (mentioning)
confidence: 99%
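
The statement above calls D the single parameter of a function that models the falling TTR curve. As a hedged illustration, the sketch below uses the form in which that function is commonly stated, TTR = (D/N) * (sqrt(1 + 2N/D) - 1), where N is the number of tokens. The formula does not appear on this page, and the name `ttr_from_d` is hypothetical.

```python
from math import sqrt


def ttr_from_d(d, n):
    """Predicted type-token ratio for a sample of n tokens given D.

    Assumed curve: TTR = (D / N) * (sqrt(1 + 2N / D) - 1).
    A larger D gives a curve that falls more slowly as n grows.
    """
    if d <= 0 or n <= 0:
        raise ValueError("d and n must be positive")
    return (d / n) * (sqrt(1 + 2 * n / d) - 1)


# Predicted TTR falls as the sample grows from 35 to 50 tokens.
curve = [ttr_from_d(50, n) for n in range(35, 51)]
```

Fitting D to observed TTR values at these sample sizes, as vocd is described as doing, would be a separate step that this sketch does not attempt.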