We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-y-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than frequency. This indicates that human lexicons are efficiently structured for communication by taking into account interword statistical dependencies. Lexical systems result from an optimization of communicative pressures, coding meanings efficiently given the complex statistics of natural language use.

information theory | rational analysis

One widely known and apparently universal property of human language is that frequent words tend to be short. This law was popularized by Harvard linguist George Kingsley Zipf, who observed that "the magnitude of words tends, on the whole, to stand in an inverse (not necessarily proportionate) relationship to the number of occurrences" (1). Zipf theorized that this pattern resulted from a pressure for communicative efficiency: information can be conveyed as concisely as possible by giving the most frequently used meanings the shortest word forms, much as in variable-length (e.g., Huffman) codes. This strategy provided one key exemplar of Zipf's principle of least effort, a grand "principle that governs our entire individual and collective behavior of all sorts, including the behavior of our language" (2). Zipf's idea of assigning word length by frequency is maximally concise and efficient only if words occur independently from a stationary distribution. However, natural language use is highly nonstationary: word probabilities change depending on their context. A more efficient code for meanings can therefore be constructed by respecting the statistical dependencies between words.
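The frequency-based strategy described above can be sketched as a Huffman code, which by construction assigns the shortest codewords to the most frequent symbols. The following is a minimal illustration; the toy lexicon and its counts are invented for exposition, not data from this study:

```python
import heapq

def huffman_code_lengths(freqs):
    """Build a Huffman code over the symbols in `freqs` (symbol -> count)
    and return a dict mapping each symbol to its codeword length in bits."""
    # Each heap entry: (total weight, unique tiebreaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol inside them
        # moves one level deeper, i.e., gains one bit of code length.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Hypothetical word frequencies (illustrative only).
freqs = {"the": 500, "of": 300, "information": 20, "communication": 10}
lengths = huffman_code_lengths(freqs)
```

With these counts, "the" receives a 1-bit code, "of" a 2-bit code, and the two rare words 3-bit codes, mirroring Zipf's observation that more frequent forms are never longer than rarer ones under an efficient code.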
Here, we show that human lexical systems are such codes, with word length primarily determined by the average amount of information a word conveys in context. The exact forms of the frequency-length relationship (3, 4) and the distribution of word lengths (5) have been quantitatively evaluated previously. In contrast, information content offers an empirically supported and rationally motivated alternative to Zipf's frequency-length relationship.

A lexicon that assigns word lengths based on information content differs from Zipf's theory in two key ways. First, such a lexicon would not be the most concise one possible, as it would not shorten highly informative words even if shorter distinctive word forms were available. Second, unlike Zipf's system, assigning word length based on information content keeps the information rate of communication as constant as possible (6). A tendency to "smooth out" peaks and dips of informativeness in this way is known as uniform information density and has been observed in choices made during online language production (7-10). Formally, uniform information density holds that language users make choices that keep the...