2013
DOI: 10.1515/psicl-2013-0019

Noun distribution in natural languages

Abstract: Previous research on word class distribution claimed that 37% of word tokens are nouns, suggesting that the proportion of nouns might be regular across human languages. To explore this possibility, we examined the proportions of nouns and four other word classes within British and American English, and across seven languages, over different word frequency bands. Results indicated that the noun proportion is about 37% or larger, and that it increases with word rarity. Among f…

Cited by 7 publications (3 citation statements)
References 20 publications
“…It is evident that despite the existence of a large number of POS across various languages, and some disagreement about the definitions, there is strong evidence that a set of coarse POS lexical categories exists across all languages in one form or another [ 66 ]. This indicates that while fine-grained relative cross-lingual POS probabilities may vary, when linked to the same observed linguistic instantiations, the cross-lingual probabilities of coarse POS categories are likely to be similar [ 67 , 68 , 69 ]. Therefore, while we have used probabilistic POS rankings from an English language corpora in this example, since we are using only coarse-grained categories, this means that the rankings might be expected to be stable across languages.…”
Section: Example Results (mentioning)
confidence: 99%
“…Specifically, the average HR@1 values of nouns, adjectives, and adverbs are 35.1%, 27.4%, and 11.8%, respectively. Interestingly, PLMs have a high error rate when dealing with nouns even though they are trained with a large written English corpus, where nouns form the greatest portion (at least 37%) of all POS tags (Hudson, 1994; Liang and Liu, 2013).…”
Section: PLMs Lack Knowledge of Antonyms (mentioning)
confidence: 99%
“…See, for example, the work of Nikolaeva (2008) on Tungus, as well as Liang & Liu (2013) on the distribution of nouns in natural languages.…”
unclassified