In this article, concentration (i.e., inequality) aspects of the functions of Zipf and of Lotka are studied. Since both functions are power laws (i.e., they are mathematically the same), it suffices to develop one concentration theory for power laws and apply it twice, once for each interpretation of the laws of Zipf and Lotka. After a brief review of the functional relationships between Zipf's law and Lotka's law, we prove that Price's law of concentration is equivalent to Zipf's law. A major part of this article is devoted to the development of continuous concentration theory, based on Lorenz curves. The Lorenz curve for power functions is calculated and, from it, several important concentration measures are derived, such as the Gini index, Theil's measure, and the coefficient of variation. Using Lorenz curves, it is shown that the concentration of a power law increases with its exponent, and this result is interpreted in terms of the functions of Zipf and Lotka.
Introduction

The historical law of Lotka (1926) is the basis of modern informetrics and is expressed as follows: the number of authors with n (n = 1, 2, 3, ...) publications is proportional to 1/n^a, where a > 0. In other words, there is a constant C > 0 such that

\[
f(n) = \frac{C}{n^{a}} \qquad (1)
\]

where f(n) denotes the number of authors with n publications. More generally, f(n) can denote the number of sources (authors, journals, word types, ...) with n items (publications, articles, word occurrences, respectively); henceforth, this dual source/item terminology will be used (see also Egghe, 1989, 1990; Egghe & Rousseau, 1990a).

A totally different informetric formulation, originating from linguistics (in terms of word types and word occurrences), is given by Zipf's law: if we rank the sources according to their number of items (starting with the source with the highest number of items, hence giving this source the rank r = 1), then the number of items in the source at rank r (r = 1, 2, 3, ...) is proportional to 1/r^b, where b > 0. In other words, denoting by g(r) this number of items in the source at rank r, there exists a constant D > 0 such that

\[
g(r) = \frac{D}{r^{b}} \qquad (2)
\]

Although their informetric definitions are different (they are dual in the sense that, in f and g, the roles of sources and items are interchanged), the functions (1) and (2) are mathematically the same, namely decreasing power laws. The power law is the most frequently occurring regularity in informetrics and far beyond (e.g., it is also found in economics and sociology, including the description of social networks; see, e.g., Egghe & Rousseau, 2003 and references therein) and has the characterizing property (see Roberts, 1979) that, if the argument x is multiplied by a constant k > 0, we obtain the same power law with the same exponent: for f as in (1) we have

\[
f(kx) = \frac{C}{(kx)^{a}} = k^{-a} f(x) \sim f(x)
\]

(~ denotes "is proportional to"). This self-similar property explains its widespread occurrence in real-life examples and also its use in the description of power-type informetrics in terms of self-similar fractals (see Feder, 1988). Intuitively, the self-similarity expresses that, independent of the scale at which we examine an "object...
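To make this scale-invariance concrete, the short Python sketch below numerically checks the property for a Lotka-type power law. The constants C, a, and k are arbitrary illustrative choices, not values taken from the article; the same check applies verbatim to the Zipf form (2), since the two laws differ only in the interpretation of the variable.

```python
import numpy as np

# Illustrative constants (arbitrary choices, not values from the article)
C = 100.0   # proportionality constant of Lotka's law
a = 2.0     # power-law exponent
k = 5.0     # scale factor applied to the argument

def f(x):
    """Lotka-type power law f(x) = C / x**a, as in equation (1)."""
    return C / x ** a

x = np.arange(1.0, 11.0)     # arguments 1, 2, ..., 10
ratio = f(k * x) / f(x)      # should equal k**(-a) for every x

print("f(kx)/f(x) for x = 1..10:", np.round(ratio, 6))
print("k**(-a):                 ", k ** (-a))
# The ratio is the same constant k**(-a) for every x, so f(kx) is
# proportional to f(x) with the same exponent a: the self-similarity
# (scale invariance) that characterizes power laws.
```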