
Convergence Properties in Certain Occupancy Problems Including the Karlin-Rouault Law

Published online by Cambridge University Press:  14 July 2016

Estáte V. Khmaladze*
Affiliation:
Victoria University of Wellington
* Postal address: School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, PO Box 600, Wellington, 2052, New Zealand. Email address: estate.khmaladze@msor.vuw.ac.nz

Abstract

Let $x$ denote a vector of length $q$ consisting of 0s and 1s. It can be interpreted as an ‘opinion’ comprised of a particular set of responses to a questionnaire consisting of $q$ questions, each having $\{0, 1\}$-valued answers. Suppose that the questionnaire is answered by $n$ individuals, thus providing $n$ ‘opinions’. Probabilities of the answer 1 to each question can be, basically, arbitrary and different for different questions. Out of the $2^q$ different opinions, what number, $\mu_n$, would one expect to see in the sample? How many of these opinions, $\mu_n(k)$, will occur exactly $k$ times? In this paper we give an asymptotic expression for $\mu_n / 2^q$ and the limit for the ratios $\mu_n(k)/\mu_n$, when the number of questions $q$ increases along with the sample size $n$ so that $n = \lambda 2^q$, where $\lambda$ is a constant. Let $p(x)$ denote the probability of opinion $x$. The key step in proving the asymptotic results as indicated is the asymptotic analysis of the joint behaviour of the intensities $np(x)$. For example, one of our results states that, under certain natural conditions, for any $z > 0$, $\sum \mathbf{1}\{np(x) > z\} = d_n z^{-u}$, $d_n = o(2^q)$.
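
The quantities $\mu_n$ and $\mu_n(k)$ can be illustrated numerically for small $q$. The sketch below is not taken from the paper. It assumes that answers to different questions are independent, so that $p(x)$ is a product of per-question probabilities (the abstract does not state this), and that the $n$ individuals answer independently of each other. Under those assumptions the expected counts are $\sum_x [1 - (1 - p(x))^n]$ for $\mu_n$ and $\sum_x \binom{n}{k} p(x)^k (1 - p(x))^{n-k}$ for $\mu_n(k)$. The function names and the example probabilities are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import product


def opinion_probability(x, probs):
    """p(x), ASSUMING independent answers: product of per-question terms."""
    p = 1.0
    for answer, prob_one in zip(x, probs):
        p *= prob_one if answer == 1 else 1.0 - prob_one
    return p


def expected_counts(probs, n, k_max):
    """Exact E[mu_n] and E[mu_n(k)], k = 0..k_max, by enumerating all 2^q opinions.

    Requires k_max <= n. Only practical for small q.
    """
    q = len(probs)
    expected_mu = 0.0
    expected_mu_k = [0.0] * (k_max + 1)
    for x in product((0, 1), repeat=q):
        p = opinion_probability(x, probs)
        # Opinion x is seen at least once with probability 1 - (1 - p)^n.
        expected_mu += 1.0 - (1.0 - p) ** n
        # Opinion x is seen exactly k times with the binomial probability.
        for k in range(k_max + 1):
            expected_mu_k[k] += math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
    return expected_mu, expected_mu_k


def simulate_counts(probs, n, rng):
    """Draw n opinions; return (mu_n, Counter mapping k -> mu_n(k)) for k >= 1."""
    sample = Counter(
        tuple(1 if rng.random() < p else 0 for p in probs) for _ in range(n)
    )
    mu = len(sample)                 # number of distinct opinions observed
    mu_k = Counter(sample.values())  # how many opinions occurred exactly k times
    return mu, mu_k


if __name__ == "__main__":
    q, lam = 8, 1.0                  # hypothetical values for illustration
    n = int(lam * 2**q)              # sample size tied to q as n = lambda * 2^q
    probs = [0.3 + 0.05 * j for j in range(q)]  # arbitrary, differ by question
    e_mu, e_mu_k = expected_counts(probs, n, k_max=3)
    mu, mu_k = simulate_counts(probs, n, random.Random(0))
    print("E[mu_n]/2^q =", e_mu / 2**q, " simulated:", mu / 2**q)
    for k in (1, 2, 3):
        print(f"k={k}: E[mu_n(k)]/E[mu_n] =", e_mu_k[k] / e_mu,
              " simulated:", mu_k[k] / mu)
```

Exact enumeration costs on the order of $2^q$ operations per value of $k$, so it only serves as a check for small $q$; the asymptotic results described in the abstract concern the regime where $q$ grows with $n$.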

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 2011 
