
Arcsine laws for random walks generated from random permutations with applications to genomics

Published online by Cambridge University Press: 22 November 2021

Xiao Fang* (The Chinese University of Hong Kong)
Han L. Gan** (Northwestern University)
Susan Holmes*** (Stanford University)
Haiyan Huang**** (University of California, Berkeley)
Erol Peköz***** (Boston University)
Adrian Röllin****** (National University of Singapore)
Wenpin Tang******* (Columbia University)
*Postal address: Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Email: xfang@sta.cuhk.edu.hk
**Postal address: University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand. Email: han.gan@waikato.ac.nz
***Postal address: Department of Statistics, 390 Jane Stanford Way, Stanford University, Stanford, CA 94305-4020. Email: susan@stat.stanford.edu
****Postal address: Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, CA 94720-3860. Email: hhuang@stat.berkeley.edu
*****Postal address: Boston University, Questrom School of Business, Rafik B. Hariri Building, 595 Commonwealth Avenue, Boston, MA 02215. Email: pekoz@bu.edu
******Postal address: Department of Statistics and Applied Probability, National University of Singapore, 6 Science Drive 2, Singapore 117546. Email: adrian.roellin@nus.edu.sg
*******Postal address: Department of Industrial Engineering and Operations Research, Columbia University, 500 W. 120th Street #315, New York, NY 10027. Email: wt2319@columbia.edu

Abstract

A classical result for the simple symmetric random walk with 2n steps is that the number of steps above the origin, the time of the last visit to the origin, and the time of the maximum height all have exactly the same distribution and, when scaled, converge to the arcsine law. Motivated by applications in genomics, we study the distributions of these statistics for the non-Markovian random walk generated from the ascents and descents of a uniform random permutation and of a Mallows(q) permutation, and show that they have the same asymptotic distributions as for the simple random walk. We also state an unexpected conjecture, supported by numerical evidence and a partial proof in special cases: the number of steps above the origin by step 2n for the walk generated by a uniform permutation has exactly the same discrete arcsine distribution as for the simple random walk, even though the other statistics for these walks have very different laws. In addition, we give explicit error bounds for the limit theorems using Stein’s method for the arcsine distribution, as well as functional central limit theorems and a strong embedding of the Mallows(q) permutation, which is of independent interest.
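
The exact-distribution statement in the abstract can be checked by brute force for very small n. The Python sketch below is illustrative only and is not taken from the paper. It assumes that the walk takes a step +1 at each ascent and -1 at each descent of a permutation of 2n + 1 elements, and that a step counts as "above the origin" when either of its endpoints is strictly positive (the convention commonly used for the simple random walk). The function names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial


def permutation_walk(perm):
    """Partial sums S_0, ..., S_m: +1 for each ascent, -1 for each descent."""
    walk = [0]
    for a, b in zip(perm, perm[1:]):
        walk.append(walk[-1] + (1 if b > a else -1))
    return walk


def steps_above_origin(walk):
    """Count steps whose segment lies above the axis.

    Assumed convention: step k counts if S_{k-1} > 0 or S_k > 0.
    """
    return sum(1 for prev, cur in zip(walk, walk[1:]) if prev > 0 or cur > 0)


def exact_counts(n):
    """Enumerate all (2n+1)! permutations; counts[k] = #{perms with 2k steps above}."""
    counts = [0] * (n + 1)
    for perm in permutations(range(2 * n + 1)):
        above = steps_above_origin(permutation_walk(perm))
        counts[above // 2] += 1  # the count is always even for a 2n-step walk
    return counts


def discrete_arcsine_pmf(n):
    """P(2k) = C(2k, k) * C(2n - 2k, n - k) / 4^n for k = 0, ..., n."""
    return [comb(2 * k, k) * comb(2 * (n - k), n - k) / 4 ** n
            for k in range(n + 1)]


if __name__ == "__main__":
    n = 2
    total = factorial(2 * n + 1)
    for k, (c, p) in enumerate(zip(exact_counts(n), discrete_arcsine_pmf(n))):
        print(f"steps above = {2 * k}: permutation walk {c / total:.4f}, "
              f"discrete arcsine {p:.4f}")
```

Under these assumptions, n = 2 gives counts of 45, 30 and 45 out of 120 permutations, which equal the discrete arcsine probabilities 3/8, 1/4 and 3/8. Enumeration is feasible only for small n because the number of permutations grows as (2n + 1)!, and such a check is evidence for small cases rather than a proof.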

Type: Original Article
Copyright: © The Author(s), 2021. Published by Cambridge University Press on behalf of Applied Probability Trust
