
On Compound Poisson Approximation for Sequence Matching

Published online by Cambridge University Press:  09 April 2001

MARIANNE MÅNSSON
Affiliation:
Department of Mathematics, Chalmers University of Technology, SE 412 96 Göteborg, Sweden (e-mail: marianne@math.chalmers.se)

Abstract

Consider sequences {X_i}_{i=1}^{m} and {Y_j}_{j=1}^{n} of independent random variables, taking values in a finite alphabet, and assume that the variables X_1, X_2, … and Y_1, Y_2, … follow the distributions μ and ν, respectively. Two variables X_i and Y_j are said to match if X_i = Y_j. Let W denote the number of matching subsequences of length k between the two sequences when r mismatches are allowed, where 0 ≤ r < k.
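
To make the counting variable W concrete, the following is a minimal sketch under one plausible reading of the definition: W is taken to be the number of position pairs (i, j) for which the length-k windows starting at X_i and Y_j disagree in at most r places. The abstract does not spell out the exact counting convention, so this reading, the function name count_matches, and the brute-force approach are assumptions made for illustration only.

```python
def count_matches(x, y, k, r):
    """Count window pairs (i, j) with at most r mismatches.

    Assumed reading of W (not confirmed by the abstract): a pair (i, j)
    is counted when x[i:i+k] and y[j:j+k] differ in at most r positions.
    """
    if not 0 <= r < k:
        raise ValueError("need 0 <= r < k")
    count = 0
    # Slide a length-k window over every start position in both sequences.
    for i in range(len(x) - k + 1):
        for j in range(len(y) - k + 1):
            mismatches = sum(a != b for a, b in zip(x[i:i + k], y[j:j + k]))
            if mismatches <= r:
                count += 1
    return count
```

For example, with x = "AAB", y = "AB" and k = 2, only the window "AB" matches exactly when r = 0, while allowing r = 1 also admits the pair ("AA", "AB"). The double loop costs on the order of m·n·k operations, which is acceptable for small experiments but not for long sequences.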

In this paper we use Stein's method to bound the total variation distance between the distribution of W and a suitably chosen compound Poisson distribution. To derive rates of convergence, the case where E[W] stays bounded away from infinity, and the case where E[W] → ∞ as m, n → ∞, have to be treated separately. Under the assumption that ln n/ln(mn) → ρ ∈ (0, 1), we give conditions on the rate at which k → ∞, and on the distributions μ and ν, for which the variation distance tends to zero.
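
As a reminder of the two objects named above, here is a small sketch of their standard textbook definitions: the total variation distance between two discrete distributions, computed as half the sum of absolute differences of their probability mass functions, and a compound Poisson variable, obtained as a Poisson number of independent clump sizes added together. The rate lam, the clump-size distribution, and all function names are hypothetical placeholders; the abstract does not state the parameters of the approximating distribution used in the paper.

```python
import math
import random


def total_variation(p, q):
    """Total variation distance between two pmfs given as dicts {value: prob}."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    n, prod = 0, rng.random()
    while prod > threshold:
        n += 1
        prod *= rng.random()
    return n


def sample_compound_poisson(lam, clump_sizes, clump_probs, rng):
    """Sum of N i.i.d. clump sizes, where N ~ Poisson(lam).

    lam, clump_sizes and clump_probs are illustrative inputs, not values
    taken from the paper.
    """
    n_clumps = sample_poisson(lam, rng)
    return sum(rng.choices(clump_sizes, weights=clump_probs, k=n_clumps))
```

An empirical pmf of simulated W values could be compared with an empirical pmf of compound Poisson draws via total_variation, though such a Monte Carlo comparison only estimates the distance and is no substitute for the bounds obtained by Stein's method.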

Type
Research Article
Copyright
2000 Cambridge University Press
