Published online by Cambridge University Press: 09 April 2001
Consider sequences {X_i}_{i=1}^m and {Y_j}_{j=1}^n of independent random variables, taking values in a finite alphabet, and assume that the variables X_1, X_2, … and Y_1, Y_2, … follow the distributions μ and ν, respectively. Two variables X_i and Y_j are said to match if X_i = Y_j. Let W denote the number of matching subsequences of length k between the two sequences when r mismatches are allowed, where 0 ≤ r < k.
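To make the definition of W concrete, the following Python sketch counts matches by brute force. It rests on an assumption not stated in the abstract: W is taken to be the number of pairs of start positions (i, j) whose length-k windows differ in at most r positions, with no wrap-around and no declumping of overlapping matches. The paper's exact definition may differ, and the function name `count_matches` is hypothetical.

```python
def count_matches(x, y, k, r):
    """Count window pairs of length k that differ in at most r positions.

    Assumption: every pair of start positions (i, j) is counted separately,
    windows do not wrap around, and overlapping matches are not merged.
    """
    if not 0 <= r < k:
        raise ValueError("r must satisfy 0 <= r < k")
    count = 0
    for i in range(len(x) - k + 1):
        for j in range(len(y) - k + 1):
            # Number of positions at which the two windows disagree.
            mismatches = sum(a != b for a, b in zip(x[i:i + k], y[j:j + k]))
            if mismatches <= r:
                count += 1
    return count


if __name__ == "__main__":
    print(count_matches("AAA", "AAB", k=2, r=0))
    print(count_matches("AAA", "AAB", k=2, r=1))
```

The double loop costs O(mnk) time, which is adequate for small simulations but not for long sequences.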
In this paper we use Stein's method to bound the total variation distance between the distribution of W and a suitably chosen compound Poisson distribution. To derive rates of convergence, the case where E[W] stays bounded and the case where E[W] → ∞ as m, n → ∞ have to be treated separately. Under the assumption that ln n/ln(mn) → ρ ∈ (0, 1), we give conditions on the rate at which k → ∞, and on the distributions μ and ν, for which the total variation distance tends to zero.
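As a rough illustration of how E[W] depends on m, n, k and r, the sketch below computes the expectation under the same simplifying assumption as before: W counts all pairs of non-wrapping length-k windows with at most r mismatches. Under that assumption the k position-wise comparisons within one window pair are independent, each matching with probability p = Σ_a μ(a)ν(a), so the mismatch count is Binomial(k, 1 − p). The function name `expected_matches` and the dictionary representation of μ and ν are hypothetical, and this is not claimed to be the formula used in the paper.

```python
from math import comb


def expected_matches(mu, nu, m, n, k, r):
    """E[W] when W counts window pairs with at most r mismatches.

    mu, nu: dicts mapping each letter to its probability (assumed format).
    Assumes non-wrapping windows and no declumping.
    """
    # Probability that a single X variable equals a single Y variable.
    p = sum(prob * nu.get(letter, 0.0) for letter, prob in mu.items())
    # P(at most r mismatches among k independent comparisons).
    tail = sum(comb(k, s) * (1 - p) ** s * p ** (k - s) for s in range(r + 1))
    # Number of window pairs times the per-pair match probability.
    return (m - k + 1) * (n - k + 1) * tail


if __name__ == "__main__":
    uniform = {"A": 0.5, "B": 0.5}
    print(expected_matches(uniform, uniform, m=3, n=3, k=2, r=0))
```

With m and n fixed, increasing k drives this quantity toward zero, while letting m and n grow with k fixed drives it to infinity; the rate at which k grows relative to ln(mn) therefore decides which of the two regimes applies.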