Published online by Cambridge University Press: 14 July 2016
This study is motivated by problems of molecular sequence comparison involving multiple marker arrays with correlated distributions. The model assumes two (or more) kinds of markers, say Markers A and B, distributed along a DNA sequence. Two conditions are of primary interest: (i) many B markers (say ≥ m) occur, and (ii) few B markers (say ≤ l) occur. We call these the conditional r-scan models and ask to what extent Marker A clusters or is over-dispersed in regions satisfying condition (i) or (ii). Limiting distributions for the extremal r-scan statistics of the A array under conditions (i) and (ii) are derived by extending the Chen-Stein Poisson approximation method.
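As an informal illustration of the quantities named in the abstract, the following Python sketch computes empirical r-scans of an A array restricted to windows that satisfy a condition on the B count. It rests on assumptions not stated in the abstract: an r-scan is taken to be the distance spanned by r consecutive gaps between sorted A positions, the B count is taken over the closed interval between the two bounding A markers, and the function name, the `keep` parameter, and the sample positions are hypothetical. It does not reproduce the limiting distributions or the Chen-Stein argument.

```python
from bisect import bisect_left, bisect_right


def conditional_r_scans(a_positions, b_positions, r, keep):
    """Return r-scan lengths of the A array whose windows pass `keep`.

    Assumed convention: the i-th r-scan is a[i + r] - a[i] for sorted A
    positions. `keep` is called with the number of B markers that fall in
    the closed interval [a[i], a[i + r]].
    """
    a = sorted(a_positions)
    b = sorted(b_positions)
    scans = []
    for i in range(len(a) - r):
        lo, hi = a[i], a[i + r]
        # Count B markers with lo <= position <= hi using binary search.
        b_count = bisect_right(b, hi) - bisect_left(b, lo)
        if keep(b_count):
            scans.append(hi - lo)
    return scans


# Hypothetical marker positions, for illustration only.
a = [2, 5, 6, 14, 20, 21, 30]
b = [3, 4, 7, 15, 16, 17, 25]
r, m, l = 2, 2, 1

many_b = conditional_r_scans(a, b, r, lambda n: n >= m)  # condition (i)
few_b = conditional_r_scans(a, b, r, lambda n: n <= l)   # condition (ii)

# Extremal r-scans: a small minimum suggests clustering of A,
# a large maximum suggests over-dispersion.
print(min(many_b), max(many_b))
print(min(few_b), max(few_b))
```

Passing the condition as a predicate keeps conditions (i) and (ii) in one function; the minimum and maximum of each returned list are the empirical counterparts of the extremal r-scan statistics under the stated assumptions.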
Supported in part by NIH Grants 2R01HG00335-11 and 5R01GM10452-35.