On the Distribution of the Number of Missing Words in Random Texts

SVEN RAHMANN; ERIC RIVALS

doi:10.1017/S0963548302005473

Published online by Cambridge University Press: 28 January 2003

SVEN RAHMANN: Affiliation:
Department of Computational Molecular Biology, Max-Planck-Institut für Molekulare Genetik, Ihnestraße 63-73, D-14195 Berlin, Germany (Sven.Rahmann@molgen.mpg.de)
ERIC RIVALS: Affiliation:
L.I.R.M.M., CNRS U.M.R. 5506, 161 rue Ada, F-34392 Montpellier Cedex 5, France (rivals@lirmm.fr)

Abstract

Determining the distribution of the number of empty urns after a number of balls have been thrown randomly into the urns is a classical and well understood problem. We study a generalization: Given a finite alphabet of size σ and a word length q, what is the distribution of the number X of words (of length q) that do not occur in a random text of length n+q−1 over the given alphabet? For q=1, X is the number Y of empty urns with σ urns and n balls. For q ≥ 2, X is related to the number Y of empty urns with σ^q urns and n balls, but the law of X is more complicated because successive words in the text overlap. We show that, perhaps surprisingly, the laws of X and Y are not as different as one might expect, but some problems remain currently open.
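
The two quantities compared in the abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' method: it assumes the letters of the random text are independent and uniformly distributed (the abstract does not state the text model), and the function names `missing_words`, `empty_urns`, `expected_empty_urns` and `sample_mean` are hypothetical names chosen for illustration. The closed form used for the mean of Y, namely m(1 − 1/m)^n for m urns and n balls, is the standard expectation for the classical urn problem.

```python
import random


def missing_words(sigma, q, n, rng):
    """Sample X: the number of q-words absent from a random text.

    Assumption: letters are i.i.d. uniform over an alphabet of size sigma.
    The text has length n + q - 1, so it contains n (overlapping) q-words.
    """
    text = [rng.randrange(sigma) for _ in range(n + q - 1)]
    seen = {tuple(text[i:i + q]) for i in range(n)}
    return sigma ** q - len(seen)


def empty_urns(urns, balls, rng):
    """Sample Y: the number of empty urns after throwing balls uniformly."""
    hit = {rng.randrange(urns) for _ in range(balls)}
    return urns - len(hit)


def expected_empty_urns(urns, balls):
    """Exact mean of Y: each urn stays empty with probability (1 - 1/urns)^balls."""
    return urns * (1 - 1 / urns) ** balls


def sample_mean(draw, trials):
    """Average of `trials` independent calls to the zero-argument sampler `draw`."""
    return sum(draw() for _ in range(trials)) / trials


if __name__ == "__main__":
    rng = random.Random(0)      # fixed seed so repeated runs agree
    sigma, q, n = 2, 3, 20      # illustrative parameters, not from the paper
    urns = sigma ** q
    trials = 2000
    mean_x = sample_mean(lambda: missing_words(sigma, q, n, rng), trials)
    mean_y = sample_mean(lambda: empty_urns(urns, n, rng), trials)
    print("estimated mean of X:", mean_x)
    print("estimated mean of Y:", mean_y)
    print("exact mean of Y:    ", expected_empty_urns(urns, n))
```

For q=1 the two samplers draw from the same law, since each letter of the text acts as one ball thrown into one of σ urns. For q ≥ 2 the overlap between successive words makes the samples of X dependent in a way the urn model ignores, which is the difference the paper investigates.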

Information

Type: Research Article
Information: Combinatorics, Probability and Computing, Volume 12, Issue 1, January 2003, pp. 73-87

DOI: https://doi.org/10.1017/S0963548302005473
