This article encourages the use of explicit methods in linguistics by attempting to estimate the size of a linguistic data set. Such estimates are difficult because redundant data can easily pad the data set. To address this, I offer explicit operationalizations of the data and their features. For linguistic data, negative associations do not indicate true redundancy, yet for many measures they can be mathematically impossible to ignore. I prove that this troublesome phenomenon has positive Lebesgue measure and is monotonically increasing, and that these two features hold robustly in four different ways.