Published online by Cambridge University Press: 01 January 2025
An information-theoretic framework is used to analyze the knowledge content in multivariate cross-classified data. Several related measures based directly on the information concept are proposed: the knowledge content (S) of a cross classification, its terseness (Zeta), and the separability (GammaX) of one variable, given all others. Example applications illustrate the solutions obtained where classical analysis is unsatisfactory, such as optimal grouping, the analysis of highly skewed tables, or the interpretation of well-known paradoxes. Further, the separability suggests a solution, independent of sample size, to the classic problem of inductive inference.
Lucien Preuss gratefully acknowledges the support of the Swiss National Science Foundation (grant 21-25'757.88) which provided the initial impetus for this work. Also, we owe more than the usual thanks to the Editor, Shizuhiko Nishisato, whose relentless demands for clarity resulted in quite a few improvements.