Published online by Cambridge University Press: 01 January 2025
In categorical data analysis, two-sample cross-validation is used not only for model selection but also to obtain a realistic impression of the overall predictive effectiveness of the model. The latter is of particular importance for highly parametrized models capable of capturing every idiosyncrasy of the calibration sample. We show that, for maximum likelihood estimators or other asymptotically efficient estimators, Pearson's $X^2$ is not asymptotically chi-square in the two-sample cross-validation framework, because of the extra variability induced by using different samples for estimation and goodness-of-fit testing. We propose an alternative test statistic, $X^2_{\mathrm{xval}}$, obtained as a modification of $X^2$, which is asymptotically chi-square with $C - 1$ degrees of freedom in cross-validation samples. Stochastically, $X^2_{\mathrm{xval}} \le X^2$. Consequently, using $X^2$ instead of $X^2_{\mathrm{xval}}$ with a $\chi^2_{C-1}$ reference distribution may give an unduly poor impression of the fit of the model in the cross-validation sample.
This paper is dedicated to the memory of Michael V. Levine.