Statistical methods are presented to facilitate a more complete analysis of results obtained when a scaling model is applied to data from two or more groups. These methods can be used to (a) compare the corresponding estimated latent distributions obtained using the scaling model applied to the different groups, (b) compare the corresponding estimated item reliabilities (or item response error rates) for the different groups, and (c) test whether the scaling model applied to the several groups can be replaced by a more parsimonious scaling model that includes various homogeneity constraints (i.e., constraints that describe which parameters in the model are the same for the several groups). Various kinds of scaling models are considered here in the multiple-group context.
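As an illustration of point (c), the following is a minimal sketch of how a model with homogeneity constraints might be compared with its unconstrained counterpart. It assumes a likelihood-ratio comparison of nested models, which the text above does not specify. The function names, cell counts, fitted values, and degrees of freedom are all hypothetical and chosen only for illustration.

```python
import math

def g_squared(observed, expected):
    """Likelihood-ratio chi-square: G^2 = 2 * sum(obs * ln(obs / exp)).

    Cells with a zero observed count contribute nothing to the sum.
    """
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )

def compare_nested(g2_restricted, df_restricted, g2_unrestricted, df_unrestricted):
    """Return (difference in G^2, difference in degrees of freedom).

    Under the restricted model, the difference in G^2 is approximately
    chi-square distributed with the returned degrees of freedom.
    """
    return g2_restricted - g2_unrestricted, df_restricted - df_unrestricted

# Hypothetical observed counts and fitted counts (not taken from any real data).
observed = [30, 20, 25, 25]
fitted_unrestricted = [29.0, 21.0, 26.0, 24.0]  # assumed fit without constraints
fitted_restricted = [27.5, 22.5, 27.5, 22.5]    # assumed fit with homogeneity constraints

g2_u = g_squared(observed, fitted_unrestricted)
g2_r = g_squared(observed, fitted_restricted)

# Degrees of freedom here are placeholders; in practice they follow from
# the number of cells and the number of free parameters in each model.
delta_g2, delta_df = compare_nested(g2_r, 3, g2_u, 1)
print(f"G^2 difference = {delta_g2:.3f} on {delta_df} degrees of freedom")
```

A small difference in G^2 relative to the difference in degrees of freedom would suggest that the more parsimonious model fits about as well as the unconstrained one. The corresponding p-value can be read from a chi-square table or computed with a routine such as `scipy.stats.chi2.sf`.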