This study aimed to identify genetic evaluation models (GEM) that accurately select cattle for milk production when only limited data are available. It is based on a data set from the Pakistani Sahiwal progeny testing programme, which includes records from five government herds, each consisting of 100 to 350 animals, with lactation records dating back to 1968. Different types of GEM were compared, namely: (1) a multivariate v. a repeatability model when using the first three lactations; (2) an animal v. a sire model; (3) different fixed effects models to account for effects such as herd, year and season; and (4) fitting a model with genetic parameters fixed v. estimating the genetic parameters as part of the model fitting process. Two methods were used to compare the models. The first method used simulated data based on the Pakistani progeny testing system and compared estimated breeding values with true breeding values. The second method used cross-validation to determine the best model in subsets of actual Australian herd-recorded data. Subsets were chosen to reflect the Pakistani data in terms of herd size and number of herds. Based on the simulation and the cross-validation method, the multivariate animal model using fixed genetic parameters was generally the superior GEM, but problems arise in determining suitable values for fixing the parameters. Using mean square error of prediction, the best fixed effects structure could not be conclusively determined. The simulation method indicated the simplest fixed effects structure to be superior, whereas the cross-validation method on actual data indicated that the most complex one was the best. In conclusion, it is difficult to propose a universally best GEM that can be used in any data set of this size.
However, some general recommendations are that it is more appropriate to estimate the genetic parameters when evaluating for selection purposes, that the animal model was superior to the sire model, and that in the Pakistani situation the repeatability model is more suitable than a multivariate model.
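As a rough illustration of the two comparison criteria named above (agreement between estimated and true breeding values in simulated data, and mean square error of prediction), the following minimal sketch may help. It is not the study's evaluation model: the heritability of 0.25, the 500 animals, the absence of fixed and permanent environmental effects, and the function names `msep`, `pearson` and `simulate` are all assumptions made for illustration. The estimated breeding value here is a simple own-performance regression of the mean of three records, with the textbook weight b = n·h² / (1 + (n − 1)·t), where t is the repeatability (equal to h² in this simplified simulation).

```python
import math
import random


def msep(predicted, observed):
    """Mean square error of prediction between two equal-length sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def pearson(x, y):
    """Pearson correlation, used here as the accuracy of EBV against TBV."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_animals=500, n_lactations=3, h2=0.25, seed=1):
    """Simulate true breeding values and the mean of repeated records.

    Assumption: phenotypic variance is 1, so the additive variance is h2
    and the residual variance is 1 - h2; no herd, year or season effects.
    """
    rng = random.Random(seed)
    sd_a, sd_e = math.sqrt(h2), math.sqrt(1.0 - h2)
    tbv = [rng.gauss(0.0, sd_a) for _ in range(n_animals)]
    means = [
        sum(a + rng.gauss(0.0, sd_e) for _ in range(n_lactations)) / n_lactations
        for a in tbv
    ]
    return tbv, means


n, h2 = 3, 0.25
tbv, means = simulate(n_lactations=n, h2=h2)

# Regression weight for the mean of n records; equals 0.5 for n=3, h2=t=0.25.
b = n * h2 / (1 + (n - 1) * h2)
ebv = [b * m for m in means]

print("accuracy (corr EBV, TBV):", round(pearson(ebv, tbv), 3))
print("MSEP, shrunken EBV      :", round(msep(ebv, tbv), 3))
print("MSEP, raw record mean   :", round(msep(means, tbv), 3))
```

In this toy setting the shrunken estimate should show a lower mean square error than the raw record mean, while both have the same correlation with the true breeding value because every animal has the same number of records. That is one reason more than one criterion can be useful when ranking models.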