Published online by Cambridge University Press: 18 August 2016
With an increase in the number of candidate genes for important traits in livestock, effective strategies for incorporating such genes into selection programmes are increasingly important. Those strategies depend in part on the frequency of a favoured allele in a population. Since comprehensive genotyping of a population is seldom possible, we investigate the consequences of sampling strategies for the reliability of the gene frequency estimate at a bi-allelic locus. Even within a subpopulation or line, often only a proportion of individuals will be genotype tested. However, through segregation analysis, probable genotypes can be assigned to individuals that were not themselves tested, using the known genotypes of relatives and a starting (presumed) gene frequency. The value of these probable genotypes in the estimation of gene frequency was considered. A subpopulation or line was stochastically simulated and sampled in one of three ways: at random, over a cluster of years, or by favouring a particular genotype. Each line was simulated (replicated) 1000 times. The reliability of gene frequency estimates depended on the sampling strategy used. With random sampling, even when only a small proportion of a line was genotyped (0·10), the gene frequency of the population was well estimated from the across-line mean. When information on probable genotypes of untested individuals was combined with known genotypes, the between-line variance in gene frequency was estimated well; including probable genotypes overcame problems of statistical sampling. When the sampling strategy favoured a particular genotype, the estimate of gene frequency was, unsurprisingly, biased towards the favoured allele. Using probable genotypes lessened the bias, but the estimate of gene frequency still reflected the sampling strategy rather than the true population frequency.
When sampling was confined to a few clustered years, the estimate of gene frequency was biased for those generations preceding the sampling event, particularly when the presumed starting gene frequency differed from the true population gene frequency. The potential risks of basing inferences about a population on a potentially biased sample are discussed.
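The contrast between random sampling and sampling that favours a genotype can be illustrated with a minimal sketch. It is not the simulation used in the study: it assumes Hardy-Weinberg genotype proportions, ignores pedigree structure and segregation analysis entirely, and all function names, the `bias` weighting scheme and the parameter values (true frequency 0·30, line size 500) are hypothetical choices made for illustration only.

```python
import random


def simulate_genotypes(n, p, rng):
    """Draw n genotypes, coded as the count (0, 1 or 2) of the favoured
    allele, assuming Hardy-Weinberg proportions (an assumption of this sketch)."""
    return [(rng.random() < p) + (rng.random() < p) for _ in range(n)]


def allele_frequency(genotypes):
    """Frequency of the favoured allele among the given genotypes."""
    return sum(genotypes) / (2 * len(genotypes))


def sample_random(genotypes, fraction, rng):
    """Genotype a random fraction of the line."""
    k = max(1, round(fraction * len(genotypes)))
    return rng.sample(genotypes, k)


def sample_favouring(genotypes, fraction, bias, rng):
    """Genotype a fraction of the line, without replacement, giving each
    individual a selection weight of bias ** genotype (hypothetical scheme).
    Uses weighted random keys: the k largest u ** (1 / weight) are kept."""
    k = max(1, round(fraction * len(genotypes)))
    keyed = [(rng.random() ** (1.0 / (bias ** g)), g) for g in genotypes]
    keyed.sort(reverse=True)
    return [g for _, g in keyed[:k]]


def run(replicates=1000, n=500, p=0.30, fraction=0.10, bias=3.0, seed=1):
    """Return the across-line mean estimate under each sampling strategy."""
    rng = random.Random(seed)
    random_est, favoured_est = [], []
    for _ in range(replicates):
        line = simulate_genotypes(n, p, rng)
        random_est.append(allele_frequency(sample_random(line, fraction, rng)))
        favoured_est.append(
            allele_frequency(sample_favouring(line, fraction, bias, rng))
        )
    return sum(random_est) / replicates, sum(favoured_est) / replicates


if __name__ == "__main__":
    mean_random, mean_favoured = run()
    print(f"random sampling:    {mean_random:.3f}")
    print(f"favouring genotype: {mean_favoured:.3f}")
```

Under these assumptions the across-line mean from random sampling should lie close to the true frequency, whereas the mean from the weighted scheme should be shifted towards the favoured allele, in line with the direction of bias described above.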