Published online by Cambridge University Press: 13 December 2013
Consistency over time of (on-farm) animal welfare assessment systems forms part of reliability, meaning that the results of an assessment should be representative of the longer-term welfare state of the farm as long as housing and management conditions have not changed considerably. This is especially important if assessments are to be used for certification purposes. The aim of the present study was to investigate consistency over time of the Welfare Quality® (WQ®) assessment system for fattening cattle at the level of single measures, aggregated criterion and principle scores, and overall classification, across a short-term (1 month) and a longer-term (6 months) period. We hypothesized that consistency over time would be higher for aggregated criterion and principle scores than for single measures. Consistency was also expected to be lower with longer intervals between assessments. Data were obtained using the WQ® protocol for fattening cattle during three visits (months 0, 1 and 7) on 63 beef farms in Austria, Germany and Italy. Only data from farms where no major changes in housing and management had taken place were considered for analysis. A measure or score was regarded as reliable if the Spearman rank correlation between visits was >0.7 and variance was lower within farms than between farms. At the single measure level, this was the case for six and two of 19 measures after 1 month and 6 months, respectively. After aggregation of single measures into criterion and principle scores, five and two of 10 criteria and three and one of four principles were found reliable after 1 month and 6 months, respectively. After 1 month and 6 months, 79% and 75% of the farms, respectively, were allocated to the same overall welfare category.
Possible reasons for a lack of consistency are seasonal effects or short-term fluctuations that occur under normal farm conditions, the low prevalence of clinical measures and a probably insufficient sample size, whereas poor inter-observer agreement leading to inflation of correlation can be ruled out. At the criterion and principle level, aggregation of information into scores appears to partly smooth out undirected variation at the single measure level without losing sensitivity in terms of welfare evaluation. Reliable on-farm animal welfare assessments should therefore be based on repeated assessments. Further long-term studies are recommended to better understand the factors influencing consistency over time.
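
The consistency criterion used above (a Spearman rank correlation between visits of >0.7 together with lower variance within farms than between farms) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the way within-farm and between-farm variance are computed (mean per-farm variance across the two visits versus variance of the per-farm means) and the example values are all assumptions made for illustration.

```python
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def is_consistent(visit_a, visit_b, threshold=0.7):
    """Apply both conditions to one measure scored on the same farms twice.

    Assumption: within-farm variance is the mean of each farm's variance
    across the two visits; between-farm variance is the variance of the
    per-farm means. The study may have estimated these differently.
    """
    pairs = list(zip(visit_a, visit_b))
    rho = spearman(visit_a, visit_b)
    within = mean([variance(pair) for pair in pairs])
    between = variance([mean(pair) for pair in pairs])
    return rho > threshold and within < between


# Hypothetical scores for five farms (invented values, not study data).
month_0 = [10, 20, 30, 40, 50]
month_1 = [12, 18, 33, 41, 48]
print(is_consistent(month_0, month_1))
```

With these invented values the farms keep the same rank order at both visits and differ far more from each other than from themselves, so both conditions hold. Reordering the second visit so that the farms' ranks change substantially would lower the rank correlation and cause the check to fail.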