INTRODUCTION
Thanks to advances in the modelling of coupled ocean-atmosphere dynamics, ensemble predictions of the atmospheric state over a season have become well established and are operationally issued by many weather services. The potential benefits from long-range weather predictions in agricultural decision problems have been pointed out by many authors (e.g. Hansen 2005; Meinke & Stone 2005; Sivakumar 2006). This holds true, in particular, for regions where the impact of the El Niño/Southern Oscillation (ENSO) on the local or regional climate is pronounced, a condition that has prompted a number of initiatives in developing countries (e.g. Harrison et al. 2007; Patt et al. 2007).
In view of the possible impacts of climate change on agriculture, long-range probabilistic forecasts could also represent a key element of adaptation, above all in relation to what Olesen & Bindi (2002) call short-term adjustments, i.e. autonomous actions that can be implemented without major system changes. Examples of weather-sensitive decision problems of this sort are the choice of crops and crop sequences, adjustments in the cropping calendar, scheduling of irrigation and fertilizer applications, application of pesticides, and so on (Wilks 1997; Meinke & Stone 2005). In practice, however, the decision process often requires precise weather forecasts.
In Europe, monthly and seasonal probabilistic weather forecasts are far from being systematically employed for guiding farming operations or helping the decision-making process in extension services. This may be because recent studies provide a somewhat inconclusive picture concerning the utility of the forecasts. On the one hand, investigations conducted in the framework of the DEMETER project (Development of a European Multimodel Ensemble System for Seasonal to Interannual Climate Prediction; Palmer et al. 2004) seem to justify an optimistic attitude, at least in relation to applications at the regional and national scale (Cantelaube & Terres 2005; Marletto et al. 2005, 2007). On the other hand, little benefit has been found at the local scale (Semenov & Doblas-Reyes 2007).
Since DEMETER, some progress has been made in relation to long-range weather predictions, in particular, with respect to ensemble forecasts over a month. Compared to short- and medium-range forecasts, monthly forecasts represent a significant extension of the prediction horizon and could, therefore, provide a valuable source of information for a variety of decision problems, in particular, operations that do not benefit from longer forecasts (Lawless & Semenov 2005).
The purpose of the present paper is to highlight the possibilities for applying long-range weather forecasts to agricultural decision problems in Europe. After reviewing the quality of the monthly forecasts issued by the European Centre for Medium-Range Weather Forecasts (ECMWF), the study examines whether bias correction of the precipitation forecasts by statistical methods provides advantages in terms of predictive ability. As an example of an agricultural application, monthly forecasts are considered for predicting soil water availability at the local scale. A more general discussion of the research and technical needs related to the set-up of agricultural decision systems based on long-range forecasts concludes the paper.
QUALITY OF ENSEMBLE MONTHLY FORECASTS OVER EUROPE: A REVIEW
The ECMWF monthly ensemble forecasting system used in the present study has been described at length by Vitart (2004) and the reader should refer to that paper for detailed information. In short, the system consists of a coupled ocean-atmosphere global circulation model. The ocean component is the Hamburg Primitive Equation Model (HOPE; Wolff et al. 1997). It is run at a horizontal resolution of 1·4°, corresponding to c. 155 km outside the tropics. The atmospheric component is the ECMWF atmospheric model integrated forecast system (IFS). It is run at a horizontal resolution of 1·125°, corresponding to c. 125 km outside the tropics.
Forecasts for the forthcoming 32 days are issued once a week. They are set up as ensemble forecasts with 51 individual members. Along with each real-time forecast, five-member ensemble re-forecasts with the same starting day of the year and lead-time are generated for the previous 12 years to provide a corresponding climatology. The smaller size of the re-forecasts has to be taken into account in evaluations of the prediction skill (Müller et al. 2005; Weigel et al. 2007). The ECMWF issues the monthly forecasts as weekly means, with forecasted fields for weeks 1, 2, 3 and 4 corresponding to averages over days 5–11, 12–18, 19–25 and 26–32, respectively.
The performance of the ECMWF monthly ensemble forecasting system in relation to near-surface temperature has been systematically examined by Weigel et al. (2008), who found that over Europe the model develops a substantial negative bias (up to −2 K) during week 1. The bias amplifies as the integration proceeds but the error growth saturates after about 20 days of integration.
Decreasing performance over time is also apparent when considering skill metrics such as the de-biased ranked probability skill score, as defined in Weigel et al. (2008). Over Europe, prediction skill exceeds a value of 0·3 during week 1, dropping below 0·1 during week 2. However, even during weeks 3 and 4 prediction skill remains mostly positive, suggesting that the forecasts are only rarely worse than climatology.
As discussed by Rodwell & Doblas-Reyes (2006), analysis of skill on the basis of the weekly averages penalizes the outcome, because skill varies not only with lead time but also with the averaging period. As a rule, prediction skill is better for longer averaging times (up to the lead time); the reason is twofold. First, extending the averaging interval implies that more of the skilful information from earlier stages of the integration contributes to the mean. For instance, the operational forecast to day 18 includes information only from days 12 to 18; however, if the averages were taken over a 2-week interval, it would include information from day 5 onwards. Second, high-frequency, unpredictable noise is more effectively filtered out with longer averaging times.
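The difference between the two averaging conventions can be made concrete with a minimal sketch. The window boundaries are those quoted in the text; the function and dictionary names are hypothetical, and the daily series is assumed to be a plain list whose first element corresponds to forecast day 1.

```python
# Averaging windows in forecast days (inclusive), as quoted in the text.
OPERATIONAL = {"week 1": (5, 11), "week 2": (12, 18), "week 3": (19, 25), "week 4": (26, 32)}
CUMULATIVE = {"weekly": (5, 11), "bi-weekly": (5, 18), "3-weekly": (5, 25), "monthly": (5, 32)}


def window_means(daily, windows):
    """Average daily values (index 0 = forecast day 1) over each window."""
    means = {}
    for name, (first, last) in windows.items():
        values = daily[first - 1:last]  # convert inclusive day numbers to a slice
        means[name] = sum(values) / len(values)
    return means


# Hypothetical daily series in which each value simply equals its forecast day.
daily = [float(day) for day in range(1, 33)]
operational = window_means(daily, OPERATIONAL)
cumulative = window_means(daily, CUMULATIVE)
```

With the cumulative windows, the mean valid to day 18 draws on days 5–18 rather than on days 12–18 only, which is the effect described above.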
Assessments of the monthly forecasting systems for precipitation have been less systematic than for temperature. There is, however, evidence that the quality of the forecasts is worse than for near-surface temperature. In general, the basic problem of current forecasting systems is their tendency to overestimate the number of rainy days but underestimate rainfall intensity (Ines & Hansen 2006). Work by Buizza et al. (1999) and Mullen & Buizza (2001, 2002) suggests that for events characterized by moderate intensity, skilful predictions are possible for up to a week. However, accuracy decreases as the intensity threshold increases and for more intensive events forecasts show little skill, even for very short lead times.
Part of the problem is probably caused by the relatively low spatial resolution of the atmospheric component of current forecasting systems. Increasing the spatial resolution could, therefore, help to improve the quality of the forecasts, but only if this is not realized at the cost of a smaller ensemble size. Mullen & Buizza (2002) argued that low-resolution precipitation forecasts with large ensemble size could ultimately be more valuable to the end-users and decision-makers than high-resolution forecasts with small ensemble size, particularly in relation to heavy precipitation events.
QUALITY OF ENSEMBLE MONTHLY FORECASTS OVER SWITZERLAND
To better appreciate the forecasting skill of the ECMWF ensemble prediction system at the regional scale, monthly forecasts of near-surface temperature, precipitation and solar radiation for the area of the Swiss Central Plateau were examined. Solar radiation was included in the analysis as it is one of the main drivers of crop growth, but only seldom (if at all) considered in discussions of long-range probabilistic forecasts.
Prediction skill was assessed by comparing the so-called re-forecasts covering the period 1994–2005 to observations valid for the same period. As explained in the previous section, the re-forecasts were produced to provide a climatology for the operational monthly forecasts issued for the year 2006. All fields were interpolated to a 1×1° grid resolution. The analysis was performed for six representative sites, but only the results for a meteorological station on the Swiss Plateau (Wynau, 7°47′ E, 47°15′ N, 422 m asl) are presented in the current paper.
For this location, a substantial negative temperature bias was found (of the order of −3 K), together with a positive bias in both the occurrence of wet days (of the order of +0·5) and solar radiation (of the order of +0·1). The temperature bias should be considered in relation to the inaccurate representation of the Alpine topography in the forecasting system. It is not considered further because it can be effectively removed based on the altitude bias and the assumption of a standard lapse rate. Removing the precipitation bias is more delicate and will be discussed in the following sections.
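The altitude-based temperature correction mentioned above can be illustrated by a minimal sketch. The lapse rate of 6·5 K per km is the usual standard-atmosphere value and is an assumption here, since the text does not state the value used. The grid-cell elevation of 880 m is purely hypothetical and serves only as an example; the station altitude of 422 m is that of Wynau. The function name is likewise illustrative.

```python
STANDARD_LAPSE_RATE = 6.5e-3  # K per metre; standard-atmosphere value (assumption)


def correct_temperature(t_model, z_model, z_station, lapse_rate=STANDARD_LAPSE_RATE):
    """Shift a model temperature from the model elevation to the station elevation."""
    # A grid cell lying above the station is too cold, so the correction is positive.
    return t_model + lapse_rate * (z_model - z_station)


# Hypothetical example: model grid cell at 880 m, station at 422 m asl.
corrected = correct_temperature(t_model=12.0, z_model=880.0, z_station=422.0)
```

Under these assumed numbers the correction amounts to about 3 K, but the actual elevation mismatch in the forecasting system is not given in the text.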
Following the suggestions of Müller et al. (2005) and Weigel et al. (2007, 2008), the de-biased ranked probability skill score was adopted as a measure of forecast quality. This measure is not sensitive to the size of the ensemble, which in the present case differs considerably between forecasts (51 individual members) and re-forecasts (5 members). For all fields, mean values were computed depending on the lead time, taking averages over days 5–11, 5–18, 5–25 and 5–32 for weekly, bi-weekly, 3-weekly and monthly forecasts, respectively.
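A minimal sketch of a de-biased ranked probability skill score is given below. It assumes K equiprobable climatological categories and an ensemble of M members, with the correction term written as D = (K² − 1)/(6KM). This form is stated here as an assumption and should be checked against Weigel et al. (2007). All function names are hypothetical, and the category thresholds are taken as given.

```python
def rps(probs, obs_cat):
    """Ranked probability score of one forecast given category probabilities."""
    score, cum_f, cum_o = 0.0, 0.0, 0.0
    for k, p in enumerate(probs):
        cum_f += p                               # cumulative forecast probability
        cum_o += 1.0 if k == obs_cat else 0.0    # cumulative observation (step function)
        score += (cum_f - cum_o) ** 2
    return score


def ensemble_probs(members, thresholds):
    """Fraction of ensemble members in each category delimited by thresholds."""
    counts = [0] * (len(thresholds) + 1)
    for m in members:
        counts[sum(m > t for t in thresholds)] += 1
    return [c / len(members) for c in counts]


def rpss_debiased(forecast_probs, obs_cats, n_members):
    """RPSS_D = 1 - <RPS> / (<RPS_cl> + D), assuming K equiprobable categories."""
    k = len(forecast_probs[0])
    clim = [1.0 / k] * k
    n = len(obs_cats)
    mean_rps = sum(rps(p, o) for p, o in zip(forecast_probs, obs_cats)) / n
    mean_rps_cl = sum(rps(clim, o) for o in obs_cats) / n
    d = (k * k - 1) / (6.0 * k * n_members)      # assumed ensemble-size correction
    return 1.0 - mean_rps / (mean_rps_cl + d)
```

Because D shrinks as the ensemble grows, the penalty that small ensembles incur in the conventional score is compensated, which is why such a score is attractive when 5-member re-forecasts are evaluated alongside 51-member forecasts.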
Results are presented in Table 1 and reveal several interesting features: (i) largely positive skill values up to a month are obtained for all variables, suggesting that monthly probabilistic forecasts are no worse than climatology even in a region characterized by complex topography and substantial biases in the model output; (ii) skill scores for precipitation and solar radiation are significantly lower than for near-surface temperature but still significantly larger than zero up to a lead time of c. 20 days; (iii) for temperature and solar radiation, but not for precipitation, skill scores calculated only over the 3 summer months (see values in brackets in the table) are better than those computed for the year as a whole, which is somewhat at odds with the findings of Weigel et al. (2008) for the northern extra-tropics.
BIAS CORRECTION OF DAILY PRECIPITATION
In view of the importance of rainfall for crop production, the question of improving the quality of precipitation forecasts arises naturally in the context of agricultural decision problems. A simple multiplicative correction of the rainfall intensity provides unbiased estimates of monthly or seasonal precipitation amounts and is readily implemented in practice. However, it is inappropriate for studies that require information on a daily or weekly time scale.
As an alternative, Ines & Hansen (2006) examined the possibility of applying a two-step correction directly to the daily output of an ensemble forecasting system. Essentially similar to the procedure proposed by Schmidli et al. (2006), the approach involves (i) discarding rainfall events below a calibrated threshold to match the observed frequency of wet days and (ii) mapping the truncated distribution onto a gamma distribution fitted to the observed intensity distribution. Ines & Hansen (2006) tested the procedure for a location in semi-arid Kenya, and discussed the implications for the simulation of maize growth. They concluded that while the procedure does effectively remove the bias in both frequency of wet days and rainfall intensity, it fails to account for the auto-correlation structure in the observed time series, with negative consequences for simulated maize yields.
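The two steps can be sketched as follows. To keep the example free of external dependencies, step (ii) is implemented here as an empirical quantile mapping rather than the gamma-based mapping described above, so it illustrates the idea and is not the published procedure. The function names and the wet-day limit of 0·1 mm for observations are assumptions.

```python
import bisect


def calibrate_threshold(forecast, observed, wet_obs=0.1):
    """Forecast threshold that reproduces the observed wet-day frequency."""
    wet_freq = sum(o > wet_obs for o in observed) / len(observed)
    ranked = sorted(forecast, reverse=True)
    n_wet = round(wet_freq * len(ranked))
    # Values strictly above the returned threshold count as wet days.
    return ranked[n_wet] if n_wet < len(ranked) else float("-inf")


def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def correct_daily(forecast, observed, wet_obs=0.1):
    """Step (i): truncate below the threshold; step (ii): map wet-day intensities."""
    threshold = calibrate_threshold(forecast, observed, wet_obs)
    fc_wet = sorted(f for f in forecast if f > threshold)
    obs_wet = sorted(o for o in observed if o > wet_obs)
    corrected = []
    for f in forecast:
        if f <= threshold or not obs_wet:
            corrected.append(0.0)
        else:
            # Non-exceedance probability of f among the forecast wet days.
            q = (bisect.bisect_right(fc_wet, f) - 0.5) / len(fc_wet)
            corrected.append(quantile(obs_wet, q))
    return corrected


# Hypothetical daily series (mm): too many wet days, too little intensity.
forecast = [0.2, 0.5, 1.0, 0.0, 3.0, 0.3, 6.0, 0.1]
observed = [0.0, 0.0, 4.0, 0.0, 9.0, 0.0, 15.0, 0.0]
corrected = correct_daily(forecast, observed)
```

Each day is mapped independently of its neighbours, so nothing in the procedure constrains the day-to-day sequence of the corrected values, which is consistent with the limitation regarding auto-correlation noted above.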
Nevertheless, the procedure could still be of practical use in a number of situations. Therefore, its performance was tested in relation to the forecasted precipitation over the Swiss Plateau. Specifically, the study examined whether the procedure has a positive impact on skill scores and, if not, why not.
As before, use was made of the ECMWF 1994–2005 monthly re-forecasts to evaluate the prediction skill, resorting for illustrative purposes to a skill score based on the mean square error without adjustment for the ensemble size (Murphy 1988, equation (15)). This has the advantage that the skill score can be decomposed into terms reflecting different aspects of the degree of agreement between forecasts and observations (Murphy 1988), namely: (i) the square of the correlation coefficient, i.e. a measure of the strength of the linear relation between forecasts and observations; (ii) a term related to the square of the slope of the regression line between forecasts and observations; (iii) a term proportional to the square of the difference between the mean forecast and mean observation, i.e. a non-dimensional measure of the overall bias in the forecasts. Terms (ii) and (iii) have a negative impact on the skill score whenever the slope of the regression line between forecasts and observations departs from 1 or the forecasts are biased, respectively.
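The decomposition can be written compactly as SS = r² − (r − s_f/s_x)² − ((f̄ − x̄)/s_x)², where r is the correlation coefficient, s_f and s_x are the standard deviations of forecasts and observations, and f̄ and x̄ are their means. The sketch below evaluates the three terms under the assumption that the reference forecast is the sample mean of the observations; the function name is hypothetical.

```python
from statistics import fmean, pstdev


def murphy_decomposition(forecasts, observations):
    """Return (skill, r**2, conditional-bias term, unconditional-bias term)."""
    f_mean, x_mean = fmean(forecasts), fmean(observations)
    s_f, s_x = pstdev(forecasts), pstdev(observations)  # population standard deviations
    n = len(forecasts)
    cov = sum((f - f_mean) * (x - x_mean) for f, x in zip(forecasts, observations)) / n
    r = cov / (s_f * s_x)
    association = r ** 2
    # Vanishes when the slope of the regression of observations on forecasts equals 1.
    conditional_bias = (r - s_f / s_x) ** 2
    # Squared mean error, scaled by the variance of the observations.
    unconditional_bias = ((f_mean - x_mean) / s_x) ** 2
    skill = association - conditional_bias - unconditional_bias
    return skill, association, conditional_bias, unconditional_bias


# Hypothetical forecast and observation pairs.
fc = [1.0, 2.0, 3.0, 4.0, 6.0]
ob = [2.0, 2.0, 4.0, 5.0, 5.0]
skill, assoc, cond, uncond = murphy_decomposition(fc, ob)
```

The skill value returned equals 1 − MSE/s_x², so the three terms show directly whether a loss of skill stems from weaker association, a regression slope departing from 1, or an overall bias.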
For the Swiss Plateau, it was found that the correction procedure proposed by Ines & Hansen (2006) had an overall negative impact on the skill score, with e.g. values for week 1 decreasing from 0·3 using uncorrected data to less than 0·2 using bias-corrected data. Decomposition of the skill score revealed that skill loss was mainly associated with a larger departure of the slope of the regression line between forecasts and observations from the ideal 1:1 line than found with the uncorrected data. In turn, this was prompted by changes in the statistical structure of the time series of daily values. Thus, not only is the procedure unable to recover the observed auto-correlation, as found by Ines & Hansen (2006), but in some instances it can even lead to a deterioration of the statistical properties of the original forecasts.
Based on these findings, an alternative procedure was chosen for the post-processing of monthly forecasts. The approach is based on creating daily input data for application models with the help of a stochastic weather generator and is described in more detail in the next section.
FORECASTING SOIL WATER AVAILABILITY
The availability of soil water remains one of the main determinants of crop growth, particularly in rain-fed agriculture. For this reason, the possibility of predicting soil water availability up to a month in advance was studied, as an example, by linking the ECMWF monthly weather forecasts to a simple model of the soil water balance. In short, the model represents the root zone as a simple bucket, and computes changes in soil water storage in response to inputs from precipitation and outputs from evapotranspiration and deep percolation (see e.g. Rodriguez-Iturbe et al. 1999). Computation of the evaporative flux is carried out following Calanca (2007), with potential evapotranspiration estimated using the Priestley–Taylor equation (Priestley & Taylor 1972) and actual evapotranspiration limited according to soil water content.
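A bucket model of this kind can be sketched in a few lines. The sketch is illustrative only: the linear reduction of evapotranspiration below a stress point, the storage parameters, the use of net radiation as input and the neglect of the soil heat flux are all assumptions, and the formulation of Calanca (2007) is not reproduced. The saturation vapour pressure formula and the constants follow common agrometeorological practice.

```python
import math

ALPHA_PT = 1.26   # Priestley-Taylor coefficient
GAMMA = 0.066     # psychrometric constant, kPa/K (approximate, low elevation)
LAMBDA = 2.45     # latent heat of vaporization, MJ/kg


def priestley_taylor(t_air, net_radiation):
    """Potential evapotranspiration (mm/day) from temperature (deg C) and
    net radiation (MJ m-2 day-1); the soil heat flux is neglected."""
    es = 0.6108 * math.exp(17.27 * t_air / (t_air + 237.3))  # saturation vapour pressure, kPa
    delta = 4098.0 * es / (t_air + 237.3) ** 2               # slope of the curve, kPa/K
    return max(0.0, ALPHA_PT * delta / (delta + GAMMA) * net_radiation / LAMBDA)


def step_bucket(storage, precip, pet, capacity=150.0, stress_point=75.0):
    """Advance root-zone storage (mm) by one day; parameters are hypothetical."""
    # Actual ET equals PET above the stress point and declines linearly below it.
    aet = min(pet * min(1.0, storage / stress_point), storage)
    storage = storage + precip - aet
    percolation = max(0.0, storage - capacity)  # water above capacity drains
    return storage - percolation, aet, percolation


# Hypothetical 3-day sequence: (air temperature, net radiation, precipitation).
storage = 100.0
for t_air, rn, rain in [(20.0, 10.0, 0.0), (22.0, 12.0, 0.0), (15.0, 5.0, 8.0)]:
    storage, aet, perc = step_bucket(storage, rain, priestley_taylor(t_air, rn))
```

Because each day's storage depends on that of the previous day, the initial state influences the simulated storage for some time, which is the memory effect that the application exploits.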
Apart from the meteorological drivers, the model requires specification of the soil hydraulic properties. The parameters were calibrated by fitting model simulations to measurements of the soil water content from a field experiment running since 2002 at a location close to the study site (Ammann et al. 2007). Results from the calibration showed that the model reproduces soil water dynamics reasonably well, including key features of the drought that accompanied the summer 2003 heat wave (length of dry spells, cumulated soil water deficit, etc.).
For the reasons detailed in the previous section, the probabilistic monthly forecasts were translated into daily realizations of temperature, precipitation and solar radiation consistent with the monthly forecasts with the help of a weather generator. This approach is common in climate studies (Wilks 2002; Hansen & Indeje 2004; Feddersen & Andersen 2005; Lawless & Semenov 2005). The generator used was the LARS-WG stochastic weather generator (Semenov & Barrow 1997; Semenov et al. 1998).
The generator was first conditioned with 25 years of daily weather observations and subsequently run to generate 3000 years of daily data consistent with the observed climatology. Then, 51 realizations were selected at random from this pool to reproduce the joint probability distributions of temperature, precipitation and solar radiation anomalies for each starting date and lead time indicated by the forecasts. These were then used to drive the soil moisture model over the forecasting period.
To account for the memory effect in soil water dynamics and the sensitivity of forecasted soil water to initial conditions, simulations were initialized by running the model from the beginning of each year up to each starting date using observed weather data (or by setting the initial soil water content to field capacity for the first forecasting period of the year).
An example of ensemble simulations obtained in this manner is shown in Fig. 1. Each panel refers to a different lead time, and predicted soil moisture evolution for each of the individual ensemble members is plotted on the background of a reference simulation driven with observed weather data and a 12-year climatology obtained from simulations for 1994–2005. The example refers to the year 2003, which was characterized by exceptionally high temperatures (Schär et al. 2004), considerable water deficits over extended periods of time (Calanca 2007), and therefore a marked departure from the climatology.
Figure 1 illustrates two key features of the application. First, the dispersion of the ensemble (the spread of the forecast plume) is already considerable for a lead time of 2 weeks, and in some cases even for a lead time of 1 week. Second, even for lead times of 3 weeks, individual members tend to cluster around the reference rather than the climatology. However, systematic departures from the reference can be seen on occasions, suggesting that the forecasts do not always capture the observed evolution, not even in a probabilistic sense.
A similar analysis for all years between 1994 and 2005 confirms the existence of significant predictive skill (larger than or of the order of 0·2) for lead times of up to 1 month (Table 2). Compared to precipitation, this represents at least a 2-week extension of the predictability limit. However, owing to a weaker performance of the ECMWF forecasting system in relation to summer precipitation, skill scores for the summer seasons are slightly smaller than for the year as a whole. Moreover, Fig. 1 suggests that this particular application benefits to a large extent from the initialization procedure. It therefore appears that applications possessing some degree of inertia may partially overcome intrinsic weaknesses of long-range weather forecasts. In a practical context, this property could help with identifying other suitable types of applications.
DISCUSSION
In view of the persistent improvements in the field of weather forecasting, there is a real possibility that agriculture, as with other weather-sensitive human activities, could take advantage of the information provided by long-range forecasts. Efforts to deliver better forecasts have been undertaken along different lines of research, e.g. developments in physical parameterization schemes and multi-model ensemble forecasting (Palmer et al. 2004) and also the re-calibration of single-model ensemble predictions (Weigel et al. 2009).
In spite of this promising background, systematic efforts to foster the use of such forecasts in European agriculture still need to be undertaken. To some extent, this is understandable given the limited skill of long-range forecasts, first and foremost in relation to precipitation. But limited predictability is also of concern in relation to other variables of interest in agrometeorology, such as solar radiation or air humidity. To date, little has been done to evaluate systematically the performance of forecasting systems with respect to these variables. Promoting targeted research in this direction could help to increase confidence in the forecasts.
In the present study, the quality of monthly precipitation forecasts was examined more closely using the example of the monthly ensemble forecasts issued by the ECMWF for Switzerland. It was found that extending the averaging time beyond the weekly scale at which the forecasts are issued could provide useful forecasts up to about 15 days. This is a sensitive time scale for decision problems involving, e.g. irrigation. Correction of the daily precipitation output, as suggested by Ines & Hansen (2006), did not improve the prediction skill, because the procedure tends to destroy the auto-correlation structure of the daily time series.
Therefore, there is some evidence that processing the output of ensemble forecasting systems via statistical downscaling and/or stochastic weather generation remains, for the time being, the method of choice for retrieving information at the temporal and spatial scales required by application models (Wilks 2002). There have been considerable developments along these lines in recent years (see e.g. Feddersen & Andersen 2005), including, in particular, the generation of spatially coherent data (see e.g. Wilks 1998, 1999; Semenov & Brooks 1999). However, further research is needed to develop the downscaling of variables other than near-surface temperature and precipitation (Huth 2005). Additional efforts are also necessary to improve the downscaling of extreme events (Semenov 2008), regardless of progress achieved in the recent past (e.g. Busuioc et al. 2008; Hundecha & Bárdossy 2008).
From a practical point of view, a case study was presented in which monthly forecasts proved to have some benefits in predicting soil moisture availability up to a month ahead. Linking monthly forecasts with even a simple soil moisture model could thus serve as a starting point for developing a prediction system aimed at guiding irrigation. In presenting the results it was noted that in this case the decision process benefits from memory effects inherent to the system under consideration (the soil water store). It is not guaranteed that the same conclusion would have been reached had the investigation targeted a different application.
This shows the importance of evaluating the set-up of a decision support system relying on long-range forecasts on a case-by-case basis. Questions that need to be addressed are, among others, whether the forecasts are in an appropriate form, predict the proper variables and refer to the relevant time scales (Wilks 1997; Garbrecht et al. 2005). With regard to the last of these questions, Lawless & Semenov (2005) have shown that the lead time in the prediction of crop yield not only varies between locations but also depends on the crop characteristics affecting the decision process at a single location. In addition, sufficient care should be taken to fully exploit the probabilistic format of long-range forecasts (Doblas-Reyes et al. 2007).
Examining the performance of decision support systems in the specific context of their application is also of paramount importance (Hansen et al. 2006). Showing that forecasting systems provide a more reliable decision basis than approaches relying on experience and knowledge of climatology and/or persistence is one of the prerequisites for fostering a positive attitude toward long-range forecasts among end-users (McCrea et al. 2005). Opening the access to the forecasts, ensuring a better link between the climate and agricultural communities and fostering capacity-building activities are other measures that could help to promote the use of long-range forecasts (Garbrecht et al. 2005). In particular, it is fundamental that end-users and decision-makers appreciate the difference between deterministic short- to medium-range weather forecasts and probabilistic monthly and seasonal forecasts.
Progress in long-range weather forecasting is important in the context of climate change as well, as timely information is needed for risk management and to devise effective measures of adaptation. The impacts of events such as the European heat wave in summer 2003, with uninsured losses for the agricultural sector estimated at US$12·4 billion (SwissRe 2004), can be dramatic. Climate scenarios suggest that in many parts of Europe events of this sort could occur more often in the future (Beniston 2004). Increasing the preparedness of farmers and other end-users is therefore essential to reduce the economic impacts of climate variability and limit the consequences of extreme events. It is encouraging to see that recent developments at ECMWF have already helped to improve significantly the predictability of extreme European summer temperatures (Weisheimer et al. 2009).
Part of this work has grown from activities of the lead author for the Commission on Agricultural Meteorology of the World Meteorological Organization (WMO) and is a contribution to COST Action 734 (Impact of Climate Change and Variability on European Agriculture, http://www.cost734.eu). The case study was supported by the Swiss National Science Foundation through the National Centre for Competence in Research on Climate (NCCR Climate). We thank Mikhail Semenov (Rothamsted Research, Centre for Mathematical and Computational Biology, Harpenden, Herts, AL5 2JQ, U.K.) for making LARS-WG available and two anonymous reviewers for useful comments.