INTRODUCTION
Legionellosis refers to a number of diseases caused by infection with bacteria of the Legionellaceae family. Although more than 50 Legionella spp. have been identified, Legionella pneumophila is the cause of most Legionnaires' disease (LD) cases [Reference Diederen1]. The disease onset is usually 2–14 days after infection with the bacteria [2] and may cause serious morbidity with pneumonia, fever, nausea, abdominal pain, diarrhoea, vomiting and bradycardia. Neurological symptoms, although rare, include headache, lethargy and encephalopathy.
The first described outbreak of LD took place in 1965 in patients of a psychiatric hospital in Washington, DC, and had a documented case fatality of 17·3%. However, it was not until 1976, after a fourth outbreak of LD had occurred in the United States, that the bacterium was isolated and named [Reference Thackers3]. In The Netherlands, the first documented large LD outbreak took place in 1999 after common exposure to a contaminated spa pool at a flower exhibition. There were 188 confirmed and probable L. pneumophila pneumonia cases with 21 deaths (case fatality 11·2%) [Reference Boer4]. In 2006, a second outbreak of LD involving 31 cases with three deaths was notified in Amsterdam, after airborne transmission of the bacteria from a contaminated cooling tower [Reference v.d. Hoek, Ijzerman and Coutinho5, Reference v.d. Sande6].
Following this localized point-source outbreak, there was a considerable rise in sporadic LD cases in late summer that could not be attributed to a common source or to changes in the reporting system. This increase in sporadic cases followed a record hot July and a record wet August. A similar generalized outbreak was observed in the United Kingdom [7, Reference Joseph and v.d. Sande8], but not in neighbouring Belgium and Germany [Reference Swaan9]. Nevertheless, it was hypothesized that ‘something to do with the weather’ could be responsible for this increase in LD cases. This study investigates short-term effects of the weather on the incidence of sporadic LD cases in The Netherlands, which might contribute to our understanding of the determinants of this unexpected increase in notified cases in 2006. Furthermore, this study may prove useful for predicting similar increases in the future and could, therefore, support early warning of the risk of future outbreaks.
MATERIALS AND METHODS
The data used in the present analysis are from the national surveillance database (OSIRIS) and include every person notified with LD between 1 July 2003 and 30 September 2007. The infection date was calculated from the date of symptom onset by deducting 5 days, the median incubation period of LD. All dates were then converted to weeks and the analysis was based on weekly observations in order to avoid dealing with zero counts and to reduce random error. Patients who had travelled outside The Netherlands during the incubation period were not included in this study. The analysis was restricted to the warmest period of the year (April–September) because the smaller number of LD cases in the cold period could influence the results disproportionately. This period is often quoted as the warmest part of the year in temperate climates [Reference Michelozzi10] and roughly coincides with weeks 18–39. Moreover, the two seasons are characterized by different atmospheric circulation patterns, which also favours a separate analysis; we were interested in understanding what determines the different epidemiological patterns in different summers, rather than in improving our understanding of the seasonality of LD. Cases that were identified as being part of a known cluster of LD cases were also excluded from this study; only the first cases in time (‘index cases’) were retained in the analysis.
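The date handling described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes ISO week numbering (the text does not state which week convention was used), and the onset dates are hypothetical.

```python
from datetime import date, timedelta

INCUBATION_DAYS = 5           # median incubation period stated in the text
WARM_WEEKS = range(18, 40)    # weeks 18-39 inclusive

def infection_date(onset: date) -> date:
    """Estimated infection date: symptom onset minus the median incubation period."""
    return onset - timedelta(days=INCUBATION_DAYS)

def warm_period_weekly_counts(onsets):
    """Count cases per (year, week) of estimated infection, keeping warm weeks only.

    Assumption: ISO week numbering is used to convert dates to weeks.
    """
    counts = {}
    for onset in onsets:
        iso_year, iso_week, _ = infection_date(onset).isocalendar()
        if iso_week in WARM_WEEKS:
            key = (iso_year, iso_week)
            counts[key] = counts.get(key, 0) + 1
    return counts

# Hypothetical onset dates, for illustration only.
example_onsets = [date(2006, 8, 14), date(2006, 8, 16), date(2006, 1, 10)]
weekly = warm_period_weekly_counts(example_onsets)
```

In this example the two August cases fall in the same warm-period week, while the January case is dropped because its estimated infection date lies outside weeks 18–39.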
Meteorological data were collected from the weather station of the Royal Dutch Meteorological Institute (KNMI) in De Bilt. The meteorological variables that were included in this analysis were mean weekly temperature, weekly sunshine duration, mean weekly cloudiness, mean weekly wind speed, weekly precipitation, mean weekly rainfall intensity and mean weekly relative humidity (RH).
Analysis with weather variables
We investigated the associations between the weather and LD incidence by extending Poisson regression generalized linear models (GLMs) with loess smoothers [Reference Hastie and Tibshirani11, Reference Schwartz12]. The extended model allows the inclusion of non-parametric smooth functions to model the potentially nonlinear dependence of LD morbidity on weather variables and the season. The assumption of this model is that
$$\log E(Y) = \beta_0 + \sum_{i=1}^{p} S_i(X_i)$$
where Y is the weekly number of LD cases, E(Y) is the expected value of this count, \beta_0 is the intercept, X_i (i=1, 2, …, p) are the covariates and each S_i is a smoothing function. To control for long-term seasonality, we used loess smoothing for the LD count data [Reference Cleveland and Devlin13]. The equivalent smoothing window was chosen with the help of partial autocorrelation plots and plots of residuals. We had decided in advance that the smoothing window should not be less than 2 months in order to avoid eliminating underlying short-term patterns, as has been suggested by other authors [Reference Kassomenos14, Reference Kassomenos, Gryparis and Katsouyanni15]. The smoothing window that minimized partial autocorrelation in the residuals after loess smoothing was 28 weeks.
To allow for the assessment of nonlinear associations between weather variables and LD incidence, we also performed quadratic transformation of the weather variables. The contribution of all variables in the Poisson regression models was assessed with the use of Wald tests.
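To illustrate why a quadratic term allows incidence to peak at an intermediate value of a weather variable, the sketch below assumes a log-linear model of the form log E(Y) = b0 + b1·T + b2·T², which has a maximum at T = −b1/(2·b2) when b2 is negative. The coefficients are hypothetical and were chosen only so that the peak falls at 17·5; they are not estimates from this study.

```python
import math

def peak_of_quadratic(b1: float, b2: float) -> float:
    """Value of T that maximizes b0 + b1*T + b2*T**2 (requires b2 < 0)."""
    if b2 >= 0:
        raise ValueError("an interior maximum exists only if the quadratic coefficient is negative")
    return -b1 / (2.0 * b2)

def expected_count(b0: float, b1: float, b2: float, t: float) -> float:
    """Expected weekly count under a log link with a quadratic temperature term."""
    return math.exp(b0 + b1 * t + b2 * t * t)

# Hypothetical coefficients, for illustration only.
b0, b1, b2 = -5.0, 0.70, -0.02
peak = peak_of_quadratic(b1, b2)
```

Under these assumed coefficients the expected count is lower on either side of the peak, which is the pattern a purely linear term cannot represent.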
Analysis with weather types
Principal components analysis (PCA), a factor analysis technique that rewrites the original data matrix into a new set of components that are linearly independent and ordered by the amount of variance they explain, was used in order to limit the number of variables used [Reference Fisman16]. To do so, component loadings were calculated, which expressed the correlation between the original variables and the newly formed components. Each week was then expressed by its particular set of ‘component scores’, which are weighted summed values, the magnitudes of which depended on the weather observations for each week and the principal component loading. Thus, weeks with similar meteorological conditions tend to exhibit proximate component scores. At the same time, the use of too many independent variables is avoided; the new components reflect the synergy of the initial variables. We then used a clustering procedure to group weeks with similar component scores into the same categories of weeks that are meteorologically more homogeneous. The method used here was the average linkage method, which is considered the most efficient method in clustering meteorological variables [Reference Kalkstein, Tan and Skindlov17]. The mean weekly LD incidence was then calculated for every synoptic (weather) category, along with its standard deviation, to ascertain the distribution of LD incidence by synoptic category.
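The clustering and summary steps described above can be sketched as follows. This is a simplified illustration under stated assumptions: the PCA step is omitted and the component scores are taken as already computed, Euclidean distance between scores is assumed, and all scores and case counts are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, stdev

def average_linkage(points, n_clusters):
    """Agglomerative clustering that repeatedly merges the two clusters
    with the smallest mean pairwise (Euclidean) distance."""
    clusters = [[i] for i in range(len(points))]

    def mean_distance(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: mean_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return [sorted(c) for c in clusters]

def incidence_by_cluster(clusters, weekly_cases):
    """Mean and standard deviation of weekly case counts per cluster.
    The standard deviation is None for clusters containing a single week."""
    summary = []
    for members in clusters:
        values = [weekly_cases[i] for i in members]
        sd = stdev(values) if len(values) > 1 else None
        summary.append((mean(values), sd))
    return summary

# Hypothetical component scores (one pair per week) and weekly case counts.
scores = [(0.0, 0.1), (0.2, 0.0), (3.0, 3.1), (3.2, 2.9), (-2.0, 4.0)]
cases = [1, 3, 6, 8, 2]
clusters = average_linkage(scores, n_clusters=3)
summary = incidence_by_cluster(clusters, cases)
```

Weeks with nearby component scores end up in the same category, and a category containing a single week has no standard deviation, which mirrors the "n.a." entries that arise for sparsely populated categories.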
Missing values management
For six cases, the date of symptom onset was unknown. To estimate that date for those patients, we calculated the median lag between date of symptom onset and date of report in OSIRIS for the rest of the patients. That lag, namely 5 days, was then deducted from the report date of the six patients with unknown symptom onset date. Then, as for the rest of the cases, an extra 5 days were deducted from their estimated date of symptoms onset to calculate the most probable date of exposure and infection.
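The imputation described above can be sketched as follows. This is an illustration only: the dates are hypothetical, and rounding the median lag to whole days is an assumption made here so that it can be subtracted from a calendar date.

```python
from datetime import date, timedelta
from statistics import median

INCUBATION_DAYS = 5  # median incubation period stated in the text

def median_report_lag(known_pairs):
    """Median number of days between symptom onset and report,
    computed from cases where both dates are known."""
    lag = median((report - onset).days for onset, report in known_pairs)
    return round(lag)  # assumption: use whole days

def impute_onset(report_date, known_pairs):
    """Estimated onset date for a case with only a report date."""
    return report_date - timedelta(days=median_report_lag(known_pairs))

def exposure_date(onset):
    """Most probable date of exposure: onset minus the median incubation period."""
    return onset - timedelta(days=INCUBATION_DAYS)

# Hypothetical (onset, report) pairs with lags of 3, 5 and 8 days.
known = [
    (date(2006, 7, 1), date(2006, 7, 4)),
    (date(2006, 7, 10), date(2006, 7, 15)),
    (date(2006, 8, 2), date(2006, 8, 10)),
]
estimated_onset = impute_onset(date(2006, 8, 20), known)
estimated_exposure = exposure_date(estimated_onset)
```

With a median lag of 5 days, as in this example and in the study, the estimated exposure date lies 10 days before the report date.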
RESULTS
In total, 707 cases of LD with The Netherlands as the most probable country of infection were notified to RIVM through OSIRIS between 1 July 2003 and 30 September 2007. Of these cases, 432 (61·1%) had been infected during the warm period of the year (weeks 18–39). Table 1 presents the basic characteristics of the weather during the two distinct periods at the KNMI weather station in De Bilt, as well as the weekly number of LD cases in The Netherlands. The warm period is drier and 9·3°C warmer than the cold period and, even though total precipitation does not differ statistically significantly between the two periods, it tends to fall more intensely in the warm period (P<0·0001).
Sources: The Royal Dutch Meteorological Institute (KNMI); the national surveillance database (OSIRIS).
Weather variables
Univariate analysis
The weekly incidence of LD appears to peak when the mean weekly temperature is +17·5°C (P<0·001). Higher temperatures in the 2 weeks preceding exposure further contribute to a higher LD incidence (P=0·002 and P=0·004 respectively). An increase in weekly sunshine of 1 h is associated with an increase in LD incidence of 1·8% (95% CI 1·2–2·4). Similarly, weeks with increased cloud cover are associated with the highest LD incidence; more cases of LD occur when the cloudiness is 7 oktas, i.e. when the mean weekly cloud cover is seven-eighths. RH is also associated with LD incidence: a 1% increase in RH is associated with an increase in LD incidence of 6·4% (95% CI 4·7–8·2). LD occurrence is highest when average weekly precipitation intensity is 3 mm/h (P=0·004). Last, LD incidence is highest when weekly precipitation is between 40 mm and 60 mm. Univariate analysis results are given in Table 2.
Notes to Table 2: n.a., not available.
* Mean weekly temperature has a quadratic association with the weekly incidence of Legionnaires' disease in the multivariable model.
Multivariable analysis
The multivariable model suggests that, adjusting for long-term trends, mean weekly precipitation intensity, mean weekly temperature and mean RH contribute to the explanation of LD incidence in The Netherlands. Mean weekly temperature has a quadratic association with LD incidence. An increase in the mean weekly precipitation intensity of 1 mm/h is associated with an increase in LD incidence of 14·8% (95% CI 4·2–26·6), while higher weekly values of mean RH are also associated with higher LD incidence; a 1% increase in mean weekly RH is associated with an increase in LD incidence of 5·1% (95% CI 2·9–7·5). There was no statistically significant interaction between the variables that were included in the final multivariable analysis model.
The multivariable analysis model, based purely on meteorological data and long-term trends of LD incidence, was able to explain 43·3% of the variance in the epidemiological data. Actual and predicted weekly case counts are presented in Figure 1. The variables contributing to the multivariable analysis model are presented in Table 2.
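The percentage changes reported for the multivariable model relate to regression coefficients on the log scale, so the effect of a change of several units compounds multiplicatively rather than adding up. The sketch below shows that arithmetic; it assumes the reported percentages are incidence rate ratios per unit from a log-link model and that the log-linear relationship holds over the range considered, which the text does not confirm. The function names are illustrative.

```python
import math

def coefficient_from_percent(percent_per_unit: float) -> float:
    """Log-scale coefficient implied by a percentage change per unit."""
    return math.log(1.0 + percent_per_unit / 100.0)

def percent_change(percent_per_unit: float, units: float) -> float:
    """Percentage change in incidence for an increase of `units`,
    assuming log-linearity across that range."""
    beta = coefficient_from_percent(percent_per_unit)
    return (math.exp(beta * units) - 1.0) * 100.0

# Reported per-unit estimates: 14.8% per 1 mm/h of precipitation intensity,
# 5.1% per 1% of relative humidity.
intensity_2_mm_h = percent_change(14.8, 2)
rh_10_points = percent_change(5.1, 10)
```

Under these assumptions, a 2 mm/h increase in precipitation intensity corresponds to an increase of about 32% rather than 29·6%, because the per-unit ratio is applied twice.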
Weather types
Factor analysis limited the number of used variables to three components that could still explain 76·4% of the variability existing among all weather variables initially used in the analysis. Ten major synoptic weather classifications were identified using the clustering procedure (Table 3).
Notes to Table 3: n.a., not available.
* Three weeks not categorized due to sparse data.
† Only 1 day was classified in category 9.
The synoptic category with the highest LD incidence is category 10; this category includes weeks with a mean temperature close to the overall mean for weeks 18–39 at the De Bilt station. The average RH in that category is 88·0% and the sun shines for 24·9 h/week (an average of 3·6 h/day). In contrast, weather categories 5 and 6 are linked with the fewest LD cases per week; the weather is then sunnier, with 37·5 h and 47·6 h of sunshine per week respectively. Category 7 has an average of zero LD cases per week, but this category was encountered only twice in the sample. The hottest summer weeks, represented in category 2 (mean weekly temperature 23·2°C), do not coincide with the highest LD incidence in the country.
DISCUSSION
Two major outbreaks of LD have been described in The Netherlands since 1999, increasing interest in this disease. In 2006, a record number of LD cases were notified throughout the country; most LD cases reported that year were sporadic and could not be assigned to any clearly defined outbreaks.
The results presented in this study suggest a possibly direct role of the natural environment in the epidemiology of Legionella. Our time-series methodology provides evidence that LD incidence in The Netherlands is highest when the weather is warm and wet during the summer. Very hot days, which occur infrequently, do not coincide with the highest incidence of LD throughout the year. Even though disease occurrence can be confounded by factors such as different population behaviour depending on the weather, our study attempts to control for long-term and seasonal patterns, focusing on short-term effects of the weather on LD incidence only.
The two methods used to analyse the data gave comparable results. In the first analysis, where GLMs were used, gloomy and wet summer weather was shown to be associated with the weeks with the highest LD morbidity in The Netherlands. Extensive cloud cover, low sunshine, mean temperatures around 17·5°C, high RH readings and intense precipitation were independently found to be linked to the highest LD incidences. In the multivariable model, temperature, precipitation intensity and RH could explain 43·3% of total LD incidence variability. The clustering technique, on the other hand, showed that very humid and wet conditions with little sunshine and close to average temperatures are associated with the highest LD incidence.
The results of the PCA, which yielded several distinct weather types, can prove easier to interpret than the results of the Poisson regression analysis. Mean weekly temperature has, for instance, a quadratic association with the incidence of LD in The Netherlands. Even though this finding allowed us to see that LD incidence is maximized for mean weekly temperatures of 17·5°C and that the effect of the latter is not linear, incidence rate ratios (IRRs) cannot be interpreted easily. On the other hand, the categorization of periods of time into well-defined weather types – such as, indeed, ‘warm, wet weather’ – can facilitate the understanding of the effect of weather conditions on the incidence of LD.
Some of the weather variables that appear in the univariate analysis results fail to be statistically significant in the multivariable model. This can be explained by collinearity between some of the weather variables; low cloud cover is, for instance, associated with little or no precipitation and high sunshine totals. Rainy days are, on the other hand, more humid than dry ones. Hence, the selection of some variables in the model automatically excludes other variables from being included in the model.
Regarding precipitation and RH, our results show that warm and wet weather patterns, but not the hottest ones, are associated with more LD cases. This finding is consistent with the results found by Fisman et al. [Reference Fisman16], although the subject was approached with a different methodology. Our findings also correspond to the ecological profile of Legionella [Reference Stout, Yu and Best18], a principally aquatic microorganism.
The present study has some limitations. Data on the most likely country of infection rely wholly on the answers given by patients, so misclassification is possible to some extent. Moreover, the definition of most likely country of infection used here was the same as in the national surveillance system; that means that patients who had been outside the country during part of the incubation period were classified as having acquired their infection abroad, which may be untrue for some of them. However, random misclassification of this exposure would lessen the strength of the associations we found.
A second limitation of this study is the source of meteorological data, which are derived from the De Bilt weather station. It may be preferable to select data from different stations around the country to allow for better estimation of the individual exposure of LD cases to the weather conditions. However, De Bilt is in the middle of the country and shares some of the maritime climate features of the west and north of the country and some of the more continental elements of the climate in the south and east of the country; for these reasons, it can be considered representative of The Netherlands [Reference Verbeek19].
We have deliberately chosen to exclude from our analysis all LD cases that were part of a cluster, except for the index cases, i.e. the cases with the earliest symptom onset date per cluster. Each cluster is seen as an independent incident of human exposure to L. pneumophila, because the purpose of the present study was to explore the influence of the weather on the transmission of the pathogenic organism from the environment to the human population. The investigation of how weather influences the transmission dynamics within a cluster of LD cases was beyond our study objectives.
The results of the present study are in partial agreement with the preliminary results of Ricketts et al., who, through a case-crossover approach, suggest that there could be an association between the incidence of LD and RH [Reference Rickets20]. Understanding the influence of weather on the incidence of LD could help clarify the underlying mechanism that resulted in such an increased number of sporadic cases of LD in The Netherlands in 2006 and could help predict the impact of new outbreaks. Blatny et al. suggested that the highest concentrations of L. pneumophila in air samples close to a known source in Norway were measured during cloudy and not very hot days [Reference Blatny21].
Specific weather variables can be used to better understand the underlying association between weather patterns and the incidence of LD. The possible transition towards a warmer climate with fewer days with precipitation but with more abrupt changes and heavier rainfall, as described in other science fields, could mean that the epidemiological profile of some diseases is affected by the weather changes [Reference Campbell-Lendrum, Corvalán and Neira22]. However, the underlying biological mechanisms of the associations between weather and LD incidence still remain unexplained. Additional research in this field could help provide more evidence for the biological plausibility of ‘warm, wet weather’ being associated with more LD cases.
ACKNOWLEDGEMENTS
The authors thank Barbara Schimmer and Susan Hahné, epidemiologists at the Department of Epidemiology and Surveillance of the National Institute for Public Health and the Environment (RIVM), and Esther Kissling, EPIET fellow at the Institute of Public Health in Brussels, Belgium, for their support and provision of useful comments throughout the study.
DECLARATION OF INTEREST
None.