INTRODUCTION
Noroviruses are a major cause of gastroenteritis [Reference Moreno-Espinosa, Farkas and Jiang1–Reference Glass, Parashar and Estes3]. There are an estimated 19–21 million norovirus cases and 56 000–71 000 hospitalizations in the United States per year [Reference Lopman4–Reference Hall6]. Noroviruses frequently result in outbreaks and cause ~50% of all epidemic gastroenteritis worldwide [Reference Widdowson, Monroe and Glass7]. Noroviruses are also the most common cause of foodborne disease outbreaks in the United States [Reference Hall8]. Outbreaks affect all age groups and commonly occur in nursing homes, hospital wards, daycare centres, schools, cruise ships and restaurants. The elderly, young children, travellers, and the immunocompromised are most vulnerable to higher incidence or severe outcomes from infection [Reference Glass, Parashar and Estes3]. With the decrease in severe rotavirus infections in children as a result of rotavirus vaccination, norovirus is now the most frequent cause of paediatric gastroenteritis requiring medical attention in the United States [Reference Payne9]. Globally, norovirus is estimated to account for around 18% of both community- or clinic-based gastroenteritis cases and emergency department- or hospital-based cases [Reference Patel10, Reference Ahmed11].
Despite the high frequency of outbreaks and the substantial public health burden, there are still important gaps in our understanding of fundamental aspects of norovirus infection and transmission dynamics. One gap is the dearth of studies based on mathematical and computational models that can be used to evaluate the effectiveness of specific interventions. Such modelling studies have been successfully applied to other infectious diseases such as measles, influenza, HIV, malaria, and tuberculosis [Reference Grassly and Fraser12–Reference Lavine, Poss and Grenfell15]. In anticipation of norovirus vaccines, which are currently in the pipeline [Reference Atmar16], the development of realistic mathematical models that can help to better understand transmission dynamics and the impact of interventions such as vaccination would be important.
Reliable mathematical models require accurate estimates for parameters governing the natural history of the disease and its infection and transmission dynamics. While such values are often reported in the literature, they are not always based on hard, quantitative and reliable estimates [Reference Reich17]. Systematic studies that determine estimates for parameters based on multiple sources of data are useful [Reference Lessler18–Reference Lee20]. For norovirus, accurate estimates for its incubation period have previously been established [Reference Lee20]. Here, we set out to also estimate the duration of the symptomatic period. In addition, we analyse whether the average incubation and symptomatic periods are associated with certain host, agent and environmental characteristics.
METHODS
Data collection
Data on the duration of the incubation and symptomatic periods of norovirus gastroenteritis were abstracted as reported in detail elsewhere [Reference Desai21, Reference Matthews22]. Briefly, published reports of human norovirus outbreaks with norovirus presence in stool confirmed with RT–PCR were systematically collected. Data from all of these outbreaks were abstracted according to as many as 74 different variables, including genotype, outbreak setting, suspected route of transmission, number of cases, and at-risk population size. Detailed descriptions of the dataset and our previous analyses are available in [Reference Desai21, Reference Matthews22]. We have since continued to update the dataset by adding further outbreaks following the same approach as described for the original dataset. The current number of catalogued outbreaks in our dataset is 1022; outbreak years are from 1983 to 2010. The dataset contains minimum, maximum, median, and mean values for the incubation and symptomatic periods for each outbreak where it was reported. These are the main outcomes of interest for this analysis. Most studies reported these periods in hours. When these periods were reported in days, we converted to hours by multiplying the periods in days by 24. This approximation likely leads to an unavoidable increase in values that are multiples of 12 h, since most studies reporting in days rounded to the closest half-day.
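The unit conversion described above, and the clustering on multiples of 12 h that it can produce, can be illustrated with a minimal Python sketch. The function names and the example reports are hypothetical and are not taken from the dataset.

```python
def to_hours(value, unit):
    """Convert a reported duration to hours; unit is 'h' (hours) or 'd' (days)."""
    if unit == "h":
        return float(value)
    if unit == "d":
        return float(value) * 24.0
    raise ValueError(f"unknown unit: {unit!r}")


def share_multiples_of_12(hours):
    """Fraction of values that fall exactly on a multiple of 12 h."""
    return sum(1 for h in hours if h % 12 == 0) / len(hours)


# Hypothetical reports as (value, unit) pairs; illustrative only.
reports = [(33, "h"), (1.5, "d"), (2, "d"), (29, "h"), (36, "h")]
hours = [to_hours(v, u) for v, u in reports]
heaping = share_multiples_of_12(hours)
```

Durations reported in whole or half days always land on a multiple of 12 h after conversion, which is why such values are over-represented.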
Data analysis
Confidence intervals (CIs) for incubation and symptomatic periods were computed by resampling the data 100 000 times via non-parametric bootstrapping [Reference Efron and Tibshirani23]. For investigation of continuous predictors, we fitted linear models. For categorical predictors, we computed 95% CIs through non-parametric bootstrapping. The absence of overlap in the CIs provides a conservative measure of statistically significant differences [Reference Schenker and Gentleman24]. Visual inspection of the data suggested that tests based on a parametric assumption of normality could also be justified. We therefore also used parametric tests of significance, namely t tests (for two groups) or ANOVA (for multiple groups), which are more sensitive than CI overlap for detecting potential differences between groups [Reference Schenker and Gentleman24]. As shown in the Results section, the non-parametric CIs and the direct parametric analyses led to very similar results. A final regression model with all predictors was also fitted [Reference Hastie, Tibshirani and Friedman25]. All analyses were done in R version 3.0·1 [26] using additional functionality from the package boot.
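The bootstrap procedure can be sketched as follows. This is a minimal illustration in Python rather than the R code used for the analysis; the text does not state which type of bootstrap interval was computed, so the percentile method is assumed here, and the data values are hypothetical.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=10_000,
                 level=0.95, seed=1):
    """Percentile bootstrap CI for stat(values) (assumed interval type)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(stat(rng.choices(values, k=n))
                       for _ in range(n_resamples))
    alpha = (1.0 - level) / 2.0
    lo_idx = int(alpha * n_resamples)
    hi_idx = min(int((1.0 - alpha) * n_resamples), n_resamples - 1)
    return estimates[lo_idx], estimates[hi_idx]


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical average incubation periods (hours) from eight outbreaks.
data = [24, 30, 33, 36, 28, 40, 34, 31]
ci_mean = bootstrap_ci(data, statistics.mean, n_resamples=2_000)
ci_median = bootstrap_ci(data, statistics.median, n_resamples=2_000)
```

Passing `statistics.median` instead of `statistics.mean` gives the interval for the median. Calling `intervals_overlap` on the CIs of two groups gives the conservative comparison described above.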
RESULTS
Distribution of the incubation period
Of the 1022 outbreaks in our dataset, 73 reported a minimum value for the incubation period, 71 a maximum value, 48 a median and 21 a mean value (five reported both mean and median). We investigated the distribution of the reported values for the mean and median and found them to be rather similar (data not shown). We therefore decided to not distinguish between mean and median in subsequent analyses and pool them. We refer to those pooled values by the generic term ‘average’. For those five outbreaks where both mean and median were reported, we arbitrarily used the mean value. Our overall results do not change if the median is used instead (data not shown).
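The pooling rule can be written compactly. The sketch below is an illustration in Python with hypothetical records and function names, not the code used for the analysis.

```python
from typing import Optional


def pooled_average(mean: Optional[float],
                   median: Optional[float]) -> Optional[float]:
    """Return the mean if reported, otherwise the median, otherwise None."""
    if mean is not None:
        return mean
    return median


# Hypothetical outbreak records as (reported mean, reported median) in hours.
records = [(34.0, None), (None, 31.0), (36.0, 33.0), (None, None)]
averages = [pooled_average(m, md) for m, md in records]
# Outbreaks that report neither value drop out of the analysis.
usable = [a for a in averages if a is not None]
```

Swapping the order of preference (median first) corresponds to the sensitivity check mentioned above.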
Figure 1a shows the distributions of minimum, maximum and mean/median values of the incubation period. For 51 outbreaks in our dataset, we had complete information for both minimum and maximum duration of the incubation period, as well as either mean or median of the incubation period. Values for those outbreaks are shown in Figure 1b.
The mean (95% CI) across all outbreaks for the average (mean or median) duration of the incubation period was 32·8 h (30·9–34·6 h). The median (95% CI) was 33·5 h (32·0–34·0 h).
The mean (95% CI) across all outbreaks for the reported minimum duration of the incubation period was 14·9 h (12·8–17·0 h). The median (95% CI) was 14·5 h (12·0–18·0 h).
The mean (95% CI) across all outbreaks for the maximum duration of the incubation period was 57·2 h (52·2–62·9 h). The median (95% CI) was 54·0 h (48·0–60·0 h).
Distribution of the symptomatic period
Of the outbreaks in our dataset, 90 reported a minimum value for the symptomatic period, 90 a maximum value, 59 a median and 31 a mean value (three reported both mean and median). As done for the incubation period, we again pooled mean and median and, if both values were present, used the mean. For one outbreak, the reported maximum duration of the symptomatic period was 1248 h (52 days), several times larger than the second highest value of 384 h. We checked the original reference, which indeed reported 52 days as the upper range without further comment [Reference O'Reilly27]. While long-term symptoms and shedding have been reported [Reference Siebenga28] and are likely an important driver of transmission in some settings, we decided to treat this value as an outlier and remove it for the purpose of computing the mean and 95% CI of the maximum duration reported below.
Figure 1c shows the distributions of minimum, maximum and mean/median values of the symptomatic period. For 68 outbreaks in our dataset, we had complete information for minimum, maximum and either mean or median of the symptomatic period. Values for those outbreaks are shown in Figure 1d.
The mean (95% CI) across all outbreaks for the average (mean or median) duration of the symptomatic period was 44·7 h (39·0–52·1 h). The median (95% CI) was 43·0 h (36·0–48·0 h).
The mean (95% CI) across all outbreaks for the reported minimum duration of the symptomatic period was 16·9 h (14·1–19·8 h). The median (95% CI) was 17·0 h (11·0–24·0 h).
The mean (95% CI) across all outbreaks for the maximum duration of the symptomatic period was 130·4 h (114·0–147·9 h). The median (95% CI) was 105·5 h (96·0–127·5 h).
Association of incubation period with host, pathogen and environmental characteristics
For the 64 outbreaks for which we have information on either mean or median duration of the incubation period, we analysed whether the duration of the incubation period was associated with predictors of interest in our dataset. We found no statistically significant association with any of the following predictors: healthcare setting, food service setting, hemisphere, season of outbreak, virus genotype, presence of other pathogens, or mode of transmission (Table 1). Analysis of outbreaks that explicitly reported vomiting vs. those that did not mention vomiting suggested differences between incubation periods that were marginally significant (P = 0·04, Table 1). However, only five outbreaks did not report vomiting, and lack of reporting vomiting does not necessarily indicate that it did not occur. Therefore, this difference may not be meaningful. A linear regression analysis of the impact of average age on incubation period for those outbreaks that reported both variables (N = 32 outbreaks) suggested that there was no significant variability based on age (P = 0·467). A linear regression between the proportion of infected (i.e. attack ‘rate’) and incubation period also found no correlation (N = 51, P = 0·644). A final linear regression between the incubation period and all predictor variables was performed on the N = 24 entries for which information for all predictors was available. The only significant predictor for this analysis with all predictors included was age (P = 0·04), despite age not showing a statistically significant association in a univariate analysis.
CI, Confidence interval; n.a., not available.
* Indicates statistical significance at the <5% level for different categories within groups based on t test or ANOVA.
† If we assume that outbreaks for which no information was given correspond to absence of other pathogen, this category has N = 52 (incubation period) and N = 72 (symptomatic period), again no significant difference compared to presence of other pathogen.
‡ Main route of transmission if multiple routes were indicated.
§ t test between foodborne and environmental transmission was not significant. Person-to-person transmission not tested since only one value is available.
Background shading indicates grouping of predictor variables for statistical analysis.
Association of symptomatic period with host, pathogen and environmental characteristics
We repeated the analysis performed in the previous section for the 87 outbreaks for which information on either mean or median duration of the symptomatic period was available. We found no statistically significant association between the duration of the symptomatic period and reported vomiting, healthcare setting, food service setting, season of outbreak, virus genotype, or presence of other pathogens (Table 1). Grouping according to healthcare setting and food service setting showed noticeable differences, but those did not reach the 5% significance level. Similarly, outbreaks caused by the GII.4 strain had noticeably longer symptomatic periods, but again this did not reach statistical significance. A statistically significant association between main mode of transmission and duration of symptomatic period was found (P = 0·003). Foodborne transmission was associated with a shorter symptomatic period compared to other modes of transmission (Table 1). Hemisphere also showed a small statistically significant difference, with shorter symptomatic periods in the Southern hemisphere (P = 0·02, Table 1). A linear regression analysis of the impact of average age on symptomatic period for those outbreaks that reported both quantities (N = 45 outbreaks) suggested that there was no significant variability based on age (P = 0·131). A linear regression between proportion of infected (i.e. attack ‘rate’) and symptomatic period also found no correlation (N = 72, P = 0·316). A final linear regression between the symptomatic period and all predictor variables was performed on the N = 26 entries for which information for all predictors was available. None of the predictors was significant.
DISCUSSION
Based on the analysis of an abstracted dataset of norovirus outbreaks, we reported the average of the incubation period for norovirus to be ~33 h. This is similar to another recent estimate of ~29 h [Reference Lee20]. Our slightly larger value might be due to the fact that we used summary (median or mean) values reported in the original articles, rather than individual-level data, so some of our data may have been skewed by outlying individual values. Moreover, we did not try to adjust for potential censoring and inexact reporting, as was done in [Reference Lee20]. It is reassuring that despite those differences, the estimates were rather close.
The average minimum and maximum durations of the incubation period across outbreaks were found to be about 15 h and 55 h, respectively. This is a somewhat tighter range than the values of about 10 h and 72 h at which 5% and 95% of individuals are estimated to report symptoms following infection [Reference Lee20]. Given the different methodology (individual patient data vs. comparison across outbreaks) and different values that were measured, complete agreement was not expected, but the values are again similar.
For the average symptomatic period, we reported an estimate of ~44 h. The average minimum and maximum durations of the symptomatic period across outbreaks were found to be about 17 h and 120 h, respectively. This rather wide range between the minimum and maximum symptomatic period highlights that many individuals recover quickly, while a small group may be ill for longer, ~5 days, based on our synthesis. Quantifying this range is important for detailed, individual-based computational models that consider the individual-level variation.
Mathematical or computational modelling studies need reliable estimates for important parameters such as the duration of the incubation and symptomatic periods (which often, but not always, can be assumed to coincide with the latent and infectious periods). Further, it is important to know how these parameters might depend on the situation to which the model is applied. For instance, influenza illness lasts longer in children than in adults [Reference Nicholson, Wood and Zambon29]; therefore, depending on which population a model describes, the parameters need to be chosen appropriately. There is some evidence that the duration of norovirus gastroenteritis is longer for young children [Reference Rockx30] and patients affected in healthcare outbreaks [Reference Lopman31]. Therefore, we investigated the variability of the incubation and symptomatic periods based on host, agent and environment characteristics. We found little association between the average incubation or symptomatic periods and any of the host, pathogen and environmental characteristics we considered.
Only a few predictors led to marginally statistically significant differences. For the incubation period, the reported presence or absence of vomiting was found to have a statistically significant association. However, only five outbreaks did not report vomiting, and we do not know if lack of such reporting properly indicated absence of vomiting. It is therefore unclear whether this statistically significant difference is meaningful.
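As an illustration of how such estimates might enter a transmission model, the Python sketch below converts the mean incubation and symptomatic periods reported here (32·8 h and 44·7 h) into rates for a simple SEIR model. It assumes that the latent and infectious periods coincide with the incubation and symptomatic periods, which may not hold for norovirus. The basic reproduction number, population size, time horizon and Euler step are hypothetical choices made only for this example.

```python
def seir(incubation_h=32.8, symptomatic_h=44.7, r0=2.0,
         n=1000.0, i0=1.0, days=60.0, dt=0.01):
    """Euler-integrated SEIR model; returns final (S, E, I, R) counts.

    r0, n, i0, days and dt are hypothetical values for illustration.
    """
    sigma = 24.0 / incubation_h   # E -> I rate per day (latent = incubation)
    gamma = 24.0 / symptomatic_h  # I -> R rate per day (infectious = symptomatic)
    beta = r0 * gamma             # transmission rate implied by assumed R0
    s, e, i, r = n - i0, 0.0, i0, 0.0
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / n * dt
        new_symptomatic = sigma * e * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        e += new_infections - new_symptomatic
        i += new_symptomatic - new_recoveries
        r += new_recoveries
    return s, e, i, r


s, e, i, r = seir()
```

Replacing the default durations with setting-specific values would be one way to explore how sensitive model output is to these parameters.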
While age did not show a significant association with duration of the incubation period in a univariate analysis, it was the only significant predictor in a multiple regression model, although the significance was marginal. This difference in result might be due to the fact that for the multiple regression analysis, we only included the subset of outbreaks for which information on all predictors was available (N = 24), while the univariate analysis of age was performed on 32 outbreaks. The finding that a shorter incubation period is associated with younger age is not surprising. The caveat to this finding, as to all of our results, is that the unit of analysis is not individual patients but outbreaks.
For the symptomatic period, an outbreak occurring in the Northern hemisphere was associated with a longer duration of symptoms; however, the significance was marginal and we cannot think of a biological reason that would support this statistical finding. Given that we performed a number of comparisons here, and did not correct for multiple comparisons, it is expected that a chance significant difference at the 5% level occasionally occurs.
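The size of this effect can be illustrated with a short calculation. Assuming m independent tests, each at significance level α, and no true differences, the probability of at least one chance finding and the expected number of chance findings are as follows. The value m = 10 is an illustrative choice, not a count of the tests performed here, and the tests in this analysis are not strictly independent.

```latex
\[
  P(\text{at least one false positive}) = 1 - (1 - \alpha)^{m},
  \qquad
  E[\text{number of false positives}] = m\,\alpha .
\]
\[
  \alpha = 0.05,\; m = 10:\qquad
  1 - 0.95^{10} \approx 0.40,
  \qquad
  m\,\alpha = 0.5 .
\]
```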
Foodborne transmission was associated with a significant reduction in the symptomatic period compared to other routes of transmission in a univariate analysis. However, this significance was not found in a multi-predictor analysis.
We did observe a longer, but non-significant, duration of the symptomatic period in vulnerable populations (i.e. those affected in healthcare settings). While this did not reach statistical significance for our dataset, it agrees with previous findings [Reference Lopman31] and might be worth further investigation.
Overall, we interpret our analysis of host, pathogen and environment factors to indicate that the average duration of incubation and symptomatic period is rather robust and varies little with changes in host, pathogen and environmental conditions.
The current analysis has several limitations. The most important is the fact that our unit of analysis is individual outbreaks and not individual persons. This means ecological fallacies might be present. For our particular study, it means that associations between the predictors and outcomes we analysed might exist at the individual level without being detectable in an analysis that uses outbreaks as the unit of analysis.
Further, our data might be biased owing to the fact that all outbreaks we analysed were published in the literature. This was clearly not a representative sample of all norovirus outbreaks, as evidenced by the fact that outbreaks in healthcare settings are a minority of the reported outbreaks, even though this is known to be the most common setting [Reference Hall8, Reference Lopman32]. Given that we did not find differences according to outbreak setting, this might not have biased the overall results. However, our sample size for some of the settings was small, so a definite conclusion cannot be drawn. Furthermore, only a fraction of the outbreaks reported information on the incubation and symptomatic periods. It could be that studies reporting such information are not representative of the whole dataset.
Another caveat to the results comes from potential rounding in the original studies. For instance, while values for the incubation period duration were usually reported in hours, often the numbers appeared as though they were rounded. Specifically, multiples of 12 h (e.g. values of 24, 36, 48 h) seemed to occur frequently, suggesting potential rounding to those numbers by the authors of the original reports. While this rounding likely occurred in a random fashion and is therefore unlikely to bias our estimates, it leads to some additional uncertainty in the precision of the estimates.
Despite these limitations, we believe that the estimation of the incubation and symptomatic periods and their ranges done here is a useful contribution towards our understanding of the dynamics of norovirus transmission and will be a useful input for future norovirus transmission models. Such models can be used to evaluate the potential of intervention strategies, including vaccination.
ACKNOWLEDGEMENTS
J.S.L. was partially supported by the National Institute of Allergy and Infectious Diseases at the National Institutes of Health (grant 1K01AI087724-01), the National Institute of Food and Agriculture at the U.S. Department of Agriculture (grant 2010-85212-20608), and the Emory University Global Health Institute. The findings and conclusions in this paper are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention.
DECLARATION OF INTEREST
None.