
Early detection of local SARS-CoV-2 outbreaks by wastewater surveillance: a feasibility study

Published online by Cambridge University Press:  01 February 2023

Maarten Nauta*
Affiliation:
Department of Infectious Disease Epidemiology & Prevention, Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark
Oliver McManus
Affiliation:
Department of Infectious Disease Epidemiology & Prevention, Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark; European Programme for Public Health Microbiology Training (EUPHEM), European Centre for Disease Prevention and Control (ECDC), Gustav III:s Boulevard 40, 16973 Solna, Sweden
Kristina Træholt Franck
Affiliation:
Department of Virus & Microbiological Special Diagnostics, Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark
Ellinor Lindberg Marving
Affiliation:
Department of Virus & Microbiological Special Diagnostics, Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark
Lasse Dam Rasmussen
Affiliation:
Department of Virus & Microbiological Special Diagnostics, Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark
Stine Raith Richter
Affiliation:
Department of Virus & Microbiological Special Diagnostics, Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark
Steen Ethelberg
Affiliation:
Department of Infectious Disease Epidemiology & Prevention, Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark; Department of Public Health, Global Health Section, University of Copenhagen, Øster Farimagsgade 5, 1014 København K, Denmark
*Author for correspondence: Maarten Nauta, E-mail: mjna@ssi.dk

Abstract

Wastewater surveillance and quantitative analysis of SARS-CoV-2 RNA are increasingly used to monitor the spread of COVID-19 in the community. We studied the feasibility of applying the surveillance data for early detection of local outbreaks. A Monte Carlo simulation model was constructed, applying data on reported variation in RNA gene copy concentration in faeces and faecal masses shed. It showed that, even with a constant number of SARS-CoV-2 RNA shedders, the variation in concentrations found in wastewater samples will be large, and that it will be challenging to translate viral concentrations into incidence estimates, especially when the number of shedders is low. Potential signals for early detection of hypothetical outbreaks were analysed for their performance in terms of sensitivity and specificity of the signals. The results suggest that a sudden increase in incidence is not easily identified on the basis of wastewater surveillance data, especially in small sampling areas and in low-incidence situations. However, with a high number of shedders and when combining data from multiple consecutive tests, the performance of wastewater sampling is expected to improve considerably. The developed modelling approach can increase our understanding of the results from wastewater surveillance of SARS-CoV-2.

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Introduction

Worldwide, wastewater-based epidemiology (WBE) is increasingly used as a tool to monitor the spread of COVID-19 in the community. The method has proven successful in describing epidemiological trends by identifying and quantifying viral RNA in wastewater samples [Reference Ahmed1–Reference Medema12]. Additionally, as increasing trends in wastewater may be found before those identified by individual testing, it has been proposed as a tool for early warning [Reference Medema13–Reference Wang17]. Especially when human testing is limited, it has the potential to predict an increase in the hospitalisation rate, allowing for rapid intervention against local spread of the virus [Reference Saguti18].

Before the start of the COVID-19 pandemic, WBE had been applied successfully for several purposes, including surveillance for poliovirus [Reference Patel19]. In a situation where the virus is not circulating in the population, such as in the case of the poliovirus, emergence of the virus can be signalled by wastewater sampling, and wastewater surveillance has proven to be a useful strategy for early detection and intervention [Reference Asghar20]. This is particularly useful when samples are obtained at a local scale, so that immediate and targeted action can be taken [Reference Prado6, Reference Saguti18]. Research at local or institutional scale (such as in university dormitories) shows that wastewater surveillance is a useful strategy when re-emergence of SARS-CoV-2 has to be detected at an early stage [Reference Harris-Lovett21, Reference Gibas22]. However, in the current situation, where the epidemic is ongoing and is expected to develop into an endemic disease [Reference Phillips23, Reference Zamir24], such elimination and re-emergence may not be a realistic scenario in catchment areas that cover populations of thousands of people or more. In a low-incidence situation, it may be more important to detect increases in incidence that indicate the start of a local outbreak, which should be targeted for a local intervention, before the outbreak spreads further.

For the analysis of wastewater sampling data, it is important to understand the relationship between the number of gene copies (as found by qPCR) and the number of infected people in the population. It has been estimated that between 40 and 67% of infected people shed the SARS-CoV-2 virus in their faeces [Reference Parasa25–Reference Xiao27], but the timing of faecal shedding remains largely unknown. Using a Monte Carlo simulation model describing the relation between the infection prevalence and the total number of SARS-CoV-2 RNA copies in wastewater, Ahmed et al. [Reference Ahmed1] estimated the number of infections based on Australian wastewater data. Medema et al. [Reference Medema12] established a similar theoretical relationship between the number of shedders and the expected virus concentration in wastewater. In their Monte Carlo simulation, they found that the uncertainty of the virus concentration estimated from the number of shedders is dominated by the variation in viral concentrations between people and is particularly large with low numbers of shedders. Whereas Medema et al. [Reference Medema12] assumed no decay of the RNA in the sewer signal, McMahan et al. [Reference McMahan28] incorporated the effects of viral decay over time in their model for viral concentrations in the sewer shed, and applied it in a susceptible-exposed-infectious-recovered model to describe the course of the epidemic.

In Denmark, wastewater surveillance for SARS-CoV-2 was set up in the course of 2021, with samples taken three times a week at 230 locations covering more than 85% of Danish addresses. Population sizes of the catchment areas range between 670 and 638 000 inhabitants (median 10 600). This specific surveillance programme has prompted the need to assess the possible application of wastewater surveillance for early detection of local outbreaks in a low-prevalence situation, which would allow fast local intervention to prevent a wider spread of the virus. To our knowledge, the feasibility of this specific application of wastewater surveillance data has not been studied previously.

In this paper, we therefore explored the possibility for setting up an early warning system for local SARS-CoV-2 outbreaks in a low-prevalence situation, by developing a Monte Carlo simulation model of viral shedding in a wastewater system and analysis of wastewater sampling. As outbreaks are characterised by an increase in the number of infected people, the aim of the modelling was to evaluate the performance of different potential signals for an increase in incidence, based on measured RNA copy counts in two consecutive periods. The results of the modelling may pave the way for the implementation of a systematic routine calculation of potential signals from a wastewater surveillance programme. Although the Danish surveillance programme inspired our study, we used a generic approach for which the conclusions should be valid internationally.

Methods

Model description

Based on [Reference Medema12], the relation between RNA copy concentration in wastewater and the number of people shedding the virus in the wastewater system (shedders) can be given as:

$$C_{ww} = \frac{\sum_{i=1}^{N} fl_i \, C_{faeces,i}}{Q} \tag{1}$$

where Cww is the concentration in the wastewater (gene copies (gc)/l), N is the number of infected people shedding the virus, fli is the amount of faeces shed by infected individual i (g faeces per person per day), Cfaeces,i is the number of gene copies per gram of faecal matter shed by infected individual i (gc/g faeces) and Q is the daily water flow to the sewer (l per day).

It follows that the change in log concentration in the wastewater over two measurements is

$$\Delta \log(C_{ww}) = \Delta \log\left(\sum_{i=1}^{N} fl_i \, C_{faeces,i}\right) - \Delta \log(Q) \tag{2}$$

which simplifies to:

$$\Delta \log(C_{ww}) = \Delta \log\left(\sum_{i=1}^{N} fl_i \, C_{faeces,i}\right) \tag{3}$$

if Q does not change between measurements at a specific sampling point.

Given that, when fli and Cfaeces,i are independent and identically distributed across shedders,

$$E\left(\sum_{i=1}^{N} fl_i \, C_{faeces,i}\right) = N \times E(fl_i) \times E(C_{faeces,i}) \tag{4}$$

it follows that the change in log concentration of gene copies in the wastewater, Δlog (Cww), is expected to be proportional to the change in the log number of shedders, Δlog (N), for large values of N. Yet, if the variation in Cfaeces,i and fli between shedders is large and N is small, this assumption of proportionality may not be justified. Therefore, published data on Cfaeces and fl were compared to obtain feasible distributions for these variables in the model.
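The step from equation (4) to proportionality can be spelled out as follows. This is a sketch rather than part of the original derivation: the symbols S_N and μ_S are introduced here for convenience, and the argument assumes that the products fl_i C_faeces,i are independent and identically distributed with a finite mean.

$$S_N = \sum_{i=1}^{N} fl_i \, C_{faeces,i}, \qquad \mu_S = E(fl_i) \times E(C_{faeces,i}), \qquad E(S_N) = N \mu_S$$

$$\frac{S_N}{N} \to \mu_S \ \text{ for large } N \quad \Rightarrow \quad \log(S_N) \approx \log(N) + \log(\mu_S) \quad \Rightarrow \quad \Delta \log(C_{ww}) \approx \Delta \log(N)$$

The constant log(μ_S) cancels in the difference, which is why only changes in N matter for large N. For small N, a single sum S_N can differ greatly from N μ_S, and the approximation breaks down.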

As qPCR analyses are not perfect, an additional source of variation is added to the values of log (Cww). This is implemented as ɛ ~ Normal(0, s PCR), where it is assumed that s PCR = 0.15 log10 units, equivalent to 0.5 Ct value in the qPCR [Reference Karlen29, Reference Forootan30].
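The conversion between Ct values and log10 units can be checked with a short calculation, assuming ideal amplification in which one qPCR cycle corresponds to one doubling of the target:

$$1\ \text{Ct} \;\widehat{=}\; \log_{10}(2) \approx 0.301\ \log_{10}\ \text{units}, \qquad 0.5\ \text{Ct} \;\widehat{=}\; 0.5 \times 0.301 \approx 0.15\ \log_{10}\ \text{units}$$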

Therefore, the change in concentration in the wastewater can be obtained from

(5)$$\Delta \log ( {C_{ww}} ) = \Delta \;{\rm log}( \mathop \sum \limits_{i = 1}^N fl_iC_{\,faeces, \;i}) + \varepsilon $$

This equation was implemented in a Monte Carlo simulation model, developed in R 4.0.4, where N values of fli and Cfaeces,i are sampled from the distributions specified under Model inputs (see also Supplementary Material). The model was used to illustrate the expected dynamics in the observed values of Cww and to explore the expected performance of potential signals that can be used to identify an increase in incidence. The incidence is assumed to be proportional to the number of infected people shedding the virus (N). In the simulations, viral concentrations in human faeces and the amount of faeces shed are assumed to be independent of each other and independent over time. Also, the catchment size is not explicitly included in the model; the wastewater samples are assumed to be taken from well-homogenised wastewater.
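The sampling step can be sketched in a few lines. The authors' model is written in R and is provided as Supplementary Material; the Python version below is only an illustration under stated assumptions. It uses the baseline distributions reported under Model inputs, sets the flow Q to 1 (a constant flow cancels in Δlog Cww), and the function and constant names are hypothetical.

```python
import math
import random
import statistics

# Baseline parameters as reported in the paper (log10 scale).
MU_FL, SD_FL = 2.11, 0.25   # log10 faecal mass, g per person per day
MU_C, SD_C = 6.0, 1.0       # log10 gene copies per g faeces
S_PCR = 0.15                # qPCR measurement error, log10 units


def simulate_log_load(n_shedders, rng, s_pcr=S_PCR):
    """One simulated measurement: log10 of the summed daily load plus qPCR noise.

    Assumption: Q = 1, so the result is a log10 load rather than a concentration.
    """
    total = 0.0
    for _ in range(n_shedders):
        log_fl = rng.gauss(MU_FL, SD_FL)   # faecal mass of shedder i
        log_c = rng.gauss(MU_C, SD_C)      # viral concentration in faeces of shedder i
        total += 10.0 ** (log_fl + log_c)
    return math.log10(total) + rng.gauss(0.0, s_pcr)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
for n in (3, 30, 300):
    draws = [simulate_log_load(n, rng) for _ in range(2000)]
    print(n, round(statistics.mean(draws), 2), round(statistics.stdev(draws), 2))
```

Each call draws a fresh set of N shedders, which mirrors the assumption that daily samples are independent over time.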

Potential signals

The model was used to assess how wastewater surveillance data may be used to signal a twofold, fourfold or tenfold increase in incidence between two consecutive periods. Based on the model, we explored two potential signals: (a) the difference in the mean of the log Cww found between two sets of consecutive samples; (b) the P value of a linear regression through two sets of consecutive samples.

In (a), the mean of k = 3 consecutive samples (1 week in the Danish surveillance programme) is compared with the mean of the next k = 3 consecutive samples (the next week). The difference between the two means of the log(Cww) is determined, and evaluated as a potential signal defined as an increase of more than D log units. As a twofold increase implies an increase of 0.3 logs, we evaluated D = 0, 0.3, 0.6, 0.9 and 1.2. The same analysis is done for k = 6, which may refer to the comparison of two consecutive 2-week periods.
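Signal (a) amounts to a comparison of two means against a threshold. The sketch below illustrates this with made-up log10 concentrations; the function name and the example values are hypothetical and not taken from the study. For reference, a twofold, fourfold and tenfold increase correspond to about 0.30, 0.60 and 1.0 log10 units.

```python
import statistics


def mean_difference_signal(first, second, d_threshold):
    """Return (difference in mean log10 concentration, whether it exceeds D)."""
    diff = statistics.mean(second) - statistics.mean(first)
    return diff, diff > d_threshold


# Hypothetical log10 concentrations for two consecutive weeks (k = 3 each).
week1 = [10.4, 10.9, 10.6]
week2 = [11.0, 11.3, 11.2]

diff, signal = mean_difference_signal(week1, week2, d_threshold=0.3)
print(round(diff, 3), signal)
```

With these illustrative values the increase exceeds D = 0.3 but not D = 0.6, which shows how the choice of D trades sensitivity against specificity.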

In (b), assuming k = 3 consecutive samples per week, a linear regression is performed through the 2 × 3 = 6 data points expressed as log concentrations. The P value associated with the slope of an increasing regression line being different from zero, which readily follows from the analysis, is used as a potential signal, defined as P < 0.05, P < 0.1 or P < 0.2. The same analysis is done for k = 6, which corresponds to the comparison of two consecutive 2-week periods.
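Signal (b) can be sketched with an ordinary least-squares fit through the 2k points. The code below is a minimal illustration, not the authors' implementation: it assumes equally spaced samples (the sample index is used as time), computes the two-sided P value of the slope from Student's t distribution by numerical integration, and uses hypothetical function names and example data.

```python
import math


def t_sf(t, df, steps=4000):
    """Upper-tail probability P(T > t) of Student's t for t >= 0 (Simpson's rule)."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 0.5 - area * h / 3)


def regression_signal(log_conc, alpha):
    """Return (slope, two-sided P value, signal) for equally spaced samples."""
    n = len(log_conc)
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(log_conc) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, log_conc))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, log_conc))
    if sse <= 0.0:
        p_value = 0.0  # perfect fit: treat the slope as maximally significant
    else:
        se_slope = math.sqrt(sse / (n - 2) / sxx)
        p_value = 2 * t_sf(abs(slope) / se_slope, n - 2)
    # Only increasing trends count as a signal.
    return slope, p_value, (slope > 0 and p_value < alpha)


# Hypothetical log10 concentrations: two consecutive sets of k = 3 samples.
samples = [10.4, 10.9, 10.6, 11.0, 11.3, 11.2]
slope, p_value, signal = regression_signal(samples, alpha=0.05)
print(round(slope, 3), round(p_value, 3), signal)
```

In practice a statistics package would supply the P value directly; the hand-written integration is used here only to keep the sketch free of dependencies.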

Performance was expressed as sensitivity and specificity of the signal, as obtained from the simulations. Here, the sensitivity is the expected relative frequency with which a signal is obtained given an increase in the number of shedders between the sets of samples. The specificity is the expected relative frequency with which no signal is obtained given that the number of shedders is unchanged. In the simulations, it was assumed that the number of shedders increased instantaneously from one week to the next, i.e. N shedders for the first k data points and 2N, 4N or 10N for the second k data points. This is a hypothetical scenario that should be identified by a potential signal.
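Sensitivity and specificity as defined here can be estimated by repeating the two-period comparison many times. The following sketch does this for signal (a) with the baseline distributions from the paper, an instantaneous fourfold increase from N = 100 and k = 6. It is an illustration only: the function names are hypothetical, Q is set to 1, and the small number of runs keeps the example fast at the cost of precision.

```python
import math
import random
import statistics


def log_measurement(n, rng):
    """Simulated log10 load for n shedders plus qPCR noise (baseline parameters)."""
    total = sum(10.0 ** (rng.gauss(2.11, 0.25) + rng.gauss(6.0, 1.0)) for _ in range(n))
    return math.log10(total) + rng.gauss(0.0, 0.15)


def signal_rate(n_before, n_after, k, d, runs, rng):
    """Fraction of runs in which the mean of the second k samples exceeds the first by > d."""
    hits = 0
    for _ in range(runs):
        first = statistics.mean(log_measurement(n_before, rng) for _ in range(k))
        second = statistics.mean(log_measurement(n_after, rng) for _ in range(k))
        hits += (second - first) > d
    return hits / runs


rng = random.Random(7)  # fixed seed for reproducibility
# Sensitivity: signal frequency when the number of shedders increases fourfold.
sensitivity = signal_rate(100, 400, k=6, d=0.3, runs=300, rng=rng)
# Specificity: frequency of no signal when the number of shedders is unchanged.
specificity = 1.0 - signal_rate(100, 100, k=6, d=0.3, runs=300, rng=rng)
print(round(sensitivity, 2), round(specificity, 2))
```

Varying `n_before`, the fold increase, `k` and `d` reproduces the kind of scenario grid that the signals are evaluated on.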

Results

Model inputs

Tables 1 and 2 show values and distributions that have been reported for the concentration of viral RNA copies in the faeces, Cfaeces, and the faecal mass shed per day, fl. They show that most authors have used the data presented by [Reference Wölfel31] (on Cfaeces) and [Reference Rose32] (on fl). Based on these data, we use the following lognormal distributions for the two parameters, as baseline in our analyses:

$$\log_{10}(fl_i) \sim \mathrm{Normal}(2.11,\ 0.25) \tag{6}$$
$$\log_{10}(C_{faeces,i}) \sim \mathrm{Normal}(6,\ 1) \tag{7}$$

Table 1. Reported distributions of the concentration of RNA copies (gene copies, gc) in human faeces (Cfaeces)

LoD, limit of detection.

Table 2. Reported distributions of the daily faecal mass shed by humans (fl)

These distributions are in line with what has been used by others, and have the advantage that the log10 values are normally distributed, so some basic statistics apply. Note that the precise mean values, 2.11 and 6, are not important for our approach, as we focus on the change in log concentrations in the wastewater Δlog Cww. Critical values for our analyses are the standard deviations as given in the equations above.

Simulation of the number of shedders and the concentration in the wastewater

First, the simulation model was run to explore the variation in Cww as a function of N. Figure 1a presents an example of a simulation of a series of 40 measurements over time where the number of shedders is held constant. This illustrates that, on average, the gene copy counts will be higher with a larger number of shedders, and also that the variation in gene copy counts will be substantial. With 100 000 iterations of the simulation model, for N = 3, 30 and 300 shedders, means of log (Cww) are 9.1, 10.6 and 11.8 and standard deviations 0.74, 0.39 and 0.24, respectively. The 95% probability intervals obtained from the 2.5 and 97.5 percentiles are 7.7–10.6, 10.0–11.5 and 11.3–12.3, respectively. Hence, the feasible ranges of observed concentrations overlap, despite the tenfold differences in numbers of shedders. This suggests that individual measurements are unreliable as indicators for an increase in the number of shedders, especially when the number of shedders is low. Figure 1b shows that the mean of log (Cww) of three samples performs better as an indicator of the number of shedders. The mean log (Cww) values are the same and the variation is still considerable (standard deviations 0.43, 0.23 and 0.14 with N = 3, 30 and 300 shedders, respectively), especially when the number of shedders is low. However, the 95% probability intervals (8.3–10.0, 10.2–11.1 and 11.5–12.1, respectively) no longer overlap.
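The reduction in spread when averaging three samples is what the standard error of a mean of independent measurements predicts. As a quick check, with σ the reported standard deviation of a single measurement and k = 3:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{k}}: \qquad \frac{0.74}{\sqrt{3}} \approx 0.43, \qquad \frac{0.39}{\sqrt{3}} \approx 0.23, \qquad \frac{0.24}{\sqrt{3}} \approx 0.14$$

These values agree, to within rounding, with the simulated standard deviations of the three-sample means, which is consistent with the assumption that consecutive samples are independent.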

Fig. 1. Example of the variation in the observed viral concentration in wastewater Cww (log gene copies per litre per day) due to random variation in the shedding of virus RNA in a simulation with N = 3 (circles), N = 30 (crosses) and N = 300 (triangles) shedders. (a) Forty consecutive single samples. (b) Consecutive means of independent sets of three samples. The horizontal axis can be taken to represent time, for example, daily independent measurements.

Figure 2 illustrates the relation that was obtained between log (N) and log (Cww) and is very similar to the one published by [Reference Medema12]. It confirms that the variation between measurements is expected to be large. It also shows that the relation between the mean values of log (N) and log (Cww) is not linear when the number of shedders is low, due to the nature of the lognormal distribution [Reference Fenton33].

Fig. 2. The simulated relation between the number of shedders N and the gene copy concentration in the wastewater Cww (median, 5% and 95% percentiles). Note that both are expressed on a log scale.
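The non-linearity at low N can be made concrete with the baseline distributions. The following is a sketch that ignores the qPCR error and the flow Q; X denotes the daily load of a single shedder, and μ and σ² are the mean and variance of log10 X, a notation introduced here.

$$\log_{10} X = \log_{10}(fl_i) + \log_{10}(C_{faeces,i}) \sim \mathrm{Normal}(\mu, \sigma), \qquad \mu = 2.11 + 6 = 8.11, \qquad \sigma^2 = 0.25^2 + 1^2 = 1.0625$$

$$\mathrm{median}(X) = 10^{\mu} = 10^{8.11}, \qquad E(X) = 10^{\mu + \frac{\ln 10}{2}\sigma^2} \approx 10^{8.11 + 1.22} = 10^{9.33}$$

For a single shedder the typical log load is therefore about 8.1, whereas for large N the log of the summed load approaches log10(N) + 9.33, for example 2.48 + 9.33 ≈ 11.8 for N = 300, in line with the simulated mean. The offset thus grows by up to about 1.2 log units as N increases, so the curve is steeper than a straight line with slope 1 when N is small.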

Next, the simulation model was used to explore the performance of potential signals by analysis of the frequency of signals without a change in the number of shedders N, and with a twofold, fourfold and tenfold increase of N. Results are presented in Figure 3, which shows that the performance for all signals is poor for the detection of a twofold increase in N, but progressively better for the detection of a fourfold and tenfold increase, especially when two 2-week periods (k = 6) are compared. With a fourfold increase, the best performance is from the signal D > 0.3 log, with initially N = 1000 shedders. For a tenfold increase, the D > 0.6 signal performs best, with (almost) 100% sensitivity and specificity with initially N = 1000 shedders. In general, a higher number of shedders N increases the performance of signals, especially if the D value is smaller than the log increase in N. With an initial number of N = 10 shedders, the only signal with sensitivity and specificity >95% is D > 0.6 log with a tenfold increase in shedders and k = 6.

Fig. 3. Simulated sensitivity and specificity of potential signals in six scenarios comparing a two- (a, d), four- (b, e) and tenfold (c, f) increase of the number of shedders between two sets of k = 3 (a, b, c) and k = 6 (d, e, f) samples. Axes correspond to those used for ROC (receiver operating characteristic) curves; only results with sensitivity >50% and specificity >75% are shown. Circles show results for signals based on a difference of means (D), crosses for signals based on linear regression. Open circles/small crosses: N = 10; shaded circles/medium crosses: N = 100; closed circles/large crosses: N = 1000.

Interestingly, with the linear regression method, the specificity of the signal is one minus half the P value: for P < 0.05, the specificity is 0.975, for P < 0.1 it is 0.95, etc., because we only look at increasing trends. Here an increased number of shedders always increases the sensitivity of the method.
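The relation between the P value threshold and the specificity can be written out as follows. This is a sketch which assumes that, with an unchanged number of shedders, the two-sided P value is uniformly distributed and positive and negative slopes are equally likely; α denotes the P value threshold.

$$1 - Sp = P(\text{slope} > 0 \ \text{and} \ P < \alpha \mid \text{no change in } N) = \frac{1}{2} \times \alpha \quad \Rightarrow \quad Sp = 1 - \frac{\alpha}{2}$$

For α = 0.05, 0.1 and 0.2 this gives Sp = 0.975, 0.95 and 0.90.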

Impact of standard deviations

The performance of potential signals was not affected by the mean values for log (fli) and log (Cfaeces,i), but was influenced by the standard deviations. This is illustrated in Figure 4, which shows the performance of potential signals for values of the standard deviation of log (Cfaeces,i), σfaeces = 0.5, σfaeces = 1 and σfaeces = 1.5, as well as for a larger inherent variation of the qPCR, s PCR = 0.3 log10 units (equivalent to 1 Ct value in the qPCR), for a fourfold increase in N and k = 6. Similar results are obtained in scenarios with a two- or tenfold increase of N and/or k = 3 (results not shown). These results indicate that performance increases with a lower standard deviation and decreases with a larger one.

Fig. 4. Simulated sensitivity and specificity of potential signals in the scenario with fourfold increase of the number of shedders and two sets of k = 6 samples with σfaeces = 0.5 (a), σfaeces = 1 (b), σfaeces = 1.5 (c), s PCR = 0.3 (d) and other values as in the baseline. Only results with sensitivity >50% and specificity >75% are shown. Symbols and lines are identical to those in Figure 3.

Discussion

In this study, we have modelled the expected viral concentrations obtained from RT-qPCR measurements of SARS-CoV-2 in community wastewater samples, based on published studies on excretion rates of virus in faeces. Our simulations show that a large variation in the viral concentration per gram of faeces between infected individuals will result in a large variability in the concentrations found in wastewater, especially when the number of shedders is low. As an example, our results show that in a hypothetical catchment area with 10 000 inhabitants and 30 persons shedding the virus daily, the expected variation in subsequent measurements of the virus concentration is large, such that 95% of the values fall within a range of 1.6 log units. This range decreases to less than 1 log unit with 300 shedders, but note that this would imply a COVID-19 prevalence larger than 3%, given that not all infected people shed the virus in their faeces. This result suggests that it will be difficult to reliably identify an increase in incidence based on wastewater surveillance data, especially if it is based on a comparison of single wastewater samples in a low-incidence situation. However, the simulations also show that with a high number of shedders and when using the mean result of a number of consecutive tests, the performance of wastewater sampling improves considerably. As the variation depends on the absolute number of shedders rather than the percentage of shedders in a catchment area, with equal incidence, the reliability of signals will be larger in large population areas than in small ones. However, sudden four- or tenfold increases in incidence may be less likely in large population areas.

Based on the need to define signals for early warning in local outbreaks, when setting up wastewater analysis as a new surveillance tool for SARS-CoV-2, the performance of potential signals was explored. Several hypothetical scenarios were defined. It was assumed that a two-, four- or tenfold increase in the number of shedders indicates a sudden increase, which is typical for an (early) outbreak or superspreading event, and should be identified by a signal. Additionally, the signal should be identified in a relatively short timeframe, to allow quick action by public health authorities. We imagined a situation where three wastewater samples are taken per week. The signal was therefore based on k = 3 and k = 6 samples, where the current 1- or 2-week period was compared with the preceding period. To identify a fourfold increase in the number of shedders, the results show that the signal-detection performs best with a signal based on D > 0.3 log increase in the mean number of gene copies. For a tenfold increase this is D > 0.6. As expected, test performance is best if the number of shedders (N) and the numbers of samples compared (k) are large.

Our simulation modelling approach is in many ways similar to that applied by others [Reference Ahmed1, Reference Medema12, Reference McMahan28, Reference Curtis34]. However, by considering the log increase in concentrations and in the number of shedders over time, we did not need to describe the absolute number of gene copies or the difference in water flow between specific wastewater treatment plants, which may be sensitive to unique characteristics of the sampling method and the wastewater treatment plant. At the same time, our approach gives the possibility to identify potential outbreaks by detecting instantaneous increases in the number of shedders. We specifically applied the model to study the performance of potential signals for the detection of early outbreaks at a local scale. This is a highly relevant application of wastewater surveillance of SARS-CoV-2, as COVID-19 is expected to stay endemic, and the detection of re-emergence of the virus may not be the main purpose of the surveillance.

Our results suggest that it will be challenging to apply wastewater surveillance to detect early-stage outbreaks in a low-incidence situation at a local scale. These results may seem to contrast with the many promising findings in relation to wastewater surveillance during the COVID-19 pandemic [Reference Fernandez-Cassi2, Reference Peccia5, Reference Prado6, Reference Farkas11, Reference Bibby14, Reference Saguti18, Reference Larsen35]. However, most of these authors refer to a situation where the incidence is high and/or populations contributing to the collected wastewater are large. On the other hand, studies at institutional scale typically involve smaller populations than those referred to in our study [Reference Wang17, Reference Harris-Lovett21, Reference Gibas22]. In these cases, wastewater surveillance has been shown to be an effective tool to detect the re-emergence of the virus after it had been eliminated. Our specific purpose, however, was to identify increases in prevalence in a low-prevalence situation, and not the detection of re-emergence, in populations where this could be relevant, i.e. those that fall between the small populations of concern at institutional scale and the large populations considered in many other studies. The challenge of the analysis of wastewater data from smaller communities in low-incidence situations has been addressed by [Reference D'Aoust36], who indicate that, in such situations, the high day-to-day variance is a key challenge for the interpretation of wastewater surveillance data. Others [Reference Hewitt37] found that, with 10 individuals shedding SARS-CoV-2 in a catchment of 100 000 individuals, there was a high likelihood of detecting viral RNA in wastewater.

For illustration, we show the performance of a few potential signals in some very specific scenarios, where the increase of the number of shedders occurs instantly. These scenarios are not realistic, as increases would often be gradual, and not exactly between two periods in which measurements are taken. The scenarios can be considered examples for which signals are most easily identified, and therefore the performance estimates are probably too optimistic. Still, superspreading events with sudden strong increases in prevalence may occur as well [Reference Lewis38]. For these events, which may have been driving the COVID-19 pandemic [Reference Chen39], wastewater surveillance is expected to give clear signals.

The performance measures ‘sensitivity’ and ‘specificity’ refer to the expected rate of true positives and true negatives and provide the probability of a correct test result given the occurrence (or not) of an outbreak, defined as a two-, four- or tenfold increase in the number of shedders. For a decision maker who is mostly concerned about taking unnecessary action, it will be more important to know the positive predictive value, i.e. the probability that an outbreak is occurring, given that a signal is obtained. As explained in Appendix A, this probability depends on the rate at which outbreaks occur. If it is low, as in an endemic situation with low prevalence, the probability that a signal correctly identifies an outbreak is expected to be low, even if sensitivity and specificity are high.

Several simplifying assumptions have been made in our modelling approach. Like other authors [Reference Ahmed1, Reference Medema12, Reference McMahan28], we assume that the daily wastewater sample results reflect the daily shedding of the virus in a homogeneously mixed wastewater system and that daily samples of the faecal mass (fli) and the viral concentration in the faeces (Cfaeces,i) are independent. Additionally, we assume a proportionality of incidence and number of shedders and do not include the decline in viral concentration in the faeces that is observed over time [Reference McMahan28, Reference Wölfel31, Reference Hoffmann and Alsing40, Reference Miura, Kitajima and Omori41]. The assumption that changes in concentration over time reflect the changes in incidence may not be correct, especially when the incidence is decreasing and the shedding of virus continues. However, as the signals are meant to identify increases rather than decreases, the importance of this potential shortcoming is expected to be limited [Reference Gerrity42].

The analysis of the impact of the standard deviations shows that the individual variation in the viral amounts being shed largely impacts the performance of signals, especially when the number of shedders is low. Although the available data [Reference Medema12, Reference Wölfel31] suggest that this variation is large, it is not well characterised. To our knowledge, it is not specifically known for symptomatic vs. asymptomatic people, nor for vaccinated vs. non-vaccinated people. It is also unknown whether there are differences between SARS-CoV-2 variants. Collection of that type of data would be very useful to predict the performance of wastewater surveillance. PCR measurement errors are included as a small error term, s PCR = 0.15, but additional background noise that is often found [Reference D'Aoust36] and other possible sources of pre-PCR error due to laboratory processing have not been included. The alternative analysis with s PCR = 0.3 illustrates how an increased measurement error also impacts the performance of signals. As our assumptions ignore several sources of variation in sampling and analysis of the RNA data, variability in sampling data is expected to be larger than in the model predictions, and therefore the model probably overestimates the performance of signals.

Conclusions

We used a simulation modelling approach to explore the performance of wastewater analysis-based surveillance for the detection of local SARS-CoV-2 outbreaks. Although many studies have shown that wastewater surveillance is highly promising and useful both for following trends in COVID-19 infection pressure and for early detection of re-emergence, the method does not seem particularly suitable for detection of local outbreaks in low-prevalence situations. Our study showed that the substantial inherent variance in viral gene copy concentrations shed by individuals infected with SARS-CoV-2 complicates this potential usage of the surveillance tool. More specifically, our model results suggest that, for example, in a situation with around 100 shedders of the SARS-CoV-2 virus, at least a fourfold increase of their number and two series of at least six consecutive samples would be needed to reliably obtain a signal (i.e. with more than 95% sensitivity and specificity). This requires intensive sampling, especially if a rapid identification of a local outbreak is required. Moreover, given the simplifying assumptions made in the analysis, such as the exclusion of several sources of variation from sampling and analysis of the RNA data, the obtained performance characteristics can be considered optimistic. As the performance of the surveillance decreases with decreasing population size and the probability of a correct signal decreases with decreasing prevalence, we do not expect to be able to perform an early identification of an outbreak at a local scale, based on wastewater surveillance.

With our analysis, we have shown that modelling can be a useful tool to increase our insight into the expected results from wastewater surveillance for SARS-CoV-2. With the large amount of data becoming available, the hypotheses generated by the modelling can be studied in detail, which may allow us to verify the underlying assumptions and improve the understanding and interpretation of the results obtained.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S0950268823000146

Financial support

This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

Data availability statement

The model used is provided as Supplementary material.

Appendix A The probability of a correct signal

Given the sensitivity Se = P(signal|outbreak), the specificity Sp = P(no signal|no outbreak) and the occurrence rate (outbreak probability) P(outbreak) = p, it follows from Bayes' theorem that the probability of a correct signal is

$$P({\rm outbreak} \mid {\rm signal}) = \frac{p\,Se}{(1-p)(1-Sp) + p\,Se}$$

This equation is evaluated in Figure A1 for some typical values of Se and Sp. It shows that the specificity is particularly important for the signal performance and that, when the outbreak occurrence rate is low (well below 0.05), the majority of signals will be false, even with acceptable values of Se and Sp.
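As a small worked illustration of the equation in this appendix, the following Python sketch evaluates the probability of a correct signal for a few combinations of p, Se and Sp. The function name prob_correct_signal and the chosen input values are illustrative assumptions and are not taken from the supplementary model.

```python
def prob_correct_signal(p, se, sp):
    """P(outbreak | signal) from outbreak probability p, sensitivity se and specificity sp."""
    for name, value in (("p", p), ("se", se), ("sp", sp)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    true_pos = p * se                      # signal and outbreak
    false_pos = (1.0 - p) * (1.0 - sp)     # signal without outbreak
    if true_pos + false_pos == 0.0:
        raise ValueError("a signal never occurs for these inputs")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical example values, chosen only to illustrate the formula.
    for p, se, sp in ((0.01, 0.9, 0.95), (0.05, 0.99, 0.95), (0.05, 0.99, 0.8)):
        value = prob_correct_signal(p, se, sp)
        print(f"p = {p}, Se = {se}, Sp = {sp}: P(outbreak | signal) = {value:.3f}")
```

With p = 0.01, Se = 0.9 and Sp = 0.95, for example, the formula gives approximately 0.15, so most signals would be false. With p = 0.05, Se = 0.99 and Sp = 0.95, it gives approximately 0.51.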

Fig. A1. The probability that a signal is correct as a function of the outbreak occurrence rate, for different specificity (Sp) and sensitivity (Se). Black: Sp = 0.95, blue: Sp = 0.9, red: Sp = 0.8; straight line: Se = 0.99, long dash: Se = 0.9, short dash: Se = 0.5.

References

Ahmed, W et al. (2020) First confirmed detection of SARS-CoV-2 in untreated wastewater in Australia: a proof of concept for the wastewater surveillance of COVID-19 in the community. Science of the Total Environment 728, 138764.
Fernandez-Cassi, X et al. (2021) Wastewater monitoring outperforms case numbers as a tool to track COVID-19 incidence dynamics when test positivity rates are high. Water Research 200, 117252.
Li, X et al. (2023) Correlation between SARS-CoV-2 RNA concentration in wastewater and COVID-19 cases in community: a systematic review and meta-analysis. Journal of Hazardous Materials 441, 129848.
Duvallet, C et al. Nationwide trends in COVID-19 cases and SARS-CoV-2 RNA wastewater concentrations in the United States. Published online. doi: 10.1021/acsestwater.1c00434
Peccia, J et al. (2020) Measurement of SARS-CoV-2 RNA in wastewater tracks community infection dynamics. Nature Biotechnology 38, 1164–1167.
Prado, T et al. (2021) Wastewater-based epidemiology as a useful tool to track SARS-CoV-2 and support public health policies at municipal level in Brazil. Water Research 191, 116810.
Wurtzer, S et al. (2020) Evaluation of lockdown effect on SARS-CoV-2 dynamics through viral genome quantification in waste water. Eurosurveillance 25, 2000776.
Agrawal, S, Orschler, L and Lackner, S (2021) Long-term monitoring of SARS-CoV-2 RNA in wastewater of the Frankfurt metropolitan area in Southern Germany. Scientific Reports 11, 5372.
Karthikeyan, S et al. (2021) High-throughput wastewater SARS-CoV-2 detection enables forecasting of community infection dynamics in San Diego County. Published online. doi: 10.1128/mSystems.00045-21
Hata, A et al. (2021) Detection of SARS-CoV-2 in wastewater in Japan during a COVID-19 outbreak. Science of the Total Environment 758, 143578.
Farkas, K et al. (2020) Wastewater and public health: the potential of wastewater surveillance for monitoring COVID-19. Current Opinion in Environmental Science & Health 17, 14–20.
Medema, G et al. (2020) Implementation of environmental surveillance for SARS-CoV-2 virus to support public health decisions: opportunities and challenges. Current Opinion in Environmental Science & Health 17, 49–71. doi: 10.1016/j.coesh.2020.09.006
Medema, G et al. (2020) Presence of SARS-coronavirus-2 RNA in sewage and correlation with reported COVID-19 prevalence in the early stage of the epidemic in the Netherlands. Environmental Science and Technology Letters 7, 511–516.
Bibby, K et al. (2021) Making waves: plausible lead time for wastewater based epidemiology as an early warning system for COVID-19. Water Research 202, 117438.
Gonzalez, R et al. (2020) COVID-19 surveillance in Southeastern Virginia using wastewater-based epidemiology. Water Research 186, 116296.
Sherchan, SP et al. (2020) First detection of SARS-CoV-2 RNA in wastewater in North America: a study in Louisiana, USA. Science of the Total Environment 743, 140621.
Wang, Y et al. (2022) Early warning of a COVID-19 surge on a university campus based on wastewater surveillance for SARS-CoV-2 at residence halls. Science of the Total Environment 821, 153291.
Saguti, F et al. (2021) Surveillance of wastewater revealed peaks of SARS-CoV-2 preceding those of hospitalized patients with COVID-19. Water Research 189, 116620. doi: 10.1016/j.watres.2020.116620
Patel, JC et al. (2019) Surveillance to track progress toward polio eradication – worldwide, 2017–2018. MMWR. Morbidity and Mortality Weekly Report 68, 312–318.
Asghar, H et al. (2014) Environmental surveillance for polioviruses in the global polio eradication initiative. Journal of Infectious Diseases 210, S294–S303.
Harris-Lovett, S et al. (2021) Wastewater surveillance for SARS-CoV-2 on college campuses: initial efforts, lessons learned and research needs. International Journal of Environmental Research and Public Health 18, 1–20.
Gibas, C et al. (2021) Implementing building-level SARS-CoV-2 wastewater surveillance on a university campus. Science of the Total Environment 782, 146749.
Phillips, N (2021) The coronavirus is here to stay – here's what that means. Nature 590, 382–384.
Zamir, M et al. (2022) Future implications of COVID-19 through mathematical modeling. Results in Physics 33, 105097.
Parasa, S et al. (2020) Prevalence of gastrointestinal symptoms and fecal viral shedding in patients with coronavirus disease 2019: a systematic review and meta-analysis. JAMA Network Open 3, 1–14.
Chen, Y et al. (2020) The presence of SARS-CoV-2 RNA in the feces of COVID-19 patients. Journal of Medical Virology 92, 833–840.
Xiao, F et al. (2020) Evidence for gastrointestinal infection of SARS-CoV-2. Gastroenterology 158, 1831–1833.e3.
McMahan, CS et al. (2021) COVID-19 wastewater epidemiology: a model to estimate infected populations. The Lancet Planetary Health 5, e874–e881.
Karlen, Y et al. (2007) Statistical significance of quantitative PCR. BMC Bioinformatics 8, 131.
Forootan, A et al. (2017) Methods to determine limit of detection and limit of quantification in quantitative real-time PCR (qPCR). Biomolecular Detection and Quantification 12, 1–6.
Wölfel, R et al. (2020) Virological assessment of hospitalized patients with COVID-2019. Nature 581, 465–469.
Rose, C et al. (2015) The characterization of feces and urine: a review of the literature to inform advanced treatment technology. Critical Reviews in Environmental Science and Technology 45, 1827–1879.
Fenton, L (1960) The sum of log-normal probability distributions in scatter transmission systems. IRE Transactions on Communications Systems 8, 57–67.
Curtis, K et al. (2020) Wastewater SARS-CoV-2 concentration and loading variability from grab and 24-hour composite samples. medRxiv 2. doi: 2020.07.10.20150607
Larsen, DA et al. (2022) Coupling freedom from disease principles and early warning from wastewater surveillance to improve health security. PNAS Nexus 1, 1–8. doi: 10.1093/pnasnexus/pgac001
D'Aoust, PM et al. (2021) Quantitative analysis of SARS-CoV-2 RNA from wastewater solids in communities with low COVID-19 incidence and prevalence. Water Research 188, 116560.
Hewitt, J et al. (2022) Sensitivity of wastewater-based epidemiology for detection of SARS-CoV-2 RNA in a low prevalence setting. Water Research 211, 118032.
Lewis, D (2021) The superspreading problem. Nature 590, 544–546.
Chen, PZ et al. (2021) Understanding why superspreading drives the COVID-19 pandemic but not the H1N1 pandemic. The Lancet Infectious Diseases 21, 1203–1204.
Hoffmann, T and Alsing, J (2021) Faecal shedding models for SARS-CoV-2 RNA amongst hospitalised patients and implications for wastewater-based epidemiology. medRxiv. doi: 2021.03.16.21253603
Miura, F, Kitajima, M and Omori, R (2021) Duration of SARS-CoV-2 viral shedding in faeces as a parameter for wastewater-based epidemiology: re-analysis of patient data using a shedding dynamics model. Science of the Total Environment 769, 144549.
Gerrity, D et al. (2021) Early-pandemic wastewater surveillance of SARS-CoV-2 in Southern Nevada: methodology, occurrence, and incidence/prevalence considerations. Water Research X 10, 100086.
Lui, G et al. (2020) Viral dynamics of SARS-CoV-2 across a spectrum of disease severity in COVID-19. Journal of Infection 81, 318–356.
Han, MS et al. (2020) Viral RNA load in mildly symptomatic and asymptomatic children with COVID-19, Seoul, South Korea. Emerging Infectious Diseases 26, 2497–2499.
Chavarria-Miró, G et al. (2021) Time evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in wastewater during the first pandemic wave of COVID-19 in the metropolitan area of Barcelona, Spain. Applied and Environmental Microbiology 87, 1–9.
Li, X et al. (2021) Uncertainties in estimating SARS-CoV-2 prevalence by wastewater-based epidemiology. Chemical Engineering Journal 415, 129039.
Schmitz, BW et al. (2021) Enumerating asymptomatic COVID-19 cases and estimating SARS-CoV-2 fecal shedding rates via wastewater-based epidemiology. Science of the Total Environment 801, 149794.
Burns, AM et al. (2018) In healthy adults, resistant maltodextrin produces a greater change in fecal bifidobacteria counts and increases stool wet weight: a double-blind, randomized, controlled crossover study. Nutrition Research 60, 33–42.

Table 1. Reported distributions of the concentration of RNA copies (gene copies, gc) in human faeces (Cfaeces)

Table 2. Reported distributions of the daily faecal mass shed by humans (fl)

Fig. 1. Example of the variation in the observed viral concentration in wastewater Cww (log gene copies per litre per day) due to random variation in the shedding of virus RNA in a simulation with N = 3 (circles), N = 30 (crosses) and N = 300 (triangles) shedders. (a) Forty consecutive single samples. (b) Consecutive means of independent sets of three samples. The horizontal axis can be taken to represent time, for example, daily independent measurements.

Fig. 2. The simulated relation between the number of shedders N and the gene copy concentration in the wastewater Cww (median, 5% and 95% percentiles). Note that both are expressed on a log scale.

Fig. 3. Simulated sensitivity and specificity of potential signals in six scenarios comparing a two- (a, d), four- (b, e) and tenfold (c, f) increase of the number of shedders between two sets of k = 3 (a, b, c) and k = 6 (d, e, f) samples. Axes correspond to those used for ROC (receiver operating characteristic) curves, only results with sensitivity >50% and specificity >75% are shown. Circles show results for signals based on a difference of means (d), crosses for signals based on linear regression. Open circles/small crosses: N = 10; shaded circles/medium crosses: N = 100; closed circles/large crosses: N = 1000.

Fig. 4. Simulated sensitivity and specificity of potential signals in the scenario with fourfold increase of the number of shedders and two sets of k = 6 samples with σfaeces = 0.5 (a), σfaeces = 1 (b), σfaeces = 1.5 (c), sPCR = 0.3 (d) and other values as in the baseline. Only results with sensitivity >50% and specificity >75% are shown. Symbols and lines are identical to those in Figure 3.
