INTRODUCTION
Clinical infection with Clostridium difficile (CD) usually occurs due to a disruption of the intestinal flora following antibiotic treatment, mostly with broad-spectrum antibiotics [1–3]. CD is a Gram-positive bacterium, which is able to produce spores, two types of toxins, A and B, and, in some strains, a binary toxin [4]. Concerns have been raised specifically for the PCR ribotype 027 and other hypervirulent strains that may cause higher mortality, and may have a greater potential to cause hospital outbreaks [5–8].
In the USA, it was estimated in 2013 that 250 000 cases annually were attributable to healthcare-associated CD infection [9]. An incidence of 7·4 per 10 000 patient-days of healthcare-associated CD infection was reported based on data from three states in the USA in 2010 [10]. In 2008, a European study aimed to assess the extent of CD infections (CDI) and tested patients with suspicion of CDI as well as patients developing diarrhoea 3 or more days after hospital admission. Three major Danish hospitals participated and found that 8% of tested patients had an infection with toxin-positive CD [11]. In the same year, a rise in the number of patients with CDI was observed in the Capital Region of Denmark [12, 13], which was attributed to several hospital outbreaks. In 2009, the number of CDIs started to rise in the neighbouring Region Zealand, probably due to the transfer of patients [12]. Results from studies using whole-genome sequencing techniques suggest that the community serves as a large reservoir of CD carriers, and that transmission routes between patients within and outside hospitals are complex. Therefore, nationwide surveillance of CD, not only focusing on hospital-associated cases, is important [14].
Kola et al. reported that in 2011, 14 European countries had an ongoing nationwide surveillance system for CDI [15]. The surveillance systems were either case-based, laboratory-based or a combination of both. All systems required manual or active reporting in the data collection. Increasing evidence suggests that automatic electronic surveillance based on laboratory data is superior to conventional surveillance with respect to completeness and timeliness [16]. For the current Danish surveillance system, Departments of Clinical Microbiology (DCMs) actively report findings of gastrointestinal bacteria including CD to Statens Serum Institut (SSI). Data are entered in a database called the National Registry of Enteric Pathogens (NREP), which is used for surveillance of gastrointestinal pathogens in general [17]. Following the outbreaks in 2008, the DCMs were requested from 2009 to submit selected isolates for further typing to the national reference laboratory at SSI [13, 18].
In this study, we validated a new electronic surveillance system for CD based on the Danish Microbiology Database (MiBa) [19]. The aim of the new surveillance system (MiBa-based surveillance) is to improve national surveillance of infectious diseases in terms of completeness and timeliness. It is also envisioned to reduce the burden of reporting by replacing manual reporting procedures with machine-to-machine communication. The aim of this paper is to validate a computer algorithm that identifies cases of toxigenic CD in MiBa, and to compare the MiBa-based surveillance with the existing surveillance system for CD. Furthermore, we assessed the completeness and quality of MiBa against local extracts from five DCMs.
METHODS
MiBa-based surveillance
MiBa is a real-time database that automatically collects microbiological test results from each Danish DCM at the time the electronic report is sent to the physician who requested the analysis. Reports are sent by a national standard transfer protocol. MiBa started collecting data on 1 January 2010 [19]. Patients are identified by their Danish civil registration number (CPR number), a unique number that each Danish resident receives at birth or immigration [20]. For this study, data were extracted from MiBa for the period from 1 January 2010 to 31 December 2014. The extract was made on 22 December 2016.
We defined a CD case as a person (i.e. a unique CPR number) with a positive CD culture or PCR detection of toxin A and/or B and/or binary toxin. Results specifying that the strain was non-toxigenic were excluded. Table 1 shows the steps in the computer algorithm used to identify positive test results from MiBa.
CD, Clostridium difficile; MiBa, Danish Microbiology Database.
Data were imported from three related tables within the MiBa data model. The DCMs use different laboratory information systems with variable levels of compatibility with the MiBa data model. Although the input data differed in structure and content, the majority of the information in the reports is structured and coded. A first step removed records where the strain was indicated to be non-toxigenic. Subsequently, results for positive CD cultures were retrieved. The next steps extracted the positive results for CD PCR from coded information, from pre-set text strings and from free text. Finally, all data selected in each step were combined into one dataset containing all results positive for CD.
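The sequence of steps described above can be illustrated with a minimal Python sketch. It is not the algorithm of Table 1: the record fields (`sample_id`, `analysis`, `coded_result`, `text`, `non_toxigenic`), the pre-set phrases and the free-text rule are all hypothetical stand-ins chosen for illustration, since the actual codes and text strings used in MiBa are not reproduced here.

```python
# Hypothetical pre-set phrases; the real strings used by the DCMs are not known here.
PRESET_POSITIVE = {"cd toxin pcr positive", "c. difficile toxin gene detected"}


def is_positive_free_text(text):
    """Very simple illustrative free-text rule: mentions a toxin, no negation."""
    t = text.lower()
    if "not detected" in t or "negative" in t:
        return False
    return "toxin" in t and ("detected" in t or "positive" in t)


def select_positive_cd(records):
    """Return one positive record per sample, tagged with the step that matched."""
    # Step 1: remove records where the strain is flagged as non-toxigenic.
    kept = [r for r in records if not r.get("non_toxigenic", False)]

    selected = {}
    for r in kept:
        coded = r.get("coded_result")
        text = (r.get("text") or "").strip()
        if r["analysis"] == "culture" and coded == "positive":
            matched_by = "culture"            # positive culture
        elif r["analysis"] == "pcr" and coded == "positive":
            matched_by = "pcr_coded"          # PCR, coded result
        elif r["analysis"] == "pcr" and coded is None and text.lower() in PRESET_POSITIVE:
            matched_by = "pcr_preset"         # PCR, pre-set text string
        elif r["analysis"] == "pcr" and coded is None and is_positive_free_text(text):
            matched_by = "pcr_free_text"      # PCR, free text
        else:
            continue
        # Final step: combine all selected results, keeping one row per sample.
        selected.setdefault(r["sample_id"], {**r, "matched_by": matched_by})
    return list(selected.values())
```

In this sketch the text-based rules are applied only when no coded result is present, so that a coded negative result is never overridden by free text; whether the production algorithm makes the same choice is an assumption.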
National surveillance data for the evaluation of the surveillance system
To assess the performance of the MiBa-based surveillance at the national level, we compared it to the existing surveillance from NREP. Surveillance for gastrointestinal pathogens is laboratory based, and all DCMs report to NREP on a weekly basis. According to the ministerial order for notifiable diseases, findings of enteropathogenic bacteria need to be notified, including Salmonella enterica, Campylobacter jejuni/coli, Yersinia enterocolitica, Shigella spp., Vibrio cholerae, diarrhoeagenic Escherichia coli, as well as other bacteria that are judged by the diagnostic laboratory to cause gastrointestinal disease [21]. Because CD is not specifically mentioned in the list of agents, it is interpreted to fall under the wording ‘other bacteria’ in agreement with the local DCM. A CD case registered in NREP is a patient with a positive culture for CD or PCR detection of toxin A and/or B and/or binary toxin for CD. Patients are identified by their CPR number. NREP records a new CD case for the same patient if the positive sample was taken 6 months or more after the first positive sample of the previous case. The study period of the national surveillance data was from 1 January 2010 to 31 December 2014.
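The 6-month rule for registering a new case in NREP can be sketched as follows. This is an illustrative reading of the rule as stated above, not NREP's actual implementation: the function names are hypothetical, and "6 months" is interpreted as six calendar months counted from the first positive sample of the previous case, which is an assumption.

```python
import calendar
from collections import defaultdict
from datetime import date


def add_months(d, months):
    """Shift a date by whole calendar months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def count_cases(samples, window_months=6):
    """Group (patient_id, sample_date) pairs into cases.

    A positive sample opens a new case if the patient has no earlier case, or
    if it was taken at least `window_months` after the first positive sample
    of that patient's previous case.
    """
    by_patient = defaultdict(list)
    for patient_id, sample_date in samples:
        by_patient[patient_id].append(sample_date)

    cases = []
    for patient_id, dates in by_patient.items():
        case_start = None
        for d in sorted(dates):
            if case_start is None or d >= add_months(case_start, window_months):
                case_start = d
                cases.append((patient_id, d))
    return cases
```

Note that this differs from the MiBa-based case definition used in this study, in which a case is a unique person (CPR number) over the whole study period.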
Regional data for validation of data quality of the new electronic surveillance system
To validate the quality and completeness of the MiBa-based surveillance, we obtained data from five DCMs representative of different parts of Denmark: Capital Region of Denmark (Hvidovre, Herlev and Rigshospitalet), North Denmark Region and Region Zealand. The extracts included all results (from cultures and PCR) positive for CD, including information on toxin production/presence of toxin genes. Results specified as non-toxigenic were excluded. For North Denmark Region and Region Zealand, the study period was 2010–2014, and for Capital Region of Denmark 2011–2014. For the DCMs in North Denmark Region and Region Zealand, data were linked to the MiBa-based surveillance on the sample identifier and the CPR number. For the DCMs in the Capital Region of Denmark, linkage was possible only on the CPR number.
Data analysis
Children <2 years old and records with temporary CPR numbers were excluded from all datasets. The European surveillance protocol for CDI excludes children <2 years old with positive CD test results unless there is clinical evidence to the contrary [22]. Children <2 years old have been reported in several studies as asymptomatic carriers [23–26].
The number of CD cases, as identified through the MiBa-based surveillance and NREP, was described by age, sex and geographic region of residence, which were retrieved from the Danish Civil Registration System [20].
The overlap between systems was assessed and discrepant cases were further investigated in order to understand the reasons for the discrepancies.
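The overlap assessment can be expressed as simple set operations on the linkage key. The sketch below is illustrative only: the function name and key layout are assumptions, with the key being either the CPR number alone or a (sample identifier, CPR number) pair, depending on what each regional extract allowed.

```python
def compare_systems(miba_keys, other_keys):
    """Split linkage keys into concordant and discordant groups.

    Keys may be CPR numbers (patient-level comparison) or
    (sample_id, cpr) tuples (sample-level comparison).
    """
    miba, other = set(miba_keys), set(other_keys)
    return {
        "both": miba & other,         # concordant: found in both systems
        "miba_only": miba - other,    # found only in the MiBa-based surveillance
        "other_only": other - miba,   # found only in the comparison dataset
    }


# Patient-level comparison with made-up identifiers.
result = compare_systems({"p1", "p2", "p3"}, {"p2", "p3", "p4"})
print(len(result["both"]), len(result["miba_only"]), len(result["other_only"]))
```

The two discordant groups are the ones that would then be reviewed record by record to find the reason for each discrepancy.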
Data management and analysis were carried out using the statistical software SAS/STAT™ 9·4 (SAS Institute Inc., Cary, North Carolina, USA).
RESULTS
There were 60 698 positive samples and 22 748 cases in MiBa. We excluded 163 cases with a non-valid CPR number (503 samples) and 1333 children <2 years of age (3259 samples).
The MiBa-based surveillance system vs the current national surveillance NREP
We obtained 21 252 CD cases from the MiBa-based surveillance compared with 13 896 CD cases reported to NREP (Table 2).
F, females; M, males; MiBa, Danish Microbiology Database.
* MiBa-based surveillance: surveillance system based on the Danish Microbiology Database.
† NREP: National Registry of Enteric Pathogens.
‡ The geographic region of residence could not be retrieved for 139 patients from the MiBa-based surveillance.
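The case counts reported above can be reconciled with a short arithmetic check. The figures are taken from the Results text; the variable names are illustrative, the remaining sample count is derived here rather than reported, and the check assumes that the two exclusion groups do not overlap.

```python
# Counts reported in the Results text.
miba_cases_raw = 22_748
miba_samples_raw = 60_698
excluded_invalid_cpr_cases, excluded_invalid_cpr_samples = 163, 503
excluded_children_cases, excluded_children_samples = 1_333, 3_259
nrep_cases = 13_896

# Cases remaining after both exclusions (assumed to be disjoint groups).
miba_cases_final = miba_cases_raw - excluded_invalid_cpr_cases - excluded_children_cases
# Samples remaining; a derived figure, not stated in the text.
miba_samples_final = miba_samples_raw - excluded_invalid_cpr_samples - excluded_children_samples

# Share of MiBa-identified cases that were also counted by NREP in total numbers.
nrep_share = nrep_cases / miba_cases_final

print(miba_cases_final)      # 21252, matching the reported number of MiBa cases
print(miba_samples_final)    # 56936
print(round(nrep_share, 2))  # 0.65
```

The NREP total thus corresponds to roughly two-thirds of the MiBa-based total over the study period.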
The median age of patients in the MiBa-based surveillance was 73 years [interquartile range (IQR) 59–83 years], similar to the age distribution of cases in NREP (74 years, IQR 60–83 years). The proportion of men and women was similar in both surveillance systems (Table 2).
Table 3 shows the number of concordant and discordant CD cases between the current national surveillance NREP and the MiBa-based surveillance system.
DCM, Department of Clinical Microbiology; MiBa-based surveillance, surveillance system based on the Danish Microbiology Database; NREP, National Registry of Enteric Pathogens.
* DCM Hvidovre, DCM Herlev and DCM Rigshospitalet.
† Comparisons based on sample identifier instead of unique patient (CPR number).
The comparison between NREP and the MiBa-based surveillance revealed under-reporting to NREP, especially in the period 2010–2012 (Table 3 and Fig. 1). This under-reporting was most pronounced for the Capital Region of Denmark and may have been due to cumbersome manual extraction. Completeness improved after it was clarified in 2013 that PCR-based detection of CD toxins should also be reported. Figure 1 shows that national and regional case numbers in NREP were consistently lower than in the MiBa-based surveillance but followed similar trends. Completeness increased over time, and the highest number of discrepant CD cases found only in NREP was from 2010.
Validation of the data quality in the MiBa-based surveillance system
Table 3 shows the number of concordant and discordant CD cases and positive samples identified in the MiBa-based surveillance and the regional data. The discrepancies are described in detail below.
The MiBa-based surveillance vs three DCMs from Capital Region of Denmark
Of the eight cases found in the database from the Capital Region of Denmark, but not identified by the MiBa-based surveillance, four patients were not recorded in MiBa at all, and four were PCR positive in the local extract but PCR negative in MiBa. For the 31 patients found only in the MiBa-based surveillance, three were reported in late December 2014 in MiBa and in early January 2015 in the DCM's extract. This discrepancy was found because the three samples were recorded in the Capital Region of Denmark database 1 day after the cut-off date of our comparison. For the remaining 28, no clear explanation could be found for the discrepancy.
The MiBa-based surveillance vs DCM from North Denmark Region
Among the three samples found only in the data from DCM North Denmark Region, all were reported in MiBa as a negative PCR for CD. Of the 15 samples found only in the MiBa-based surveillance, six samples were recorded on 31 December 2014 in MiBa and on 1 January 2015 in the DCM's extract. For seven samples, the cultures were performed on a biopsy, and were therefore not included in the local extract. For the last two samples, no clear reason was found.
The MiBa-based surveillance vs DCM from Region Zealand
The 11 samples recorded only in DCM Region Zealand each had a borderline PCR result that was manually corrected before the report was sent to MiBa, while the original test result was kept in the local database. Therefore, the result in MiBa was the correct one. The 18 samples found only in the MiBa-based surveillance were recorded as a positive PCR result in MiBa, but as negative in the local extract. In these cases, the final report had been manually corrected from negative to positive, based on a manual interpretation of a weak but significant PCR signal. Again, the original test result was kept in the local database and the result in MiBa was the correct one.
DISCUSSION
In this study, we validated a new fully automated surveillance system for CD cases based on laboratory reports in MiBa.
When compared with the current surveillance through the NREP, we found that NREP recorded about one-third fewer patients with a positive test for CD than did the MiBa-based surveillance (13 896 vs. 21 252 cases). This suggests that the number of CD cases in Denmark has been under-reported, possibly because of cumbersome manual reporting procedures. When looking in detail at the few patients that were identified by NREP but not by the MiBa-based surveillance, most of the test results were from 2010, suggesting that start-up problems with electronic transfers of reports to MiBa have been solved in later years. The trends observed at regional and national level for the MiBa-based surveillance and NREP suggest that reporting to MiBa is more complete than to NREP.
By comparing the cases identified in MiBa with extracts from five of the 13 DCMs, we were able to assess the quality and completeness of the MiBa-based surveillance system. Data from the MiBa-based surveillance almost completely matched the positive tests for CD in these five selected regional datasets. Finally, most discrepancies in the comparison analysis could be explained, which provides reassurance for the use of MiBa as a future data source for laboratory-based surveillance of CD cases.
There are three aspects of the algorithm that could theoretically lead to discrepancies, although none was seen in the presented validation studies. First, rule 2 of the algorithm deletes records specifying non-toxigenic strains. However, if a DCM does not report this information to MiBa, the algorithm cannot distinguish non-toxigenic strains and will keep them in the dataset, thereby overestimating the number of toxigenic cases in MiBa. To solve this issue, DCMs need to report laboratory results to MiBa in more detail. Second, rule 2 can only remove cases where a positive culture was followed by a negative PCR for toxins if these two results were reported in one record. A more advanced algorithm would need to be developed to handle situations in which these two results are reported in separate records with different dates. Third, rules 6 and 7 depend on free text. This makes the system sensitive to changes in the wording used at the local DCM: records could be lost if a DCM introduces new texts, so the text strings must be evaluated regularly and the algorithm updated accordingly.
However, compared with earlier reported electronic surveillance systems for CDI [16, 27], data in MiBa are complete at a national level. Most information in the reports sent to MiBa complies with a national standard transfer protocol, which renders the data structured and coded and makes data delivery relatively robust over time. At four CDC Prevention Epicenters hospitals, an automated surveillance system for CDI was developed [16], which was found to be more reliable than manual reporting. However, different algorithms were used for each hospital to cope with local differences in available data, potentially resulting in different disease rates. Another automated surveillance system was developed in the USA [27]. This system is also based on a selected cluster of hospitals, and only CDI related to an admission is captured. The majority of data sources were based on free text, and data were captured by text mining. In neither system were the data population based, and it was not possible to trace individuals across hospital systems. However, in both systems, cases were classified according to hospital admissions. Since 2015, the presented MiBa-based surveillance has been integrated into the hospital-acquired infection database (HAIBA), where individual cases are linked to patient administrative data from hospital systems, to monitor hospital-onset hospital-acquired CDI and community-onset hospital-acquired CDI [28]. To our knowledge, few countries have developed nationwide population-based electronic surveillance systems that include individual person identification, which makes it possible to trace individuals across institutions and link data across registries. The Infection Intelligence Platform being developed in Scotland shares many of the same features as the Danish MiBa-based surveillance, and population-based studies of CD at national level are taking place [29, 30].
It is a limitation of the current NREP surveillance system and the MiBa-based surveillance system that they do not provide information on subtyping. An additional surveillance is run at the national reference laboratory at SSI with specific focus on the surveillance of CD subtypes, based on selected CD isolates referred from the DCMs to SSI for subtyping. In the near future, MiBa will be able to handle information on subtypes in a structured and coded form. Integrating this information in the MiBa-based surveillance would mean a great improvement in the usefulness of the surveillance data.
In the present study, we did not assess the improvements in timeliness or the reduction in burden of reporting by the laboratories. However, we are certain that the MiBa-based surveillance is by far superior to the current NREP surveillance, in both respects, although automated systems do have important pitfalls that deserve attention. The landscape of modern healthcare is highly dynamic, and organisations, laboratory techniques and IT systems change constantly. One must not forget the centrally and locally based resources that are required to maintain the MiBa-based surveillance and to secure data quality.
In conclusion, despite different local laboratory information systems, we show that automated laboratory-based national surveillance for CD is both feasible and complete. The benefit of such a system is that it does not require active reporting. The existing surveillance systems underestimated the incidence of CDI. Based on the new and validated MiBa-based surveillance system, we found that CD, with about 4500 cases per year, may be the most common cause of bacterial gastroenteritis in Denmark, probably exceeding the annual number of confirmed Campylobacter infections, which ranged between 3730 and 4350 in the period from 2010 to 2015. We also conclude that the MiBa-based surveillance system for CD can effectively replace the current surveillance system.
This nationwide MiBa-based surveillance system for CDI can greatly strengthen surveillance and research in various applications. For example, when the data are linked to patient administrative data, the relation to the healthcare system can be assessed, as is done in HAIBA.
ACKNOWLEDGEMENTS
The authors thank all Departments of Clinical Microbiology and all regions of Denmark for providing data. The authors also thank Kenn Schultz Nielsen (Department of IT-Projects and Development, Danish Health Data Authority) for IT development and support. The authors thank Katharina E. P. Olsen, Joan Nevermann Jensen and Søren Persson (Department of Microbiology and Infection Control, Statens Serum Institut) for their support on the Clostridium difficile subtyping surveillance from the reference laboratory of the Statens Serum Institut. The authors also thank the MiBa board and the HAIBA stakeholder group (representatives from the Capital Region of Denmark, Region Zealand, Region of Southern Denmark, Central Denmark Region, and North Denmark Region).
This study was carried out as part of the development of the Hospital-Acquired Infections Database (HAIBA), which was funded by the Danish Ministry of Health.
DECLARATION OF INTEREST
None declared.
ETHICAL CONSIDERATIONS
This study was approved by the Danish Data Protection Authority as part of the development of the Danish Hospital-Acquired Infections Database (registration number 2015-54-0942).