BACKGROUND
Tuberculosis (TB) genotyping became available in the late 1980s, coinciding with the resurgence of TB in the late 1980s and early 1990s in the United States. In New York City (NYC), epidemiological investigations of TB outbreaks in hospitals [1–Reference Frieden4] and prisons [Reference Valway5] were aided by the availability of the new genotyping tools. As many as one-third of TB cases in NYC were attributed to recent transmission [Reference Frieden6]. These studies of TB genotyping in NYC were either hospital-based or conducted among selected sub-populations.
Before 2001, the NYC tuberculosis control programme used selective genotyping of TB cases to investigate TB outbreaks, clusters and potential false-positive cultures, to conduct surveillance of TB drug resistance, and to carry out periodic cross-sectional surveys of genotype clustering [Reference Frieden6–Reference Munsiff8]. In addition, several hospitals (the TB Network) performed genotyping of all TB cases diagnosed at these hospitals in 1992–1994 [Reference Tornieporth9, Reference Friedman10] and 1996–1997 [Reference Magnani11], and one hospital performed genotyping of all isolates from 1990 onwards [Reference Geng12].
In 2001, the NYC TB control programme began universal genotyping of Mycobacterium tuberculosis isolates to increase the identification of TB transmission, efficiency of epidemiological investigation and identification of false-positive cultures. The purpose of the present analyses was to determine the extent of genotype clustering and factors associated with genotype clustering after nearly a decade of declining TB incidence.
METHODS
The study population included all incident, culture-positive TB cases in NYC from 1 January, 2001 to 31 December, 2003. The first M. tuberculosis isolate or a sub-culture was submitted to the Public Health Laboratory by the clinical laboratory in which M. tuberculosis was first identified. Specimens were sent to the genotyping laboratories for IS6110 Southern blot hybridization at the Public Health Research Institute (PHRI), and for spacer oligonucleotide typing (spoligotyping) at the Wadsworth Center following standard procedures [Reference Kamerbeek13, Reference Kwara14]. Cases were classified as having a clustered genotype if both the IS6110 restriction fragment length polymorphism (RFLP) pattern and the spoligotype of the isolate were indistinguishable from those of ⩾1 other isolate in the study period. All other cases were classified as having non-clustered genotypes. Strains known to have been transmitted in the early 1990s were identified from the genotyping laboratory database. The methods for investigating false-positive cultures have been reported elsewhere [Reference Clark15].
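The clustering rule above can be sketched in a few lines of Python. This is a minimal illustration, not the laboratories' procedure: the function name, the tuple layout and the pattern labels are hypothetical, and exact equality of labels stands in for the expert judgement that two patterns are "indistinguishable".

```python
from collections import defaultdict

def classify_clustering(isolates):
    """Map each case ID to True (clustered) or False (non-clustered).

    `isolates` is an iterable of (case_id, rflp_pattern, spoligotype) tuples.
    Pattern values are treated as opaque labels; exact equality stands in for
    "indistinguishable", which is an assumption of this sketch.
    """
    groups = defaultdict(list)
    for case_id, rflp, spoligo in isolates:
        # Both typing results must match, so the pair is the grouping key.
        groups[(rflp, spoligo)].append(case_id)
    return {
        case_id: len(members) > 1
        for members in groups.values()
        for case_id in members
    }

# Hypothetical isolates, for illustration only.
example = [
    ("A", "rflp-1", "spol-1"),
    ("B", "rflp-1", "spol-1"),  # matches A on both methods
    ("C", "rflp-1", "spol-2"),  # same RFLP as A, different spoligotype
    ("D", "rflp-3", "spol-3"),
]
status = classify_clustering(example)
```

In this example only cases A and B are clustered; case C shares an RFLP pattern with them but differs by spoligotype, so it is counted as non-clustered.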
We compared the characteristics of case-patients having clustered genotypes to those with unique genotypes. The characteristics included demographic, socio-behavioural, clinical, and bacteriological features of the case-patients, such as infectiousness, presence of acid-fast bacilli (AFB) from a respiratory source and highest number of bacilli on smear microscopy, results of chest radiographs, and prior history of TB. Multidrug-resistant TB (MDR TB) was defined as an isolate having resistance to at least isoniazid and rifampin. Homelessness was defined as the lack of fixed, regular housing or living in a public or private shelter or single-room-occupancy hotel at any time before or at diagnosis, or during TB treatment. Substance abuse included injection and non-injection use of illicit drugs during the 12 months before diagnosis. Having prior TB disease was defined as diagnosis and treatment of TB disease in NYC ⩾12 months before the current diagnosis. We assumed that one index case per cluster represented reactivated disease and that the remaining clustered cases (clustered cases minus one case per cluster) represented recently acquired disease [Reference Alland16, Reference Ellis17].
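The "clustered cases minus one case per cluster" assumption reduces to simple arithmetic, sketched below. The function name and the counts in the example are hypothetical and are not study data.

```python
def recent_transmission_estimate(clustered_cases, clusters, total_cases):
    """Return (recent_cases, proportion) under the 'n minus one' assumption.

    One case in every cluster is treated as a reactivated index case, so the
    number attributed to recent transmission is clustered_cases - clusters.
    """
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    if clusters > clustered_cases or clustered_cases > total_cases:
        raise ValueError("counts are inconsistent")
    recent_cases = clustered_cases - clusters
    return recent_cases, recent_cases / total_cases

# Hypothetical counts: 10 clustered cases in 3 clusters among 40 genotyped cases.
recent, proportion = recent_transmission_estimate(10, 3, 40)
```

With these illustrative counts, 7 of the 40 cases (17·5%) would be attributed to recent transmission.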
Data collection
We used patient information from the TB registry and genotype information from the TB genotype database of the NYC Department of Health and Mental Hygiene. Demographic and clinical information for each patient was obtained from patient interview and medical-record review, by trained Bureau of Tuberculosis Control staff, on standard data collection forms.
Genotype clusters were investigated for the presence of epidemiological links between cases. Information from the initial patient and contact interviews, such as shared contacts, potential sources of TB, potential locations of transmission such as prior hospitalization or shelter residence, prior history of TB infection and disease, characteristics of clustered case-patients, and classification of epidemiological links, was entered into a molecular cluster investigation database. If a link was not found from the available information, a re-interview was attempted using a structured questionnaire. An epidemiological link between cases was defined as naming the other person as a contact, having contacts in common, or reporting having been in the same location prior to diagnosis.
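The three-part definition of an epidemiological link can be written as a simple predicate. The record layout below is hypothetical and does not reflect the structure of the cluster investigation database; the sketch also ignores the timing of each stay at a location, which an actual investigation would take into account.

```python
from dataclasses import dataclass, field

@dataclass
class CaseRecord:
    """Hypothetical interview summary for one clustered case."""
    case_id: str
    named_contacts: set = field(default_factory=set)  # IDs of people named at interview
    locations: set = field(default_factory=set)       # places reported before diagnosis

def has_epi_link(a, b):
    """True if any of the three criteria in the definition is met."""
    names_other = b.case_id in a.named_contacts or a.case_id in b.named_contacts
    shared_contacts = bool(a.named_contacts & b.named_contacts)
    shared_location = bool(a.locations & b.locations)
    return names_other or shared_contacts or shared_location

# Hypothetical cases, for illustration only.
index_case = CaseRecord("case-1", {"person-9"}, {"shelter-X"})
named_by_other = CaseRecord("case-2", {"case-1"}, set())
same_shelter = CaseRecord("case-3", set(), {"shelter-X"})
unlinked = CaseRecord("case-4", {"person-7"}, {"hospital-Y"})
```

Here `has_epi_link(index_case, named_by_other)` and `has_epi_link(index_case, same_shelter)` are true, while `has_epi_link(index_case, unlinked)` is false.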
Statistical analysis
Statistical analyses were performed with PC SAS software version 8.02 (SAS Institute, Cary, NC, USA). Frequencies and percentages of clustered isolates according to case characteristics were determined. A χ2 analysis was used for comparison of categorical variables. The Wilcoxon rank-sum test was used for comparison of medians of continuous variables. Differences were considered significant at P<0·05. Odds ratios and 95% confidence intervals for an isolate being clustered were derived employing the generalized estimating equation (GEE) regression method, to account for individual characteristics correlated by having the same genotype [Reference Liang and Zeger18, Reference Zeger and Liang19].
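For readers who want to reproduce a comparison of two proportions, the χ2 test on a 2×2 table can be sketched in plain Python. This is not the SAS procedure used in the study, it does not cover the GEE models, and the counts in the example are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: clustered vs. non-clustered cases in two groups of 100.
statistic, p_value = chi_square_2x2(60, 40, 40, 60)
```

For these illustrative counts the statistic is 8·0, and the resulting P value falls below the 0·05 threshold used for significance.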
The genotyping activities and this analysis received ethical oversight and approval from the New York City Department of Health and Mental Hygiene Institutional Review Board and were reviewed by the Associate Director for Science of the National Center for HIV, STD, and TB Prevention of the CDC; the work was determined not to be human-subjects research requiring review.
RESULTS
During the study period, 3436 TB cases were reported in NYC. Of these, 2623 (76·3%) were culture-positive and 2467 (94%) were genotyped; 59 (2·4%) isolates were false-positive cultures and 2408 were included in subsequent analyses (Fig. 1). Compared to isolates that were not genotyped, those that were genotyped were more likely to be from Hispanic TB patients (30% vs. 19%, P=0·001), from patients with only pulmonary disease (69% vs. 59%, P=0·002), and patients with AFB smear-positive disease from a respiratory source (43% vs. 35%, P=0·030). Isolates from Asian TB patients were less likely than other TB cases to be genotyped (28% vs. 42%, P=0·001).
Among the 2408 TB case isolates that were genotyped, 873 (36·2%) had a pattern that was indistinguishable from that of another TB case within the study period (i.e. the clustered cases). Thirty-one percent (272/873) of the clustered isolates had fewer than four copies of IS6110. The 873 clustered cases formed 212 genotype clusters; the median cluster size was 2 (range 2–85) (Fig. 2). There were 266 (11%) cases with strains believed to have been widely transmitted in the early 1990s. Twenty percent (176/873) of clustered cases had one or more epidemiological links to another case in the cluster; among clustered cases with historical strains, 17% had epidemiological links to another case in the cluster. The difference in the proportion of cases that were epidemiologically linked among historical strain cases compared to other strain cases was not statistically significant. An estimated 27·4% of the 2408 cases (661 cases: the 873 clustered cases minus one presumed index case for each of the 212 clusters) were due to recent infection that progressed to active disease during the study period.
The characteristics of cases and the proportion of cases having a clustered genotype according to these characteristics are shown in Table 1. The crude and adjusted odds ratios for factors associated with having a clustered genotype using GEE logistic regression are shown in Table 2. The following factors were independently associated with genotype clustering after adjusting for the variables that were associated with clustering in the bivariate analyses: younger age, birth in the United States, homelessness, substance abuse and presence of TB symptoms. Among non-US-born patients, the number of years of residence in the United States was not associated with having a clustered genotype. Fourteen percent of these patients (236/1651) had been in the United States ⩽1 year before diagnosis; of these, only 24 (10%) were examined at entry as part of the immigration screening, and three of the 24 were clustered cases. Of the 212 clusters, 170 (80·1%) had ⩾1 non-US-born patient in the cluster. Among the 207 clusters in which all patients in the cluster had known country of origin, 90 (42·5%) clusters had only non-US-born patients in the cluster, 37 (17·5%) had only US-born patients, and 80 (37·7%) had both US- and non-US-born patients in the cluster (Fig. 3). In 19 (21%) of the 90 clusters with only non-US-born patients, transmission was believed to have occurred in NYC based on epidemiological links between two or more of the cases in the cluster; 17 of these were clusters of two or three cases, and the other two had 8 and 10 cases, respectively. No epidemiological links were identified among the cases in the remaining 71 clusters.
Eight clusters had more than 10 cases in the cluster; five of these were clusters caused by strains known to have been transmitted in the early 1990s. The IS6110 gel images and spoligotype patterns for these clusters are shown in Figure 4(a, b). Two of the largest genotype clusters (n=85 and n=39 respectively) were associated with recent outbreaks of TB in homeless persons in NYC. Only one of these was highly localized in one single-room-occupancy hotel. The other was a strain known to have been widely transmitted in the early 1990s and associated with stays in shelters for the homeless [Reference Zeger and Liang19]. Sixteen clusters had one or more MDR TB cases in the cluster; in four (25%) clusters, all with two cases each, both cases had MDR TB isolates; the remaining 12 (75%) clusters had both MDR and non-MDR TB cases in the cluster. There were 12 strain W [Reference Munsiff21, Reference Plikaytis22] cases, all of which had MDR isolates.
DISCUSSION
The proportion of clustered cases in the study period was comparable to that seen in NYC at the height of the TB epidemic, when extensive transmission was occurring [Reference Edlin2–4, 6, 9, 10, 16, 23–Reference Small27]. This was probably due to under-ascertainment of genotype clustering in prior studies and highlights the importance of the duration of the study period and sampling on the level of clustering [Reference Vynnycky28, Reference van Soolingen29]. A longer time period during which cases can enter the sample, and a wider population base, increase the likelihood of identifying clustered cases [Reference Glynn, Vynnycky and Fine30, Reference van Soolingen31]. The effect of duration is thought to level off after 2–3 years. Our study sample, comprising all incident TB cases in NYC over a 3-year period, covered a longer time span and a larger population than did prior studies, which were either hospital-based or of much shorter duration. One of these studies reported 39 (37·5%) clustered cases of 104 cases diagnosed at a large urban hospital from 1989 to 1992 [Reference Alland16]. A city-wide study of prevalent culture-positive cases, during a 1-month period in 1991, found that 37% (126/344) of cases were clustered; strains of ⩽3 IS6110 bands were excluded from the analysis [Reference Frieden6]. Reports of TB cases diagnosed at the TB Network hospitals found 68 (40·7%) of 167 cases diagnosed during 1992–1993 [Reference Friedman10] were clustered; 94 (31%) of 302 diagnosed in 1992–1994 [Reference Tornieporth9] and 97 (54%) of 180 in 1996–1997 [Reference Magnani11] had clustered genotypes. A 10-year study of TB cases diagnosed at one hospital reported that 63% of cases diagnosed in 1993 had clustered genotypes; the proportion of clustered cases decreased significantly, to 31%, among cases diagnosed in 1999, supporting the notion that genotype clustering was likely to have been higher in earlier years [Reference Geng12]. 
The proportion of TB cases with clustered genotypes in NYC is lower than that reported in previously published studies from other cities in the United States: St Louis (MO), Baltimore (MD) and Los Angeles (CA), with 39%, 46% and 59% respectively [Reference McConkey32–Reference Barnes34]. The NYC clustering rate was much higher than that seen in San Francisco (CA), however, where 19% were clustered in a large population-based study over a 7-year period [Reference Jasmer35]. Some of the difference in the level of clustering across studies may be due to differences in the proportion of non-US-born TB patients and in definitions of clustering. Cases in non-US-born persons are less likely to have isolates with clustered genotypes. San Francisco had the highest proportion of non-US-born TB patients, 63%; on the other hand, the San Francisco study used a shorter time frame for defining clustering (i.e. two or more isolates with indistinguishable genotypes within 1 year). Differences in genotyping methods may also have contributed to the levels of clustering.
Historical strains transmitted in the early 1990s still contributed significantly to genotype clustering of TB cases in NYC in 2001–2003. At least one-third of clustered cases were due to such strains. The extent to which these cases are due to reactivation of infection acquired in the early 1990s, when transmission was widespread, as opposed to recent transmission, is not known. This poses a challenge for real-time investigation of genotype clusters, the purpose of which is to identify recent transmission and opportunities for intervention to prevent further spread. Cluster investigations are complex and resource intensive; information on timing and results of previous tuberculin skin tests and absence of overlapping stays among non-US-born patients can be useful for confirming or excluding recent transmission. However, the number of epidemiological links found between cases in this densely populated and mobile city is small. The presence of epidemiological links among cases in clusters of historical strains suggests that these clusters also require investigation for recent transmission.
Our experience in NYC has shown us that universal, real-time TB genotyping is feasible in a large urban centre [Reference Clark15]. Laboratory participation and coverage have been high; the median time from specimen collection to result is currently 39 days for spoligotyping and 68 days for IS6110 RFLP. The added value from universal genotyping was 57 additional links, 17 additional sites of transmission, and four additional investigations in congregate settings in which additional contacts and four secondary cases were identified. Length of unnecessary treatment decreased among patients with false-positive cultures. In addition, genotyping has allowed us to rule out transmission among TB cases that are clustered in place and time but have different genotypes. In such instances, smaller contact investigations can be conducted, rather than the larger case-finding efforts that would be required in settings with transmission confirmed by genotype [Reference Clark15].
There are groups in which genotype clustering continues to be high. US-born, homeless and substance-abusing patients had high rates of clustering, compared to patients without these characteristics. The number of homeless TB cases increased from 89 in 2001 to 109 in 2003 in NYC (data not shown). Separate investigations have shown evidence of increased transmission of TB in three residences for homeless individuals over the past 3 years: one a large 1001-bed shelter [36], another a facility for homeless persons, and a third a single-room-occupancy hotel (New York City Department of Health and Mental Hygiene, unpublished data).
In summary, despite declining TB incidence, TB transmission continues to occur in NYC among persons who are US-born, homeless or substance abusers. TB control efforts that focus on interrupting transmission in these groups are ongoing. Universal TB genotyping helped the TB control programme to better understand the dynamics of TB transmission in the city.
ACKNOWLEDGEMENTS
The authors gratefully acknowledge Thomas R. Navin, M.D. for critical review of the manuscript and Midelyn Montilla for assistance with manuscript preparation.
DECLARATION OF INTEREST
None.