Introduction
Intensity-modulated radiotherapy (IMRT) [1–5] has become popular in the radiation treatment of localised prostate cancer because it allows the tumour dose to be escalated while reducing normal tissue toxicity compared with conventional or three-dimensional conformal radiotherapy (3DCRT). Advances in treatment planning systems (TPSs) have revolutionised dose delivery techniques, with high-precision delineation of the gross tumour volume (GTV) and critical organs using computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET) and CT-PET hybrid systems. Modern TPSs can fuse CT images with either MRI [6] or PET scans, which gives the radiation oncologist critical information when contouring the target volumes; the accuracy of this contouring plays a major role in treatment planning and tumour dose escalation.
Achieving the best treatment plan, whether by 3DCRT or IMRT, depends on the tumour site, shape, size and extension, on the contouring of critical structures and on the dose constraints used for plan optimisation. However, volume delineation is influenced by various technical aspects, such as image resolution, choice of grey scale, CT slice thickness and use of contrast in the bladder and rectum [7]. Accurate contouring of the GTV [8, 9] is a fundamental prerequisite for any successful conformal therapy. Contouring variations can have major implications for defining appropriate margins for the planning target volume (PTV) and for delivering conformal treatment, especially in prostate cancer, where dose escalation is required. These variations can occur between different observers or with the same observer contouring at different times. Inter- and intra-observer variability of contoured volumes is a deciding factor in dose escalation, as narrow margins are applied during beam’s eye view (BEV) based conformal shaping of beams [10, 11]. Numerous studies have reported a correlation between dose–volume data and clinically observed complications [12–16]. Dose–volume data also depend on the contouring of the GTV and surrounding critical organs. The correlation between the actual clinical outcome and the tumour control and patient complications estimated by radiobiological models, such as the tumour control probability (TCP) and the normal tissue complication probability (NTCP) [12, 15, 17], likewise depends on the contouring of the tumour volume and the normal structures.
Hence, variation in contouring significantly influences the treatment planning and clinical outcomes.
The aim of this study was to evaluate the variation in the contouring of the GTV (prostate + seminal vesicles), bladder and rectum and its impact on dosimetric and radiobiological outcome in IMRT planning. We selected nine patients with localised prostate cancer. Contouring was done by four different observers with 5–15 years of clinical experience in radiotherapy or radiodiagnosis. A standard plan template was developed with seven non-opposing coplanar fields using a 6 MV photon beam (Figure 1).
Material and methods
Patients and immobilisation
Nine patients with localised prostate carcinoma were selected to compare the inter-observer variations in contouring and their effect on dosimetric and radiobiological outcome in IMRT planning. All patients had stage I disease (TNM classification). The mean age of the patients was 62 years. Before planning, all patients were immobilised in the supine position with a six-clamp thermoplastic immobilisation cast, ORFIT® (ORFIT Industries, Wijnegem, Belgium), mounted on a pelvic base plate.
Acquisition of CT scan
CT slices were acquired 24 hours after the ORFIT casts were made, to allow for set-up variations that might occur due to shrinkage of the cast. Tentative external fiducial markers (2 mm diameter lead balls) were fixed to the ORFIT cast on the anterior surface and the two lateral surfaces after matching with the sagittal and transverse lasers on the simulator-CT Phebus® (version 1.2, Mecaserto, France). CT scans were acquired on a flat table top with a multi-slice diagnostic CT scanner, LightSpeed® (GE Medical Systems, Milwaukee, WI, USA), which can scan four slices in a single rotation. The scans were acquired after matching all the fiducial markers, with the patient's arms resting on the chest. The patients were instructed to breathe normally during the scanning process. The slices were taken from the upper border of the L4 vertebral body to 3 cm below the level of the lesser trochanter of the femur. The slice thickness and spacing were 5 mm and the matrix size was 256 × 256 pixels. The acquired CT images were exported to the Eclipse® (version 7.3.10, Varian Medical Systems, Palo Alto, CA, USA) 3D TPS for contouring and planning. An average of 55 CT slices were taken to cover the whole bladder, the rectum and the volume outside these organs. This was done to collect adequate volumetric information so that the IMRT algorithm would not optimise incorrectly because of missing lateral scatter of the radiation beam.
Delineation tools and contouring
Before delineation of contours, a 3D image was created from the imported CT slices. The delineation tools were free hand drawing, live wire (auto Hounsfield unit search), post processing correction and paint brush. The main window showed the axial CT slice for delineation. Two side windows showed the coronal and sagittal reconstruction of the CT and the projected delineated contour in the axial slice.
GTV and normal tissue delineation
The four observers mentioned earlier were asked to delineate the GTV (prostate and seminal vesicles) and the normal tissue structures, using the contouring tools mentioned earlier and the window level setting on the CT slices. For most of the patients (7/9), the cranial extent of the rectum was taken just below the sigmoid flexure, wherever there was a good demarcation between the sigmoid flexure and the rectum. In the remaining patients (2/9), the exact junction was difficult to delineate, and the cranial limit was set at the L5–S1 vertebral level. The caudal extent was defined at the first CT slice above the anal verge. The craniocaudal limits were based on the anatomical definition.
Observers and PTV delineation
Observers A, B, C and D were a radiologist, a medical physicist, a radiation oncologist and a senior radiation oncologist, respectively. The contouring done by the radiologist (observer A) was taken as the gold standard against which the contours of the other observers were compared. The GTV was expanded to obtain the PTV (prostate + seminal vesicles + margin). The margins followed our institutional protocol for 3DCRT, that is, 1 cm in the transverse direction, 1 cm in the craniocaudal direction, 1 cm anteriorly and 0.6 cm posteriorly, the smaller posterior margin being used to reduce the dose to the rectum. Parameters such as dose–volume histograms (DVHs), mean dose, TCP and NTCP of the other plans were compared with those of the plan based on observer A's contours. The purpose of comparing TCP, NTCP, mean dose and conformity index (CI) with the gold standard plan was to analyse how inaccurate contouring leads to wrong values of dosimetric and radiobiological parameters. If the contoured PTV is larger than the actual one and IMRT optimisation is carried out, the TCP will be lower and the NTCP higher because the larger PTV overlaps the critical organs (bladder and rectum), and vice versa. If, on the other hand, IMRT planning is done with accurately contoured volumes, the TCP will be higher and the NTCP lower. We therefore analysed the impact of contouring variations, or inaccurate contouring, on dosimetric and radiobiological parameters relative to the accurate contouring (observer A).
IMRT planning and evaluation
An IMRT plan template with seven non-opposing coplanar fields using dynamic multileaf collimators (MLCs) was created to avoid variation due to field set-up, beam angle (Figure 1) and dose constraints. The beam data of a Varian CLINAC DHX (2300 CD) linear accelerator, with dual photon beam energies of 6 MV and 15 MV and a Millennium 80 MLC, were used. The Millennium 80 MLC system has 40 leaf pairs (40 leaves in each bank), and the MLC leaf width projected at the isocentre is 1 cm. The dose–volume optimisation (DVO-7234) algorithm, the IMRT optimisation algorithm of the Eclipse TPS, was used to generate the IMRT plans with the 6 MV photon beam. Table 1 shows the upper and lower dose constraints of the different organs and the PTV used for IMRT planning. An upper constraint specified that only the set fraction of the organ volume could exceed the set dose limit; for example, 20% of the bladder volume could receive a dose of >55 Gy. Depending on the optimisation result, minor adjustments were made to the upper dose constraints and volume limits for bladder and rectum. For all plans, a minimum of 200 iterations were performed. For all patients, the IMRT optimisation was done only for observer A (gold standard contoured data), and these optimised fluences were exported to the other observers' contoured data using the 'export fluence for verification plan' option in the TPS. For observers B, C and D, only the dose calculation was done, based on the imported fluences. Plans were then evaluated using cumulative DVH, CI, TCP, NTCP and the standard deviation (SD) of the contoured volumes. Statistically significant differences were assessed using the Wilcoxon matched pair test (p < 0.05). CI [18] analysis was performed to determine how tightly the 95% isodose line conformed to the PTV.
The CI was calculated as
$$\mathrm{CI} = \frac{V_{PTV95\%}}{V_{PTV}} \times \frac{V_{PTV95\%}}{V_{T}}$$
where V_PTV95% was the volume of PTV receiving 95% of the reference dose, V_PTV was the volume of PTV and V_T was the volume of tissue receiving 95% of the reference dose. For DVH analysis, the values of V20, V35, V50 and V70 (defined as the percentage of rectum or bladder volume receiving a dose of 20, 35, 50 and 70 Gy, respectively) were also measured.
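The two DVH-derived quantities defined above can be illustrated with a minimal Python sketch. It assumes the CI takes the conformation-number form of the cited reference, (V_PTV95%/V_PTV) × (V_PTV95%/V_T), and that the organ is available as per-voxel doses and volumes; the function names and the sample numbers are hypothetical and are not taken from the study or from the TPS.

```python
def conformity_index(v_ptv95, v_ptv, v_t):
    """Conformity index from three volumes given in the same unit (e.g. cm^3).

    v_ptv95: PTV volume receiving 95% of the reference dose
    v_ptv:   total PTV volume
    v_t:     total tissue volume receiving 95% of the reference dose
    """
    if v_ptv <= 0 or v_t <= 0:
        raise ValueError("V_PTV and V_T must be positive")
    return (v_ptv95 / v_ptv) * (v_ptv95 / v_t)


def volume_receiving_at_least(voxel_doses, voxel_volumes, threshold_gy):
    """Vx: percentage of the organ volume receiving at least threshold_gy."""
    total = sum(voxel_volumes)
    if total <= 0:
        raise ValueError("organ volume must be positive")
    covered = sum(v for d, v in zip(voxel_doses, voxel_volumes) if d >= threshold_gy)
    return 100.0 * covered / total


# Hypothetical example: 90% PTV coverage, 100 cm^3 PTV, 100 cm^3 inside the 95% isodose
ci = conformity_index(90.0, 100.0, 100.0)
# Hypothetical organ of four equal voxels; two of them receive at least 50 Gy
v50 = volume_receiving_at_least([10.0, 30.0, 60.0, 75.0], [1.0, 1.0, 1.0, 1.0], 50.0)
```

A CI of 1 corresponds to the 95% isodose volume coinciding exactly with the PTV; both under-coverage and spill into surrounding tissue lower the value.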
Comparison using radiobiological models
The TCP and the NTCP were calculated for all the patients. The TCP was defined by a Poisson statistics model and was written as
where
and
where v_j was the volume of the jth voxel in the target volume, the entire target being divided into k voxels (i.e., subvolumes), and d_j was the corresponding dose per fraction. N_c was the clonogenic cell density, α (the radiosensitivity parameter) was the coefficient of lethal damage and BED was the biologically effective dose of voxel v_j, assumed to be uniformly irradiated. For the calculations we took α = 0.1 Gy⁻¹ and α/β = 1.5 Gy.
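As an illustration of how a voxel-based Poisson TCP can be evaluated from the quantities defined above, the following Python sketch assumes one common form of the model: TCP = exp(−Σ_j N_c v_j exp(−α BED_j)), with BED_j = n d_j (1 + d_j/(α/β)) for n fractions. This form is an assumption and may differ in detail from the expressions used in the study; the number of fractions and the clonogen density are hypothetical inputs, and only α = 0.1 Gy⁻¹ and α/β = 1.5 Gy come from the text.

```python
import math


def bed(dose_per_fraction, n_fractions, alpha_beta):
    """Biologically effective dose (Gy) for n fractions of size d (assumed LQ form)."""
    return n_fractions * dose_per_fraction * (1.0 + dose_per_fraction / alpha_beta)


def poisson_tcp(voxels, n_fractions, clonogen_density, alpha=0.1, alpha_beta=1.5):
    """Poisson TCP for a target split into voxels.

    voxels: iterable of (volume_cm3, dose_per_fraction_gy) pairs, one per voxel,
            each voxel assumed uniformly irradiated
    clonogen_density: clonogens per cm^3 (hypothetical input, not given in the text)
    """
    expected_survivors = 0.0
    for volume, d in voxels:
        survival = math.exp(-alpha * bed(d, n_fractions, alpha_beta))
        expected_survivors += clonogen_density * volume * survival
    return math.exp(-expected_survivors)


# Hypothetical target of two voxels, one slightly under-dosed
tcp = poisson_tcp([(25.0, 2.0), (25.0, 1.9)], n_fractions=37, clonogen_density=1.0e5)
```

Because the surviving clonogens are summed over voxels before the final exponential, a small under-dosed subvolume can dominate the result, which is why contour-dependent cold spots affect the TCP.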
The NTCP was calculated by the Lyman model [19] for IMRT and 3DCRT techniques. Model parameters given by Burman et al. [20] and compiled by Emami et al. [21] for high-grade complications associated with partial or full organ irradiation were used. The values of n and m were 0.5 and 0.11 for bladder and 0.12 and 0.15 for rectum, respectively. TD50 was 80 Gy for both bladder and rectum. The NTCP was calculated for bladder and rectum using the equation
$$\mathrm{NTCP} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-x^{2}/2}\,dx, \qquad t = \frac{D - TD_{50}(v)}{m\,TD_{50}(v)}, \qquad TD_{50}(v) = TD_{50}(1)\,v^{-n}$$
where D is the dose delivered to the fractional organ volume v. Details are given in ref. 19.
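A minimal Python sketch of a Lyman-type NTCP calculation with the parameters quoted above follows. It uses the standard probit form with a power-law volume dependence, and it assumes that a non-uniform DVH is first reduced to an effective volume at the maximum dose; the text does not state which DVH reduction was used, so that step, the function names and the sample DVH are assumptions.

```python
import math

# Parameters quoted in the text: (n, m, TD50 in Gy)
LYMAN_PARAMS = {"bladder": (0.5, 0.11, 80.0), "rectum": (0.12, 0.15, 80.0)}


def effective_volume(dvh_bins, n):
    """Reduce a differential DVH to (v_eff, d_max) (assumed reduction scheme).

    dvh_bins: list of (fractional_volume, dose_gy); fractions sum to 1 and
              at least one bin must have a positive dose.
    """
    d_max = max(dose for _, dose in dvh_bins)
    v_eff = sum(vol * (dose / d_max) ** (1.0 / n) for vol, dose in dvh_bins)
    return v_eff, d_max


def lyman_ntcp(dose, v, td50, n, m):
    """Lyman NTCP for fractional volume v uniformly irradiated to dose (Gy)."""
    td50_v = td50 * v ** (-n)          # tolerance dose for partial volume v
    t = (dose - td50_v) / (m * td50_v)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))  # standard normal CDF at t


# Hypothetical two-bin rectum DVH: 30% of the volume at 70 Gy, 70% at 35 Gy
n, m, td50 = LYMAN_PARAMS["rectum"]
v_eff, d_max = effective_volume([(0.3, 70.0), (0.7, 35.0)], n)
ntcp_rectum = lyman_ntcp(d_max, v_eff, td50, n, m)
```

With the small n quoted for the rectum, the result is driven mainly by the high-dose part of the DVH, so a contour that shifts volume into or out of the high-dose region changes the NTCP more than the mean dose would suggest.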
Results
Inter-observer variations in contoured volumes
The volumes contoured by observers B, C and D were compared with those contoured by observer A. For the GTV, the maximum average variation in contoured volume was 3% (SD = 8.4), for observer B. The SD was highest (SD = 11.52) for observer C, whose average variation in contoured volume was 1.6%. Observer D had the least average variation in contoured GTV. For the bladder, the maximum average variation in contoured volume was 2.55% (SD = 4.12), for observer B, while observer C had the maximum SD (5.69) with an average variation of 2.2%. For the rectum, the average variation in contoured volume was highest (13.2%), with the maximum SD = 6.77, for observer C. Observer D had the least average variation, 6% (SD = 6.77), in contoured rectum volumes. When these variations were compared statistically, no significant differences were found in the contoured GTV, bladder and rectum volumes. Overall, observer D had the least average variation in the contoured volumes of GTV, bladder and rectum.
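The average variations and SDs reported above can be understood through a short Python sketch. It assumes that the variation is the signed percentage difference of each observer's volume from observer A's volume for the same patient, summarised by the mean and the sample SD across patients; the text does not specify signed versus absolute differences or the SD convention, and the volumes below are hypothetical.

```python
from statistics import mean, stdev


def percent_variation(reference_volumes, observer_volumes):
    """Per-patient percentage difference of an observer's volume from the reference."""
    return [
        100.0 * (obs - ref) / ref
        for ref, obs in zip(reference_volumes, observer_volumes)
    ]


def summarise_variation(reference_volumes, observer_volumes):
    """Mean and sample SD of the per-patient percentage differences."""
    diffs = percent_variation(reference_volumes, observer_volumes)
    return mean(diffs), stdev(diffs)


# Hypothetical GTV volumes (cm^3) for three patients: observer A vs. another observer
observer_a = [60.0, 80.0, 100.0]
observer_x = [63.0, 76.0, 104.0]
avg_variation, sd_variation = summarise_variation(observer_a, observer_x)
```

With signed differences, over- and under-contouring in different patients can cancel in the mean while still producing a large SD, which is one way a small average variation and a large SD can occur together.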
Inter-observer variations: DVH analysis
The average mean dose (D_mean) to the bladder was 33.49 Gy (SD = 14.49) for observer A. Observer C had the highest mean bladder dose, 34.30 Gy (SD = 14.04). Observer A had the highest average mean rectum dose (D_mean), 38.23 Gy (SD = 8.26). Figures 2–4 show the inter-observer variations in the dose received by the whole PTV or organ (bladder or rectum) volume in individual patients. The maximum differences were observed between observers A and C for all the dosimetric parameters under investigation for bladder and rectum (Tables 2 and 3).
Inter-observer variations in conformity indices
The average CI was highest (CI = 0.85) for observer A, because the IMRT plan optimisation was done on observer A's contours. Observer C had the lowest average CI, 0.72 (Figure 5). Observer D showed the least average variation in the contoured volumes of GTV, bladder and rectum and consequently had a higher CI (0.79) than observers B and C. Statistically significant differences from observer A were observed in CI for observers B, C and D (p = 0.008 for A–B, p = 0.007 for A–C, p = 0.008 for A–D).
Inter-observer variations in NTCP and TCP
The average NTCP for the bladder was 0.361% (SD = 0.036) for observer A, with the maximum NTCP = 0.47% (SD = 0.05) for observer C. No statistically significant inter-observer difference was found in NTCP for the bladder (p = 0.086, observers A–C) (Figure 6). For the rectum, the average NTCP for observer A was 1.49% (SD = 0.12). Observer B had the maximum average NTCP = 1.86% (SD = 0.12) (Figure 7). Statistically significant inter-observer differences were found in NTCP for the rectum (p = 0.04, observers A–B and p = 0.05, observers A–C). The average TCP was 99.94% (SD = 0.035) for observer A, whereas observer B had the lowest average TCP, 96.72% (SD = 3.737) (Figure 8). Average TCP values for observers D and C were 99.49% (SD = 0.83) and 97.01% (SD = 2.78), respectively. Statistically significant inter-observer differences were found in TCP for the prostate (p = 0.037, observers A–B and p = 0.01, observers A–C). The difference in TCP between observers A and D was not statistically significant (p = 0.065).
Discussion
In this study, the contouring done by the radiologist was considered the gold standard. We assumed that it was correct because the radiologist could interpret human anatomy better and was familiar with our contouring tools. The results supported this assumption: the results of the radiologist and the senior oncologist were not significantly different. To see the impact of contouring variations, or wrong contouring, on IMRT planning, standard IMRT plans were created using the gold standard contours for all patients. This gave reference values of CI, TCP, NTCP and the other parameters under investigation for comparison. The standard plans were then exported to the other observers' contoured sets and the same parameters were measured, showing how wrong contouring could lead to wrong values of CI, TCP and NTCP.
Various authors have investigated the potential impact of inter-observer variations in contouring the GTV and critical structures during the planning process [22–25]. Jones et al. [10] estimated an SD of inter-observer variation in BEV margins of ∼3 mm in the prostate. Lebesque et al. [26] showed a variation of 2.5–3% in rectum and bladder volumes and of 7–9% in rectum wall and bladder wall volumes within the same observer. They observed that inter-observer variability in contoured volumes was significant and a major variable under actual clinical conditions. Rasch et al. [27] reported significant systematic differences between two institutions in rectal volume and DVH due to non-uniform policies in the definition of the cranial limit. Seddon et al. [28] also reported much larger differences in rectal contouring by different observers.
In this study, large variations were observed in contoured GTV and rectum volumes. When we analysed the contoured patient data slice by slice in the transverse plane, we observed marked inter-observer variation along the craniocaudal direction for prostate and rectum. In many slices where the prostate was not clearly distinguishable, the overlap between rectum and PTV was large. It therefore appears that inter-observer contour variations correlate well with the radiological interpretation of human anatomy by the different observers. The impact of inter-observer variation on contoured volumes was much greater for rectum and PTV because it depended on the subjective judgement of the exact position of the sigmoid flexure; moreover, large CT slice spacing may increase the possibility of different definitions of the cranial borders. This holds true for the prostate as well, where it is difficult to delineate the caudal limit. The variation in the volume of the prostate may be due to the large CT slice thickness (5 mm), which results in partial volume effects, and to the difficulty of defining the prostate apex on CT images alone [29–34]. Bladder volume variations in contouring were not significant, as the bladder was accurately visualised by all the observers because of the well-demarcated fat planes between the prostate and the bladder base. Studies have shown that bladder contrast may even be detrimental, as it may limit proper visualisation of the anterior margin of the seminal vesicles [7]. Hence, bladder contrast is not recommended for IMRT planning purposes.
In this study, we observed a significant impact of inter-observer variations in contouring on dosimetric and radiobiological parameters for PTV, bladder and rectum. Further, for the same dose constraints, PTV coverage was higher and NTCP (bladder and rectum) was lower when the PTV was small and the bladder and rectal volumes were large. When bladder and rectum volumes were smaller, NTCP (bladder, rectum) was higher (P8, P9), but the inter-observer variation in NTCP was minimal. The impact of variations in contoured volumes can be correlated with CI, which depended strongly on the contoured volumes: the larger the variation in contoured volumes, the lower the CI. The least variation in contoured volumes was observed between observers A and D, and the differences between their calculated TCP and NTCP values were not statistically significant.
The inter-observer variations in contoured rectal volume show that the dose received by an organ depends mainly on its contoured volume (Figure 6). Ragazzi et al. [35] have suggested that a greater impact of organ motion on DVH and dose statistics is expected in patients with a full rectum at the time of CT simulation; in such cases, there is a high probability of a significant systematic difference between the rectum shape during simulation and during therapy, with poor correlation between the DVH calculated during planning and the true DVH. Meijer et al. [36] proposed the use of dose-wall histograms and/or normalised dose-surface histograms in place of the rectum DVH to avoid the impact of different rectum filling. However, exact delineation of the rectum wall itself is not easy, as most TPSs do not have this option.
The values of the dose–volume parameters (V20, V35, V50 and V70) depend on the total contoured volume as well as on the overlap with, or gap from, the PTV. As Table 2 shows, the volume contoured by observer C was smaller than that of observer A, but the mean dose and other related parameters were higher. A similar trend can be seen in Table 3 for observer B. Apart from their implications for the treatment planning process, these volumetric variations in the contours drawn by the various observers have a potential impact on the TCP and NTCP.
IMRT optimisation can nullify the effect of variations in the PTV on the PTV dose, but at the expense of higher rectum and bladder doses. Contouring larger PTVs defeats the purpose of IMRT, because dose escalation will not be possible owing to the surrounding critical structures. As IMRT is more conformal than 3DCRT, contouring inaccuracies that may have lesser implications in 3DCRT can have a major impact on IMRT planning and treatment outcome.
Based on this study and previous studies [7, 30, 33, 35], a few measures can be taken to reduce the inter-observer variations in delineating the tumour and critical organs. In this study we observed the least variation in contoured volumes between the radiologist (observer A) and the senior oncologist (observer D), so in new radiotherapy centres that are about to start 3DCRT and IMRT, guidance from a radiologist can reduce the variation in contoured volumes. To improve the delineation accuracy of the prostate apex, a urethrogram and MRI or CT-MRI image fusion should be used. A smaller CT slice thickness (≤5 mm) should be used to increase contouring accuracy and reduce variations in contoured organ volume. The upper and lower border limits of the rectum should be standardised based on anatomical bony landmarks. If possible, the bladder should be filled to the maximum comfort level of the patient. A full bladder not only reduces the bladder mean dose but also pushes the small bowel away from the radiation field.
Conclusions
The accurate contouring of target and surrounding critical structures is the base of any successful conformal radiotherapy treatment. Accurate contouring may result not only in improved treatment outcome in terms of disease free survival but also in decreased morbidity by reducing radiation induced damage to surrounding critical structures.
To the best of our knowledge, this is the first report concerning the impact of inter-observer contouring variability on dosimetric and radiobiological parameters simultaneously for bladder, rectum and PTV in IMRT treatment planning. The maximum variations were observed in the craniocaudal direction for rectum and prostate; hence, special care should be taken while contouring these limits. We acknowledge that the number of patients was small; nevertheless, the study has brought out the importance of accurate contouring in IMRT planning. We suggest large multicentric studies to validate the impact of these variations on TCP and NTCP and to correlate them with clinically observed results. Such studies could gather a significant number of patients, and the collection of heterogeneous data would give information on a large spectrum of dose–volume combinations suitable for modelling purposes.