About the lead author:
Gustavo Olivera, PhD
21st Century Oncology ATD
Madison, WI 53719
Matthew Abramowitz, MD
Assistant Professor, Co-Chairperson Genitourinary Site Disease Group, Director of Compliance, Department of Radiation Oncology, Sylvester Comprehensive Cancer Center University of Miami
Why is this article contemporary?
Rapid advances in the technological development of precise and conformal radiation therapy and its incorporation into the clinic have generally outpaced our ability to test these technologies in a clinical trial format. With improvements in conformity utilizing intensity modulation and set-up accuracy utilizing stereotactic techniques and tumor motion management, margins continue to shrink. We hope this will translate into improved patient outcomes by allowing higher tumor doses and decreased morbidity by improved avoidance of critical structures.
However, these advances create new questions. Respiratory motion management based upon modeling and fiducial tracking assumes minimal changes in the patient’s respiratory cycle, both during a single fraction and over a course of therapy. In addition, it remains unclear how changes in body shape attributed to weight loss, bowel gas, or slight inter- or intra-fraction differences in patient position affect the delivered dose when using these new modalities.
This study is contemporary in how it utilizes existing resources in a massive data set to evaluate the implications of these changes measured in delivered dose to patients. With the growing integration of adaptive dose recalculation, identification and correction of these issues may now be possible.
Background and Purpose:
In vivo dosimetry and verification (IVV) in conjunction with adaptive dose recalculation (ADR) is a synergistic set of processes that provides insight into actual treatment. Our intent was to test an automatic, multicenter procedure for IVV and ADR for all patients and all fractions treated on 14 helical TomoTherapy units across the United States. Our secondary goal was to create a system with metrics to flag possible issues, establish trending, and determine possible clinical impact. Establishing this system could provide both internal recommendations for daily IGRT and clinical improvements based on IVV and ADR findings. Our final goal was to evaluate deviations of the cumulative dose at the end of treatment using Quantitative Analyses of Normal Tissue Effects in the Clinic (QUANTEC) recommendations for organs at risk.
Materials and Methods:
A system for IVV and ADR that retrieves and processes machine and patient information during treatment was created. The IVV portion includes (1) checking the consistency of values from machine encoders and (2) using the imaging detector data to compare a reference fraction with daily treatment deliveries by means of the gamma metric. The ADR component computes daily and cumulative doses; DVH data are compared with plan data and passed to the flagging system. Thus, a reviewer can (1) identify machine, setup, and/or anatomical issues and (2) infer possible clinical impact.
Results and Conclusions:
Across multicenter clinics (n=14), 153,330 IVV and 66,294 ADR fractions were analyzed. The extent of in vivo flags was independent of an individual clinic’s volume. The number of in vivo flags decreased considerably as a function of the length of time that the system was used. This was accomplished by tailoring IGRT procedures to specific anatomical sites and specific patients. With respect to disease site, 5% of all prostate treatments and more than 20% of head and neck treatments triggered some IVV action level. ADR results demonstrated that cumulative doses at the end of treatment for head and neck patients exceeded QUANTEC limits for 9% of parotid glands and 2% of larynxes. Following treatment for breast cancer, approximately 10.5% of patients exceeded QUANTEC limits for lung and 3% for heart at the end of treatment. Thus, these data suggest ADR and IVV are a synergistic set of processes that allows flagging and quantifying potential clinical impact to analyze dosimetric information on patient registries.
Radiotherapy has evolved from three-dimensional conformal radiotherapy (3D), to intensity modulated radiotherapy (IMRT)1,2 and many forms of image-guided radiotherapy (IGRT).1 Comprehensive quality assurance (QA) procedures are evolving as the technology evolves.3-8 In IMRT, the typical QA procedures involve the use of phantom measurements or the delivery of a plan to a portal imager.9 IMRT QA is typically performed before clinical treatment.
Verification of what happens during actual treatment delivery may provide significant insight into the actual course of treatment. Possible issues that may go undetected with QA before treatment can be found using IVV.10,11
Indeed, it is currently possible for the user not only to access the imaging detector information, but also to gather machine sensor and monitor chamber readings and daily CTs from the database or logs to create a more comprehensive IVV program. The TomoTherapy archive contains information such as couch encoders, monitor chambers, daily megavolt computed tomography (MVCT) and other sensors that provide key information during treatment. This information may be paramount to discern whether inaccuracies stem from machine, setup, or anatomical changes.
Herein, an approach has been employed to combine the in-vivo dosimetry with other types of in-vivo verification (IVV). We attempted to optimize issues related to robustness and specificity of in vivo dosimetry by using information from adaptive dose recalculation (ADR). Hence, the ADR is able to compute daily doses, cumulative doses, and dose-volume histograms (DVHs) using the daily CT and machine information.
Thus, we generated a patient verification process that uses both IV and ADR as synergistic tools to analyze 153,330 IV fractions and 66,294 ADR fractions (corresponding to 3,687 ADR patients).
Workflow and System Description
The verification system presented has two major components: the IVV portion and the ADR. The IVV was deployed in advance of the ADR system. Approximately 87,000 fractions over 14 clinics were generated and processed with only the IVV system. The remaining fractions (~66,000), which correspond to approximately 3,600 patients, were processed using IV and ADR.
Information was gathered from patient archives for both processes. The TomoTherapy machine digitally records information such as couch position, daily CT, and machine output from its imaging detectors, and stores it in the patient archive. The process begins by analyzing the patient archive information. The only human intervention needed to generate results is using the TomoTherapy patient data management system to archive treated patients twice per day. Our system automatically processes the data, generates daily and weekly reports, and sends an e-mail notification to designated personnel if a flag is out of tolerance. The reports are web based and can be accessed from any authorized computer. The flagging system is an important component to aid in the identification of possible issues and will be described later in the manuscript. Most of the processing time consists of parsing the data and generating movies used to analyze the data. The report could be generated in a few minutes (per patient, per fraction) if there were a mechanism to retrieve the new information immediately after each fraction without the need to archive the patient data.
The IVV and ADR systems are transparent to clinic staff and add minimal time to daily patient treatment operations. IVV provides flags related to treatment consistency using exit detectors and machine encoders. ADR computes daily and cumulative doses and DVHs to analyze possible daily and cumulative clinical impact. A more detailed description of IVV and ADR can be found in the Supplemental Materials and Methods available in the online version of the manuscript.
Gamma Flagging and Number of Patients per Day
To evaluate the impact of clinical load on differences between planned and delivered doses, we examined the relationship between clinical volume and gamma flagging. Figure 1a represents the number of fractions for each of the 14 participating clinics from the beginning of the implementation of the IVV program until the end of May 2012 (10 months). In total, 42,866 fractions from this period were used to answer the question of whether clinics that treat more patients per day have more in-vivo dosimetry flags. As shown in Figure 1b, in which each index (1 to 14) represents a clinic, there is hardly any correlation between patient load and the percentage of yellow and red flags (correlation coefficient -0.14). In fact, the busiest clinic is the one with the lowest fraction of yellow and red flags. Therefore, dose differences between planned and delivered treatments were not correlated with patient volume in each clinic.
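The volume-versus-flagging comparison above can be illustrated with a minimal sketch of a Pearson correlation between per-clinic fraction counts and the percentage of yellow and red flags. The function name and the per-clinic numbers below are hypothetical placeholders, not the study data, and the text does not state which correlation coefficient was used; Pearson is assumed here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-clinic values (illustration only, not the study data).
fractions_per_clinic = [1200, 2500, 3100, 4000, 5200]
flag_percent = [31.0, 28.5, 35.0, 30.0, 27.0]

r = pearson_r(fractions_per_clinic, flag_percent)
print(f"correlation coefficient: {r:.2f}")
```

A coefficient near zero, such as the -0.14 reported for the 14 clinics, indicates that flag percentage does not track clinic volume in any consistent direction.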
Flagging Trend as Function of Time of Use
The reduction of the number of flags as a function of time was analyzed for the first 10 months that the IVV system was used. Figure 2 shows the reduction of red flags as a function of time after the clinics actively started the IVV program. The trend includes translational (Trans Red) and rotational offsets (Rot Red), kV-MV similarity metric (Sim Red), machine output related (Outp Red, Var Red) and exit dosimetry gamma (Gamma Red). A decrease in gamma flagging of 7% was observed after 10 months.
IVV Segmented by Anatomical Site and Couch Setup Verification
Several studies have reported in vivo dosimetry results for different anatomical sites.12-22 Figure 3a shows the number of cases for each anatomical site used in this study. The total number of fractions analyzed was 153,330. The largest numbers correspond to prostate, head and neck, breast, pelvis and lung. Figure 3b shows the fraction of gamma green, yellow, and red flags for each of the anatomical sites considered. Prostate and brain had the lowest yellow and red flags. For prostate, the level of red flags is under 5%, while red and yellow flags combined are on the order of 10%. Pelvis cases show similar behavior to prostate, but with a slightly higher occurrence of combined red and yellow gamma flags. Some pelvic cases involve very long fields with inguinal nodes and a large area of the femur treated at once. The gamma failures for these cases mostly relate to contralateral leg placement after the beam exits the target. In cases within the regions of head and neck, bone and connective tissue, mediastinum, lymphatic, skin, abdomen, and breast and lung, the red flags are between 10% and 20% and the combined yellow and red flag components are between 40% and 50%. The anatomical site information was determined using the ICD-9 and ICD-10 codes as reference.
Figure 3c shows a per-clinic tally of cases in which the offset indicated by the image registration differed by more than 3 mm from the setup offset actually recorded by the couch encoders. A difference of this type corresponds either to a decision not to move the patient or to the therapist forgetting to move the patient after accepting the image registration. When this happens, the physician receives that information in the daily IGRT report, which is e-mailed and requires the physician's approval. This verification uses the couch encoders to confirm that the actual offset of the couch is the same as the one accepted during the patient registration process. A total of 85 such events were found in 14 clinics in 153,330 delivered fractions during a period of approximately 3 years. This corresponds to a rate of 0.0013% per clinic per year.
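The couch setup check and the quoted rate can be sketched as follows. This is a minimal illustration, not the system's implementation: the function name is hypothetical, and comparing each axis separately against the 3 mm limit is an assumption, since the text states only the limit itself.

```python
def couch_offset_mismatch(registration_mm, encoder_mm, tolerance_mm=3.0):
    """Return True if any axis of the accepted registration offset differs
    from the couch encoder offset by more than tolerance_mm.

    Per-axis comparison is an assumption; the text gives only the 3 mm limit.
    """
    if len(registration_mm) != len(encoder_mm):
        raise ValueError("offset vectors must have the same number of axes")
    return any(abs(r - e) > tolerance_mm
               for r, e in zip(registration_mm, encoder_mm))

# Registration accepted a 5 mm lateral shift, but the couch was never moved.
print(couch_offset_mismatch((5.0, 0.0, 1.0), (0.0, 0.0, 1.0)))  # True

# Event rate computed from the figures given in the text.
events, fractions, clinics, years = 85, 153_330, 14, 3
rate_percent = 100.0 * events / fractions / clinics / years
print(f"{rate_percent:.4f}% per clinic per year")  # 0.0013% per clinic per year
```

The arithmetic shows how the quoted rate is obtained: 85 events in 153,330 fractions is about 0.055% of all fractions, which is then divided by 14 clinics and 3 years.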
Plan Dose and Cumulative Dose
Figure 4 shows the planned doses and the cumulative doses at the end of treatment, computed by the ADR, for the organs at risk. This corresponds to 66,294 fractions in 3,687 patients. For each anatomical site, Figures 4a-h display the percentage of patients for whom a particular organ at risk violated the QUANTEC criteria either during planning (blue column) or as cumulative dose at the end of the treatment (red column). No plan adaptation or modification had been performed in any of these cases during the course of treatment. Therefore, these data establish a reference for possible improvements that can be obtained either by adaptation or by other improvements with respect to the current state of technology and treatment delivery.
In our head and neck results (Figure 4a), approximately 21% of the patients deviated from the QUANTEC criteria for the parotids on the original plan. We observed that in 9% of the patients the cumulative dose at the end of the treatment for the parotids deviated from the QUANTEC criteria, comparable to what has been previously reported.23-25 The cumulative dose at the end of treatment to the larynx exceeded the tolerance for approximately 2% of the patients. For breast cases (Figure 4b), the regions at risk that violated plan constraints were lung and heart for 10% and 3.8% of the patients, respectively. Analyzing the cumulative dose at the end of treatment for breast cases, the regions at risk that violated cumulative dose constraints were lung and heart for 10.5% and 3% of the patients, respectively.
In-vivo verification
In vivo verification actions are triggered by a flag system. A table is first presented to the user with the 6 flag values for each particular fraction, as well as tools to verify the adequacy of the patient image registration. These metrics include the three degrees of translation (“Translation”); the roll correction offset (“Rotation”); a similarity metric between the kV and MV CTs (“Similarity”); the mean machine output (“OutputMean”); the variance of the machine output (“OutputVar”); and a gamma value (“Gamma”)26 between the reference imaging detector signal (or reference sinogram27,28) and a particular fraction imaging detector signal (or fraction sinogram). Each of the flags is color-coded as green, yellow, or red.
The similarity metric is a cross correlation between the planning CT and the daily MVCT. The calculation is performed for the region where MVCT is available. The default flags for gamma values between the reference and fraction imaging detector signals are: “Green” if more than 97.5% of the points are within 3% and 3mm, “Yellow” if more than 95% and less than 97.5% of the points are within 3% and 3mm, and “Red” if less than 95% of the points are within 3% and 3mm; these parameters are also user definable.
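The default gamma thresholds can be expressed as a small classification function. This is a minimal sketch with a hypothetical function name; the text does not say how pass rates of exactly 95% or 97.5% are treated, so assigning them to "Yellow" is an assumption.

```python
def gamma_flag(pass_percent, green_above=97.5, red_below=95.0):
    """Map a gamma pass rate (percent of points within 3%/3 mm) to a flag color.

    Thresholds default to the values in the text and can be overridden,
    mirroring the user-definable parameters. Values exactly on a threshold
    are returned as "Yellow" (an assumption, not stated in the text).
    """
    if not 0.0 <= pass_percent <= 100.0:
        raise ValueError("pass_percent must be between 0 and 100")
    if pass_percent > green_above:
        return "Green"
    if pass_percent < red_below:
        return "Red"
    return "Yellow"

for rate in (99.1, 96.0, 91.3):
    print(rate, gamma_flag(rate))
```

Keeping the thresholds as parameters makes it straightforward to apply site-specific action levels without changing the classification logic.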
The imaging (“exit”) detector signal is used to analyze the dosimetric consistency resulting from the compound effect of patient setup, anatomical changes, machine output, and leaf behavior during each treatment. To do that, an exit signal that is chosen to represent the treatment plan is compared to the exit signal of each other daily fraction. This reference signal is used as a surrogate for the signal that the imaging detector should see if the plan is delivered with no uncertainties. This approach will be valid provided that the setup for that reference fraction is adequate and that the anatomical changes between the plan and actual fraction are small.
To compare the reference imaging detector signal and the daily fraction imaging detector signal, the gamma metric is used. The gamma metric is additionally used in two other fashions: per beam (angle) direction and per slice (longitudinal) direction. The TomoTherapy plan is generated using 51 projections. The gamma per beam direction is equivalent to comparing portals on the 51 directions of the delivery for the whole length of the treatment. Because TomoTherapy delivers treatment by slice, it is also useful to determine a gamma value per slice. In the report, a sagittal view of the planning CT provides a key to the anatomical region for the corresponding gamma values per slice. The gamma per slice is particularly useful for identifying the inferior-superior regions that may need more attention because of poor gamma consistency. Using this information, the daily IGRT procedures can be tailored to acquire the daily CT over the portions that may be a problem. This feature was used on many occasions during the course of this work by changing or expanding the region where the CT information was available, allowing ADR to be used to evaluate the possible clinical impact on those regions.
Because users lack access to the post-treatment pulse-by-pulse sinogram data, it was necessary to create a process to select a reference fraction to be used as the plan reference. Before the implementation of the in vivo verification program, pretreatment MV CTs did not always cover the entire length of the target. We therefore decided to implement a pretreatment CT scan that covers the entire length of the target (or as much as possible) during the first two fractions. After the registration at the time of treatment has been verified in detail for those two fractions, and the similarity of the anatomy between the planning CT and daily CT has been checked, one of those fractions is chosen as the reference fraction if the information is consistent. If the first two fractions are not consistent, there are procedures to evaluate possible causes of the inconsistency and continue the process of selecting a reference fraction. The adaptive dose recalculation portion can also be used to validate the reference fraction. As will be described in the next section, ADR allows the computation of daily dose distributions and DVHs and compares them to the planned ones. Therefore, ADR is also a very valuable tool to evaluate and validate the chosen reference fraction.
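The per-beam and per-slice views of the gamma metric amount to aggregating the same gamma map along two different axes. The sketch below assumes a hypothetical layout in which `gamma_map[s][a]` holds the gamma values of all detector channels for slice `s` and projection angle `a`, and counts a point as passing when gamma is at most 1. Treating each gantry rotation as one slice is a simplification for illustration.

```python
def pass_rates(gamma_map):
    """Return (per_slice, per_angle) gamma pass percentages.

    gamma_map[s][a] is a list of gamma values (one per detector channel)
    for slice s and projection angle a; this layout is an assumption.
    """
    n_slices = len(gamma_map)
    n_angles = len(gamma_map[0])
    slice_counts = [[0, 0] for _ in range(n_slices)]  # [passed, total]
    angle_counts = [[0, 0] for _ in range(n_angles)]
    for s, row in enumerate(gamma_map):
        for a, values in enumerate(row):
            passed = sum(1 for g in values if g <= 1.0)
            for counts in (slice_counts[s], angle_counts[a]):
                counts[0] += passed
                counts[1] += len(values)

    def percent(counts):
        return 100.0 * counts[0] / counts[1] if counts[1] else float("nan")

    return ([percent(c) for c in slice_counts],
            [percent(c) for c in angle_counts])

# Toy map: 2 slices x 3 angles x 2 channels (a real plan uses 51 angles).
toy = [
    [[0.2, 0.4], [0.5, 0.9], [0.3, 0.1]],
    [[1.4, 0.8], [1.2, 1.1], [0.6, 0.7]],
]
per_slice, per_angle = pass_rates(toy)
print(per_slice)  # [100.0, 50.0]
print(per_angle)  # [75.0, 50.0, 100.0]
```

In this toy example the low pass rate of the second slice points to a longitudinal region that may need extended daily imaging, while the per-angle values show which beam directions contribute most to the failures.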
The exit sinogram for the reference fraction is then normalized to the machine output (using the reading of the monitor chamber) to avoid any bias due to output changes. After the reference fraction selection, the output-normalized reference signal is compared with each daily imaging detector signal to assess the daily dosimetric delivery consistency.
As implemented, the in vivo verification requires an authorized user to approve the reference fraction. After several fractions are delivered, should the analysis indicate that a new reference fraction is needed, there is a process for evaluating and setting a new reference fraction. In the near future, a reference fraction based on the prediction of the detector signal, using a database of predicted signals as a function of patient thickness and patient-to-detector distance, will be incorporated using the algorithm described by Kapatoes et al.29
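The output normalization can be sketched as a per-projection division of the detector signal by the monitor chamber reading. The function name and the assumption of one monitor chamber reading per projection are hypothetical; the text states only that the monitor chamber reading is used.

```python
def normalize_to_output(sinogram, monitor_readings):
    """Divide each projection's detector signal by its monitor chamber reading.

    sinogram: list of projections, each a list of detector channel signals.
    monitor_readings: one reading per projection (an assumed granularity).
    """
    if len(sinogram) != len(monitor_readings):
        raise ValueError("need one monitor reading per projection")
    normalized = []
    for projection, reading in zip(sinogram, monitor_readings):
        if reading <= 0:
            raise ValueError("monitor chamber reading must be positive")
        normalized.append([signal / reading for signal in projection])
    return normalized

# Same attenuation, different machine output: the normalized signals agree.
reference = normalize_to_output([[2.0, 4.0]], [2.0])
fraction = normalize_to_output([[3.0, 6.0]], [3.0])
print(reference == fraction)  # True
```

With this normalization, a pure output fluctuation does not register as a change in the exit signal, so remaining differences reflect setup, anatomy, or leaf behavior.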
Adaptive Dose Recalculation
Sometimes it may be necessary to understand the clinical impact of the treatment evolution as well as of the in-vivo verification flags.30,31 For that task we developed the adaptive dose recalculation process. The program is completely automatic and gathers the information from the same archives that the in-vivo verification does. The program extracts the daily MVCT and creates a merged CT at the location where the patient was treated while taking into account the image registration used for treatment. A merged image is a CT that contains the complete MVCT that was imaged, with the planning kV CT filling in the rest of the image. This merged CT is used to compute the fraction dose. The dose calculator is a convolution superposition algorithm32-34 developed in house. To be more efficient, our implementation is GPU based. A deformable registration algorithm based on morphons35,36 was used to generate daily contours and analyze daily and cumulative DVHs. The deformable registration also provides deformation maps that allow mapping the daily doses back to the planning CT in order to compare the planned dose with the cumulative actual delivery. A GUI allows the user to analyze the registration, contours, and daily and cumulative doses. When a structure is shown with a colored background, a cumulative flag is present. When no colored background appears for a structure, only daily fraction flags were present. A different background color is used when the original plan violated some of the constraints. ADR also helps to assess the specificity and robustness of the in vivo verification in its ability to flag clinically relevant issues.30,31
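The merged image described above can be sketched as a voxel-wise selection between the two scans. This is a minimal illustration with a hypothetical function name; it assumes both images have already been registered and resampled onto the same voxel grid (flattened to a list here), with `None` marking voxels the MVCT did not cover.

```python
def merge_ct(kvct, mvct):
    """Build a merged CT: MVCT values where imaged, planning kVCT elsewhere.

    Both inputs are flattened voxel lists on a common grid (an assumption);
    a None entry in mvct means that voxel lies outside the daily scan.
    """
    if len(kvct) != len(mvct):
        raise ValueError("images must share the same voxel grid")
    return [kv if mv is None else mv for kv, mv in zip(kvct, mvct)]

planning_kvct = [10, 20, 30, 40]
daily_mvct = [None, 21, 29, None]  # daily scan covers only the middle voxels
print(merge_ct(planning_kvct, daily_mvct))  # [10, 21, 29, 40]
```

The merged result carries the daily anatomy wherever it was imaged, which is why extending the scanned length increases the region over which the recalculated dose reflects the anatomy of the day.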
The ADR flagging system has a daily dose and a cumulative dose component. The cumulative dose flagging system gets the information about the anatomical site from the ICD-9 and ICD-10 codes. Anatomical sites are grouped into the following categories: head and neck, breast, brain, prostate, sarcoma, lung, pelvis, spine, abdomen and others. For each anatomical site and each particular fractionation scheme, a set of tolerances is chosen. These tolerances are used to check the deviations of the cumulative dose with respect to the fraction pro-rated planning dose. The tolerances for organs at risk are based on QUANTEC,37 RTOG (RTOG 0615, 0837, 0920, 0617, 0619, 0848, 0630, 0815, 0926, 0921, 0822, 0529, 0813, 0915) and internal 21st Century Oncology recommendations. For targets, D-max, D-%, D-mean, Conformity Index (CI), Homogeneity Index (HI), and Total Volume Coverage (TCV) criteria based on ICRU 83 and internal 21st Century Oncology38 recommendations are used. The daily dose flagging is triggered by the percentage that D-max, D-mean, D-%, CI, HI, and TCV deviate for each structure with respect to the plan and is based on internal 21st Century Oncology recommendations.
The ADR can also be used to validate the adequacy of the reference fraction used for in-vivo verification. By analyzing the patient registration, daily doses, and cumulative doses and DVHs, the reference fraction can be verified as a good surrogate for the patient plan.
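The comparison against the fraction pro-rated planning dose can be sketched as follows. The function name, the example doses, and the 5% tolerance are hypothetical; the actual tolerances depend on anatomical site and fractionation and are not listed in the text.

```python
def cumulative_dose_check(cumulative_gy, planned_total_gy,
                          fractions_delivered, fractions_planned,
                          tolerance_percent):
    """Compare a cumulative dose metric with the pro-rated planned value.

    Returns (flagged, deviation_percent). The planned metric is scaled by
    the share of fractions delivered so far.
    """
    if not 0 < fractions_delivered <= fractions_planned:
        raise ValueError("fractions_delivered must be in 1..fractions_planned")
    prorated_gy = planned_total_gy * fractions_delivered / fractions_planned
    deviation_percent = 100.0 * (cumulative_gy - prorated_gy) / prorated_gy
    return abs(deviation_percent) > tolerance_percent, deviation_percent

# Hypothetical example: planned mean dose of 24 Gy over 30 fractions,
# 8.8 Gy accumulated after 10 fractions, 5% tolerance.
flagged, deviation = cumulative_dose_check(8.8, 24.0, 10, 30, 5.0)
print(flagged, round(deviation, 1))  # True 10.0
```

Pro-rating lets a deviation be flagged partway through treatment, when the cumulative dose is still far below the end-of-treatment limit but is already trending above the plan.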
In this study, we analyzed 153,330 IV and 66,294 ADR fractions across 14 multicenter clinics to determine the effect of various clinical factors on planned and delivered doses. In one of the largest analyses of its kind, we demonstrate that the degree of in vivo gamma flagging was independent of an individual clinic’s volume (Figure 1).
Indeed, the use of IV in our hands in conjunction with ADR appears to provide insight into actual treatment. In this large data set, our automatic, multicenter procedure for IV and ADR for all patients and all fractions treated across the United States moves from the theoretical to the contemporary. Our work suggests that this system provides internal recommendations for daily IGRT and clinical improvements using IV and ADR findings. Thus, we successfully created a system with metrics to flag for possible issues, establish trending, and determine possible clinical impact. We were also able to evaluate deviations of the cumulative dose at the end of the treatment using QUANTEC recommendations for organs at risk.
Interestingly, the extent of in vivo flags was independent of clinic volume. The number of in vivo flags decreased considerably as a function of the length of time that the system was used. We were able to tailor IGRT procedures to specific anatomical sites and specific patients based on the information provided by the system. Again, our findings suggest 5% of all prostate treatments and more than 20% of head and neck treatments triggered some IV action level. It appears that, for certain anatomical sites, long CTs may not reduce the number of IVV flags, but in vivo dosimetry still indicates possible problem locations. ADR results demonstrated that cumulative doses at the end of treatment for head and neck patients exceeded QUANTEC limits for 9% of parotid glands and 2% of larynxes. It is also instructive that, among breast cancer patients, approximately 10.5% exceeded QUANTEC limits for lung and 3% for heart at the end of treatment.
Finally, these data help establish ADR and IV as a synergistic set of processes that allows flagging and quantifying potential clinical impact to analyze dosimetric information on patient registries. The implications for further prospective studies, or for integrating focused retrospective studies on a large scale, should not be overlooked.
This study represents one of the largest multicenter experiences with the use of an IVV system coupled with an ADR system. In the spirit of “big data,” a total of 153,330 IVV fractions and 66,294 ADR fractions, corresponding to 3,687 ADR patients, were analyzed and interpreted in a multicenter study. Data generated by IVV provide a useful system of flags to be used in conjunction with an ADR program. Taken together, these tools can identify potential issues with treatment delivery that may otherwise go undetected.
About the Authors
21st Century Oncology (GO, XM, DP, SK, CM, EF, DD, LK, AM, SEF, DG) and Yale Medical School (AD).
Address correspondence to:
Gustavo Olivera, 21st Century Oncology ATD; 555 D’onofrio Dr Suite 104 Madison WI 53719. Tel: 608 332 6274 Fax: 608 332 6274 E-mail: Gustavo.Olivera@21co.com.
Conflicts of interest: None.