Andrew J. Vickers, PhD
Current guidelines from the National Comprehensive Cancer Network (NCCN) recommend that patients with oropharyngeal squamous cell carcinoma (OPSCC) receive surgery or radiation therapy delivered with or without chemotherapy. Although minimally invasive techniques have made surgery a more attractive option than it was previously, particularly for younger patients, the question of whether adjuvant chemoradiation therapy (CRT) should be administered with either front-line approach is unsettled.1,2
Like clinicians elsewhere, investigators from the University of Colorado, Denver, struggled not only to predict the likelihood that a particular patient would do well with surgery alone but also to convey their estimates in terms that patients would understand. To solve this dilemma, they collected clinical factors about patients with OPSCC from the National Cancer Database to build a nomogram that can be used preoperatively to help determine the need for adjuvant CRT. The investigators used data from 5065 patients to identify the preoperative variables that best predicted 2 high-risk pathological features: extracapsular extensions in the regional lymph nodes and/or positive surgical margins.
They then used a multivariable regression analysis to create a formula for combining those variables to quantify risk. Finally, they rendered that formula as a nomogram, which in this case is a graphic that illustrates relationships in complex equations so that patients and doctors alike can understand these relationships and make informed treatment decisions.2 In doing so, the investigators joined a growing number of other researchers who are developing nomograms that assess individualized risks in a broad range of clinical settings and tumor types. Despite several limitations, oncology nomograms have been growing in number and gaining in popularity.
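To make the step from regression formula to nomogram concrete, here is a minimal sketch of one common way to turn logistic-regression coefficients into a points scale. The intercept, predictor names, coefficients, and value ranges below are hypothetical illustrations; they are not taken from the OPSCC study, and the study's own scaling may differ.

```python
import math

# Hypothetical logistic-regression model (NOT the OPSCC study's coefficients).
# Each predictor maps to (coefficient per unit, minimum value, maximum value).
INTERCEPT = -4.0
PREDICTORS = {
    "age_years": (0.03, 40.0, 90.0),
    "tumor_size_cm": (0.25, 0.0, 8.0),
    "positive_nodes": (0.40, 0.0, 5.0),
}


def reference(coef, low, high):
    """Value of a predictor that earns zero points (its lowest-risk end)."""
    return low if coef >= 0 else high


def max_span():
    """Largest change in log-odds that any single predictor can produce."""
    return max(abs(b) * (hi - lo) for b, lo, hi in PREDICTORS.values())


def points(name, value):
    """Points on a 0-100 scale; the most influential predictor tops out at 100."""
    b, lo, hi = PREDICTORS[name]
    return 100.0 * b * (value - reference(b, lo, hi)) / max_span()


def predicted_risk(patient):
    """Add up the points, convert them back to log-odds, apply the logistic function."""
    total = sum(points(name, value) for name, value in patient.items())
    baseline = INTERCEPT + sum(b * reference(b, lo, hi) for b, lo, hi in PREDICTORS.values())
    log_odds = baseline + total * max_span() / 100.0
    return total, 1.0 / (1.0 + math.exp(-log_odds))


patient = {"age_years": 60.0, "tumor_size_cm": 4.0, "positive_nodes": 2.0}
total_points, risk = predicted_risk(patient)
print(f"total points: {total_points:.0f}, predicted risk: {risk:.1%}")
```

A printed nomogram draws each `points` function as a ruler and the final conversion as a "total points" ruler aligned with a probability axis, so a reader performs the same arithmetic by eye.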
Of the total 2895 articles with the words nomogram and cancer that are indexed on PubMed, more than half have appeared in just the past 5 years; in the past 2 months alone, studies involving at least half a dozen new oncology-related nomograms have been published. The major appeal of nomograms is the promise of greater predictive accuracy than can be achieved with conventional tools such as the tumor–node–metastasis (TNM) staging system, which dates to 1953.3 Nomograms can provide more individualized outcome predictions by attempting to calculate the combined effect of many factors, such as tumor genetics, age, race, sex, comorbidities, diet, and behavior.
They can also use tumor data to provide more refined predictions than the TNM system because they can accept tumor size and spread as continuous variables rather than categorizing all tumors into a small number of stages. “Nomograms are valuable tools for doctors, not only because they help doctors make recommendations but also because they help doctors justify those recommendations with numbers that are comprehensible to patients in a way that regression analyses are not,” said Mohammad K. Hararah, MD, MPH, a fourth-year otolaryngology–head and neck surgery resident at the University of Colorado and the lead author of the OPSCC study. “It’s tough for these patients, particularly the younger ones, to decide between undergoing a complicated surgery or combination chemoradiotherapy,” he said. “No one wants to suffer from decades of adverse effects from chemotherapy or radiation when a single surgery would have been sufficient and curative. However, sometimes surgery isn’t enough if pathological evaluation shows features of an aggressive cancer.
The nomogram won’t make the decision easy, but it does at least make it easy for [patients] to understand their own risk and make a well-informed decision.” Nevertheless, as nomograms have proliferated in cancer care, so have cautions about how to evaluate individual calculators. Experts say key elements of a well-constructed nomogram include an appropriately defined patient population, a precise definition of the primary outcome, identification of the factors that could predict the outcome, an effective statistical model, and a validated outcome (FIGURE).4,5 Importantly, there are limitations to nomograms.
These include the fact that data used to construct nomograms are from a particular timeframe and are therefore static, a lack of accepted standards of reporting, a paucity of information on how effective they are in communicating with patients, and questions about their clinical utility.5 Although there have been many criticisms of nomograms over the years, several major cancer centers have embraced the development of these calculators as prediction tools. Originally designed as graphical representations, many nomograms are now available as online calculators that yield definitive values.
Memorial Sloan Kettering Cancer Center (MSK) in New York, New York, which helped pioneer the development of nomograms, provides prediction tools for 15 clinical settings on its website, available for patients and physicians alike. Fox Chase Cancer Center in Philadelphia, Pennsylvania, and the Cleveland Clinic in Ohio also offer calculators (TABLE).
“Throughout the spectrum of cancer care, prediction is absolutely critical,” said Andrew J. Vickers, PhD, a research methodologist at MSK who has developed such tools and written extensively on the subject.
GIST indicates gastrointestinal stromal tumor; LMS, uterine leiomyosarcoma.
KEY ELEMENTS OF A NOMOGRAM
The nomograms used in oncology typically start with questions about clinical factors that affect the odds of a particular outcome in a particular patient population. Investigators search for a database that provides clear information about the relevant factors and the outcome among a large number of those patients. Subset analysis allows the investigators to characterize how these factors affect the outcome and to write an equation, which then gets tested on other subsets from the original data. If the equation proves predictive during this internal validation, the nomogram is sometimes tested against information from a separate database of similar patients. Existing nomograms attempt to predict a broad variety of outcomes and studies about new nomograms are published frequently.
Those in studies published in August include a nomogram that uses 15 genes to predict survival in patients with clear-cell renal cell carcinoma,6 one that uses a serological scoring system to predict lymph node metastasis in patients with hepatocellular carcinoma,7 and a nomogram that uses tumor-associated tissue eosinophilia and several other factors to predict survival rates in patients with squamous cell carcinoma of the tongue.8 The accuracy of such nomograms depends on how well each nomogram’s underlying equations match reality and on how closely the patients who use the nomogram match patients in data sets used to develop or validate it. A nomogram’s predictive accuracy hinges on 2 factors: its discrimination and its calibration.
Discrimination is the ability to properly distinguish patients who do experience a particular outcome from those who do not. It is measured by the concordance index, or area under the curve, and it is expressed as a number between 0.5 (no better than a coin flip) and 1.0 (perfect). Calibration is a measure of the nomogram’s accuracy at predicting each particular outcome probability between 0% and 100%. It is a necessary performance measure because a nomogram that is effective at identifying patients with a high risk of a certain outcome can be ineffective at identifying low-risk patients (or vice versa).
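As a rough illustration of these 2 measures, the sketch below computes a concordance index and a simple calibration table for a yes/no outcome. The 8 predicted risks and outcomes are invented for the example, and the function names are hypothetical. Nomograms that predict survival over time need a version of the concordance index that handles censored follow-up, which this sketch does not attempt.

```python
def concordance_index(predicted, observed):
    """Share of (event, non-event) patient pairs in which the patient who had
    the event was given the higher predicted risk; tied predictions count as half."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one patient with and one without the outcome")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


def calibration_table(predicted, observed, n_bins=2):
    """Sort patients by predicted risk, split them into roughly equal groups, and
    return (mean predicted risk, observed event rate) for each group."""
    ranked = sorted(zip(predicted, observed))
    rows = []
    for i in range(n_bins):
        group = ranked[i * len(ranked) // n_bins:(i + 1) * len(ranked) // n_bins]
        if not group:
            continue  # more bins than patients
        mean_predicted = sum(p for p, _ in group) / len(group)
        observed_rate = sum(y for _, y in group) / len(group)
        rows.append((mean_predicted, observed_rate))
    return rows


# Invented example data: predicted risks and whether the outcome occurred (1) or not (0).
risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1]

c_index = concordance_index(risks, outcomes)
table = calibration_table(risks, outcomes, n_bins=2)
print(f"concordance index: {c_index:.4f}")
for mean_predicted, observed_rate in table:
    print(f"predicted {mean_predicted:.2f} vs observed {observed_rate:.2f}")
```

In this toy data set, the groupwise predicted and observed rates agree even though discrimination is imperfect, which is why the 2 measures are reported separately.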
Calibration can be expressed as a formula, but it is typically expressed as a graph to give readers a more intuitive feel for a nomogram’s performance. Only a small minority of nomograms have been evaluated through external validations, but the number of such efforts is increasing. This year, for example, has seen investigators from China test the ability of 6 different nomograms to predict nonsentinel lymph node metastasis in patients with breast cancer who had undergone axillary lymph node dissection; their study found that nomograms developed at the University of Louisville in Kentucky and at Seoul National University Hospital in South Korea were more predictive than 4 other models in a population of 105 Chinese patients.9
Other external validation study findings published this year have reported poor results for the IBTR! 2.0 nomogram for the prediction of ipsilateral breast cancer tumor recurrence,10 positive results for a nomogram predicting survival after trimodality therapy for esophageal cancer,11 and mixed results for 5 prognostic scores designed to stratify risk for patients with pancreatic cancer.12
“Nomograms are definitely improving overall because external validation efforts have become so much more common and because our increasing [...]
[...] repeated exposure to a graphic would help educate doctors about how certain factors drive outcomes.13
“This has been a common historical criticism: Without presenting standard errors, the estimates appear too precise,” said Brian L. Egleston, associate research professor [...] single number. “The lack of standard errors/confidence intervals has happened by happenstance, it seems,” he said.
A nomogram from Fox Chase Cancer Center, for example, is designed to predict the likelihood that a patient with kidney cancer will die from cancer or other causes over a 5-year period. Its predictions hinge on 5 factors: race (black, white, other), sex, age (66-96 years), tumor size (0-20 cm), and Charlson Comorbidity Index score (0-10). The nomogram predicts that a black man, aged 73, with a 4-cm tumor and a score of 5 on the Charlson index, has a 6.4% chance of dying from kidney cancer over a 5-year period, a 46.2% chance of dying from other causes, and a 47.4% chance of survival.14
“The nomogram’s predictions can obviously help physicians think about treatment options, but they can also help you explain treatment recommendations to patients, particularly if the recommendation is active surveillance,” said Alexander Kutikov, MD, chief of the Division of Urology and Urologic Oncology at Fox Chase and lead author of the study underlying this nomogram. “Patients have a natural desire to treat every cancer, but a nomogram that says they have a 2% chance of dying from cancer and a 50% chance of dying from something else can make someone consider the wisdom of cancer intervention, which is almost never free of risk. On the other hand, many patients have zero desire to see predictions about their chance of death, so you do need to individualize the discussion with each patient,” he said.
Mismatches between the patient and the patient population used to construct a particular nomogram make many nomograms of questionable value for many individuals, but the exact extent of the mismatch problem is in question. For example, nomograms developed on Americans of mixed European extraction sometimes fail to predict outcomes for northern European populations.15 Undefinable differences between patients who seek care at academic medical centers and those treated by a community oncologist may also cloud calculations.
“Some of the most individualized nomograms we have come entirely from patient data collected at elite research centers,” Kutikov said. “They can take numerous variables into account, most recently including tumor genomics, because their investigators have access to detailed medical records and generally collect the salient information prospectively.
“These facility-specific nomograms are very good [at] predicting outcomes for patients who subsequently go to those same facilities, but a lot of them don’t work well for other patient populations, even when the patients in question seem to fit right in with the patients who were used to create the nomogram...Patient selection bias is a very subtle thing, and it’s very hard to escape unless you go with giant and largely more generalizable data sets like Medicare that don’t give you the level of granular detail that you’d really like to have.” Despite such shortcomings, nomogram use keeps growing.
The NCCN first included a nomogram in its treatment guidelines 15 years ago, when it started urging doctors to gauge pathologic prostate cancer stages with an algorithm that considers serum prostate-specific antigen levels, clinical stage, and Gleason scores.16 Only a handful of other nomograms have been added to the guidelines since then, but recommendations often call for a certain treatment option if risk exceeds a certain percentage, and nomograms are appearing every day to make those calculations easier. They also seem to help patients choose among recommendations, such as surgery and radiation with or without systemic chemotherapy for patients with OPSCC. “There are a lot of difficult calls in cancer care,” Hararah said. “Any tool that helps us make better decisions, even at the margins, is valuable.”