An approach using machine learning to analyze genomic and clinical data from patients with myelodysplastic syndromes (MDS) could replace the gold standard of predicting how long patients may live with the disease, according to Aziz Nazha, MD, in a presentation during the 2018 ASH Annual Meeting.
“A random survival forest (RSF) algorithm was used to build the model, in which clinical and molecular variables are randomly selected for inclusion in determining survival, thereby avoiding the shortcomings of traditional Cox step-wise regression in accounting for variable interactions,” the researchers wrote. “Survival prediction is thus specific to each patient’s particular clinical and molecular characteristics.”
The machine learning model outperformed the International Prognostic Scoring System (IPSS) and the Revised IPSS (IPSS-R) in predicting survival outcomes and risk for acute myeloid leukemia (AML) transformation in a training cohort of 1471 patients. Accuracy, as assessed by concordance (c)-index, showed that the machine learning model correctly predicted overall survival (OS) 74% of the time and leukemia-free survival (LFS) 81% of the time, compared with 66% and 73%, respectively, for IPSS, and 67% and 73% for IPSS-R.
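For readers unfamiliar with the c-index, the sketch below shows one common way it is computed (Harrell's formulation): among all pairs of patients whose survival order is known, it counts the fraction in which the patient who had the event sooner was also assigned the higher predicted risk. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. This is an illustration only, not the researchers' code; the function name `concordance_index` and the four-patient data set are made up for the example, and tied survival times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data.

    times       -- observed follow-up time for each patient
    events      -- True if the event (e.g. death) was observed, False if censored
    risk_scores -- model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: months of follow-up, event flags, predicted risk.
times = [5, 12, 20, 30]
events = [True, True, False, True]
risk_scores = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risk_scores))  # 4 of 5 comparable pairs agree
```

In this toy example five pairs are comparable and the predicted risk ordering agrees with the observed ordering in four of them, so the function returns 0.8.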
In addition, the researchers conducted several feature extraction analyses to identify the most important variables that impacted patients’ outcomes, as well as the least number of variables that produced the best prediction. From most important to least, variables included cytogenetic risk categories by IPSS-R, platelets, mutation number, hemoglobin, bone marrow blast percentage, 2008 World Health Organization diagnosis, white blood cell count, age, absolute neutrophil count (ANC), absolute lymphocyte count (ALC), TP53, RUNX1, STAG2, ASXL1, absolute monocyte counts (AMC), SF3B1, SRSF2, RAD21, secondary versus de novo MDS, NRAS, NPM1, TET2, and EZH2.
During his presentation at the meeting, Nazha, assistant professor, Department of Medicine, Cleveland Clinic School of Medicine, demonstrated how the clinical and mutational variables can be entered into a web application that runs the trained model and provides patient-specific OS and AML transformation probabilities at different time points. He added, however, that the model is not yet available for clinician use.
With these variables, the machine learning model also outperformed models that predicted OS and LFS using mutations only (64% and 72%, respectively); mutations plus cytogenetics (68% and 74%); and mutations, cytogenetics, and age (69% and 75%). The researchers noted that adding mutational variant allelic frequency did not significantly improve prediction accuracy.
Similarly, in the 831 patients included in the validation cohort, the RSF algorithm predicted OS 80% of the time and LFS 78% of the time.
Patients diagnosed with MDS show a wide range of symptoms, and the disease can lead to anemia, bleeding, or infection. Prognosis likewise ranges from just a few months to decades, and about one-third of patients go on to develop AML.
Therefore, Nazha noted that both the patient and the clinician can derive benefit from this model. “Prognosis in MDS, and oncology in general, is one of the most important things we can do because after diagnosis, the next step in treating the patient is to stage their disease or define the risk,” he said.
“That is extremely important for patients, because [explaining their prognosis] helps to set up their expectation early to help them to understand their disease and what to expect of their journey,” he added. “…For the clinicians, it is equally important because all of our guidelines and treatment recommendations are based on risk stratification, which includes low and high risk of progression to acute myeloid leukemia.”
In turn, understanding a patient’s prognosis can also affect treatment options. For example, high-risk patients are generally treated with stem cell transplant, while low-risk patients undergo treatment with fewer associated risks. However, if risk is identified inaccurately, something that occurs in one-third of patients on the IPSS-R system, then the treatment is, in turn, wrong. “If we label the disease as high risk and the disease might be lower risk, we’re changing the management of these patients, and we are now overtreating them; and vice versa, if you have a patient [who] is lower risk, but they are high risk, that becomes a problem,” Nazha said.
To improve upon the model, the researchers are gathering feedback from clinicians to incorporate more outcomes, such as quality of life. They are also developing ways to update the assessment of risk in response to changing conditions, such as when new test results are available or treatments are completed.
“This project started out of a frustration voiced by many of my patients who want to know what their own risk is and how their prognosis might differ from that of other patients. We wanted to build a personalized prediction tool that can give insights about a specific outcome for a specific patient,” Nazha said in a press release. “…Improving and personalizing our prognostic models can help to delineate patients who are at higher versus lower risk—which is particularly challenging for those who fall into the intermediate range—and match them with the appropriate treatment.”
Nazha A, Komrokji RS, Meggendorfer M, et al. A personalized prediction model to risk stratify patients with myelodysplastic syndromes. In: Proceedings from the 2018 ASH Annual Meeting and Exposition; December 1-4, 2018; San Diego, California. Abstract 793.