It is a clinical truism that decisions about disease management (treatment) in individuals should be based on absolute rather than relative benefits and harms. A key decision for patients with early breast cancer, made in discussion with their clinicians, is whether to undergo a course of neoadjuvant or adjuvant systemic chemotherapy. Meta-analyses of data from multiple randomized controlled trials by the Early Breast Cancer Trialists’ Collaborative Group have provided robust estimates of the relative risk reduction associated with the major systemic chemotherapy regimens and have shown that the efficacies of neoadjuvant and adjuvant chemotherapy are similar.1 These studies have also shown that the relative risk reduction is similar across all the major clinical features known to be associated with prognosis. Consequently, the absolute benefit of adjuvant chemotherapy will vary substantially according to prior risk.
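The arithmetic behind this point can be sketched in a few lines. The example below is a deliberately simplified illustration, not the PREDICT algorithm: it assumes that absolute benefit is simply baseline risk multiplied by relative risk reduction, and the function name and all numeric values are hypothetical rather than taken from any trial.

```python
def absolute_benefit(baseline_risk: float, relative_risk_reduction: float) -> float:
    """Absolute risk reduction under a constant relative risk reduction.

    Simplifying assumption: benefit = baseline risk x relative risk reduction.
    Competing causes of death and time-to-event effects are ignored.
    """
    return baseline_risk * relative_risk_reduction


# Hypothetical values for illustration only.
RELATIVE_RISK_REDUCTION = 0.30

for baseline in (0.05, 0.20, 0.40):
    benefit = absolute_benefit(baseline, RELATIVE_RISK_REDUCTION)
    print(
        f"baseline risk of death {baseline:.0%}: "
        f"absolute benefit {benefit:.1%}"
    )
```

With the same relative effect, the absolute benefit in this sketch scales directly with baseline risk, which is why an accurate estimate of prior risk matters for the treatment decision.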
As discussed further by Stabellini et al,2 elsewhere in this issue, PREDICT (www.breast.predict.nhs.uk) is a widely used web-based tool that estimates the absolute risk of dying from breast cancer and the absolute benefit of adjuvant chemotherapy.3,4 The web tool was accessed more than 400,000 times worldwide and 70,000 times in the United States in the past 12 months (data provided by https://matomo.org). It is important, therefore, that the performance of the tool is assessed in different populations, with both discrimination and calibration being relevant performance parameters. PREDICT has been independently validated in cohorts from Canada,5 Japan,6 Malaysia,7 the Netherlands,8–10 and the United Kingdom,11,12 and has generally been shown to have good discrimination and calibration. However, it has not been validated for early breast cancer outcomes in the United States.
Stabellini et al2 used data from the National Cancer Database, which is jointly sponsored by the Commission on Cancer of the American College of Surgeons and the American Cancer Society, to perform a validation of PREDICT v2.2 on a cohort of >700,000 women with unilateral, primary, early breast cancer diagnosed between ages 25 and 85 years, from 2004 to 2012. The primary endpoint was all-cause mortality. Discrimination of the PREDICT model based on the area under the receiver operating characteristic curve was good, with values of 0.78 for 5-year survival and 0.76 for 10-year survival.
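For readers less familiar with discrimination, the area under the curve can be read as the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen patient who survived. The sketch below computes it that way on a small hypothetical dataset; the function name and data are invented for illustration, censoring is ignored, and this is not necessarily how the authors computed their values.

```python
def auc(predicted_risk, died):
    """Rank-based area under the ROC curve.

    Compares every (died, survived) pair: a pair scores 1 when the patient
    who died had the higher predicted risk, 0.5 when the risks are tied,
    and 0 otherwise. The AUC is the mean score over all such pairs.
    """
    events = [r for r, d in zip(predicted_risk, died) if d == 1]
    non_events = [r for r, d in zip(predicted_risk, died) if d == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in non_events
    )
    return score / (len(events) * len(non_events))


# Hypothetical predicted mortality risks and outcomes (1 = died).
predicted_risk = [0.05, 0.10, 0.20, 0.35, 0.50, 0.70]
died = [0, 0, 1, 0, 1, 1]

print(f"AUC = {auc(predicted_risk, died):.2f}")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking. Discrimination says nothing about whether the predicted risks are numerically accurate, which is the separate question of calibration.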
Calibration showed that PREDICT tended to underestimate survival, with a predicted 5-year survival of 84.4% compared with 89.7% observed and a predicted 10-year survival of 69.4% compared with 78.7% observed. A similar underestimation of survival was noted in subgroup analyses based on patient demographics and tumor characteristics. Some degree of underestimation of survival would perhaps be expected because the PREDICT algorithm was based on patients diagnosed from 1999 to 2003, and substantial improvements in prognosis for women with breast cancer have occurred since then.13 The authors note that the calibration in Black patients was better than in White patients, and suggest that some of the well-known disparities in breast cancer outcomes are accounted for by the variables included in the PREDICT model. However, if this were true, the degree of underestimation of survival would be expected to be similar in White and Black patients.
The study has several limitations. Details regarding the type of chemotherapy used to treat the cohort were not available, and the authors assumed all women who received adjuvant chemotherapy received a second-generation chemotherapy regimen, whereas some women may have received a third-generation regimen. This difference would be expected to result in an underestimation of survival in those treated with chemotherapy. Indeed, the underestimation was greater in those treated with chemotherapy (10-year survival of 61.7% predicted vs 74.5% observed) than in those not receiving any therapy (67.8% vs 69.6%). Similarly, treatment of women with HER2-positive disease with agents such as trastuzumab was not accounted for, and this will have resulted in overestimation of mortality in these cases (10-year survival of 60.8% predicted vs 80.0% observed in HER2-positive disease, compared with 71.4% vs 77.7% in HER2-negative disease).
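The size of the miscalibration in each group is easier to compare when expressed as the difference between observed and predicted survival. The sketch below does only that subtraction, using the percentages quoted in the text above; the labels and the helper name are ours, and no figures beyond those already reported are introduced.

```python
# (group, predicted survival %, observed survival %) as quoted in the text.
reported = [
    ("overall, 5-year", 84.4, 89.7),
    ("overall, 10-year", 69.4, 78.7),
    ("chemotherapy, 10-year", 61.7, 74.5),
    ("no therapy, 10-year", 67.8, 69.6),
    ("HER2-positive, 10-year", 60.8, 80.0),
    ("HER2-negative, 10-year", 71.4, 77.7),
]


def calibration_gap(predicted: float, observed: float) -> float:
    """Observed minus predicted survival, in percentage points.

    A positive value means the model underestimated survival.
    """
    return round(observed - predicted, 1)


gaps = {group: calibration_gap(pred, obs) for group, pred, obs in reported}

for group, gap in gaps.items():
    print(f"{group}: observed exceeds predicted by {gap} percentage points")
```

Laid out this way, the gap is largest in the groups whose actual treatment is most likely to have differed from what the model assumed.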
Another important limitation of this study is that the PREDICT model performance was based on all-cause mortality rather than breast cancer–specific mortality. The key output of PREDICT is the absolute benefit of systemic therapies, and this is based on the effect of therapy on breast cancer–specific mortality. Consequently, it is important that PREDICT should provide an accurate estimate of breast cancer–specific mortality. It seems likely that if the prediction of all-cause mortality is reasonable, then the breast cancer–specific mortality will also be adequate, but this assumption cannot be tested with these data.
Some have argued that tools such as PREDICT are unnecessary given the availability of genomic risk score assays that provide prognostic information in the key group of women at moderate risk. However, given that genomic risk scores are intended to provide a guide to therapeutic decision-making, it is notable that there is a paucity of evidence in which absolute risk predictions of genomic risk scores have been validated. Age at diagnosis, tumor grade, tumor size, node status, mode of detection, and estrogen receptor status are standard variables that can be used to obtain risk estimates and expected benefits of therapy at negligible cost. Although the addition of genomic risk scores to standard variables is likely to improve the performance of PREDICT, the key question is how much discrimination improves with the addition of genomic risk scores to the PREDICT algorithm and whether any reclassification of risk that occurs as a result is cost-effective.
One study addressing this question showed that the discrimination of PREDICT was only slightly improved by the most commonly used genomic risk scores, but this resulted in 5% to 18% of women who were classified as being at intermediate risk being reclassified to low or high risk.14 Both the MINDACT trial15 and the TAILORx trial16 purport to demonstrate the clinical utility of genomic risk scores in addition to clinical estimates of risk. However, clinical risk factor stratification in the MINDACT study was not optimal, and in the TAILORx trial, the clinical variables of age at diagnosis, tumor size, and tumor grade were not used to stratify patients by risk before randomization.
PREDICT was designed to aid decision-making for women with early breast cancer. No prediction model is perfect, and PREDICT is no exception, but the calibration and discrimination were within acceptable limits. The validation of the performance of the model using a large cohort of patients from the United States confirms it is a suitable decision aid for patients and oncologists in the United States.
1. Early Breast Cancer Trialists’ Collaborative Group. Comparisons between different polychemotherapy regimens for early breast cancer: meta-analyses of long-term outcome among 100,000 women in 123 randomised trials. Lancet 2012;379(9814):432–444.
2. Stabellini N, Cao L, Towe CW, et al. Validation of the PREDICT prognostication tool in US breast cancer patients. J Natl Compr Canc Netw 2023;21:1011–1019.
3. Candido Dos Reis FJ, Wishart GC, Dicks EM, et al. An updated PREDICT breast cancer prognostication and treatment benefit prediction model with independent validation. Breast Cancer Res 2017;19:58.
4. Wishart GC, Azzato EM, Greenberg DC, et al. PREDICT: a new UK prognostic model that predicts survival following surgery for invasive breast cancer. Breast Cancer Res 2010;12:R1.
5. Wishart GC, Bajdik CD, Azzato EM, et al. A population-based validation of the prognostic model PREDICT for early breast cancer. Eur J Surg Oncol 2011;37:411–417.
6. Zaguirre K, Kai M, Kubo M, et al. Validity of the prognostication tool PREDICT version 2.2 in Japanese breast cancer patients. Cancer Med 2021;10:1605–1613.
7. Wong HS, Subramaniam S, Alias Z, et al. The predictive accuracy of PREDICT: a personalized decision-making tool for Southeast Asian women with breast cancer. Medicine (Baltimore) 2015;94:e593.
8. de Glas NA, Bastiaannet E, Engels CC, et al. Validity of the online PREDICT tool in older patients with breast cancer: a population-based study. Br J Cancer 2016;114:395–400.
9. Engelhardt EG, van den Broek AJ, Linn SC, et al. Accuracy of the online prognostication tools PREDICT and Adjuvant! for early-stage breast cancer patients younger than 50 years. Eur J Cancer 2017;78:37–44.
10. van Maaren MC, van Steenbeek CD, Pharoah PDP, et al. Validation of the online prediction tool PREDICT v. 2.0 in the Dutch breast cancer population. Eur J Cancer 2017;86:364–372.
11. Maishman T, Copson E, Stanton L, et al. An evaluation of the prognostic model PREDICT using the POSH cohort of women aged ≤40 years at breast cancer diagnosis. Br J Cancer 2015;112:983–991.
12. Gray E, Marti J, Brewster DH, et al. Independent validation of the PREDICT breast cancer prognosis prediction tool in 45,789 patients using Scottish Cancer Registry data. Br J Cancer 2018;119:808–814.
13. Clift AK, Dodwell D, Lord S, et al. Development and internal-external validation of statistical and machine learning models for breast cancer prognostication: cohort study. BMJ 2023;381:e073800.
14. Chowdhury A, Pharoah PD, Rueda OM. Evaluation and comparison of different breast cancer prognosis scores based on gene expression data. Breast Cancer Res 2023;25:17.
15. Cardoso F, van’t Veer LJ, Bogaerts J, et al. 70-gene signature as an aid to treatment decisions in early-stage breast cancer. N Engl J Med 2016;375:717–729.
16. Sparano JA, Gray RJ, Makower DF, et al. Adjuvant chemotherapy guided by a 21-gene expression assay in breast cancer. N Engl J Med 2018;379:111–121.