Underperformance of Contemporary Phase III Oncology Trials and Strategies for Improvement

Authors: Changyu Shen PhD1,2, Enrico G. Ferro MD2,3, Huiping Xu PhD4, Daniel B. Kramer MD1,2, Rushad Patell MD2,5, and Dhruv S. Kazi MD, MSc, MS1,2
  • 1 Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology, Division of Cardiology, Department of Medicine, Beth Israel Deaconess Medical Center,
  • 2 Harvard Medical School, and
  • 3 Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts;
  • 4 Department of Biostatistics, School of Medicine, Richard M. Fairbanks School of Public Health, Indiana University, Indianapolis, Indiana; and
  • 5 Division of Hematology-Oncology, Beth Israel Deaconess Medical Center, Boston, Massachusetts.

Background: Statistical testing in phase III clinical trials is subject to chance errors, which can lead to false conclusions with substantial clinical and economic consequences for patients and society. Methods: We collected summary data for the primary endpoints of overall survival (OS) and progression-related survival (PRS) (eg, time to other type of event) for industry-sponsored, randomized, phase III superiority oncology trials from 2008 through 2017. Using an empirical Bayes methodology, we estimated the number of false-positive and false-negative errors in these trials and the errors under alternative P value thresholds and/or sample sizes. Results: We analyzed 187 OS and 216 PRS endpoints from 362 trials. Among 56 OS endpoints that achieved statistical significance, the true efficacy of experimental therapies failed to reach the projected effect size in 33 cases (58.4% false-positives). Among 131 OS endpoints that did not achieve statistical significance, the true efficacy of experimental therapies reached the projected effect size in 1 case (0.9% false-negatives). For PRS endpoints, there were 34 (24.5%) false-positives and 3 (4.2%) false-negatives. Applying an alternative P value threshold and/or sample size could reduce false-positive errors and slightly increase false-negative errors. Conclusions: Current statistical approaches detect almost all truly effective oncologic therapies studied in phase III trials, but they generate many false-positives. Adjusting testing procedures in phase III trials is numerically favorable but practically infeasible. The root of the problem is the large number of ineffective therapies being studied in phase III trials. Innovative strategies are needed to efficiently identify which new therapies merit phase III testing.
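The conclusion that false-positives stem mainly from the large share of ineffective therapies entering phase III can be illustrated with a simple calculation. The sketch below is not the empirical Bayes method used in the study. It assumes a simplified two-group model in which each therapy either reaches the projected effect size or does not, and all names and parameter values (`prior_effective`, `alpha`, `power`) are hypothetical choices made for illustration only.

```python
def false_positive_share(prior_effective, alpha=0.025, power=0.9):
    """Share of statistically significant results that come from therapies
    that do not reach the projected effect size.

    prior_effective: assumed fraction of tested therapies that truly reach
        the projected effect size (hypothetical input).
    alpha: assumed probability that a below-target therapy is nonetheless
        declared significant (equals the nominal one-sided level only when
        the therapy has no effect at all).
    power: assumed probability that an on-target therapy is significant.
    """
    true_pos = prior_effective * power
    false_pos = (1.0 - prior_effective) * alpha
    return false_pos / (true_pos + false_pos)


def false_negative_share(prior_effective, alpha=0.025, power=0.9):
    """Share of non-significant results that come from therapies that do
    reach the projected effect size (same simplified model)."""
    missed = prior_effective * (1.0 - power)
    true_neg = (1.0 - prior_effective) * (1.0 - alpha)
    return missed / (missed + true_neg)


if __name__ == "__main__":
    # Illustrative inputs only; these are not estimates from the study.
    for prior in (0.5, 0.1):
        for alpha in (0.025, 0.005):
            fp = false_positive_share(prior, alpha)
            fn = false_negative_share(prior, alpha)
            print(f"prior={prior:.2f} alpha={alpha:.3f} "
                  f"FP share={fp:.3f} FN share={fn:.3f}")
```

Under these assumptions, lowering the share of truly effective therapies raises the false-positive share even when the threshold and power are unchanged, and tightening the threshold lowers it. This mirrors the direction of the findings above but not their magnitude, because in real trials a therapy with a partial effect is declared significant far more often than the nominal level suggests.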

Submitted September 14, 2020; final revision received November 29, 2020; accepted for publication November 30, 2020.

Published online June 21, 2021.

Author contributions: Study design: Shen. Data analysis: Shen, Xu. Data interpretation: Shen, Ferro. Writing – original draft: Shen, Ferro. Writing – review and editing: Shen, Ferro, Kramer, Patell, Kazi. Critical feedback for further interpretation of results: Shen, Ferro, Kramer, Patell, Kazi. Final approval: All authors.

Disclosures: The authors have disclosed that they have not received any financial consideration from any person or organization to support the preparation, analysis, results, or discussion of this article.

Correspondence: Changyu Shen, PhD, Smith Center for Outcomes Research, Department of Medicine, Beth Israel Deaconess Medical Center, 375 Longwood Avenue, 4th Floor, Boston, MA 02215. Email: changyushen312@gmail.com

Supplementary Materials

    • Supplemental Materials (PDF 480 KB)