An Empirical Analysis of Noninferiority Studies in Oncology: Are They Good Enough?

Authors: Alyson Haslam, PhD (a); Jennifer Gill, MS (a); and Vinay Prasad, MD, MPH (b,c,d,e)
  • a Knight Cancer Institute,
  • b Division of Hematology Oncology, Knight Cancer Institute,
  • c Department of Public Health and Preventive Medicine,
  • d Center for Health Care Ethics, and
  • e Division of General Medicine, Department of Medicine, Oregon Health & Science University, Portland, Oregon.

Background: Noninferiority (NI) trials should help identify interventions that offer some benefit (eg, lower financial costs, better tolerability, or less invasiveness) without sacrificing noticeable effectiveness, and researchers should adhere to appropriate standards in the conduct and reporting of methods. This study describes the characteristics of a systematic sampling of NI studies from an updated search of recent published oncology trials. Methods: We performed a cross-sectional analysis of NI research published between 2014 and 2018 in the top 3 medical journals and top 3 oncology journals. We estimated the percentage of NI trials in oncology that report informative details of the study, such as justification for conducting an NI trial, justification of the NI margin, analysis population, and alpha level. Results: There were 94 NI studies and 104 comparisons, and 59.6% (n=62) of comparisons declared NI. The median NI margin of comparisons reporting an odds or hazard ratio was 1.3 (range, 1.05–3.2; n=64). Twenty-three percent (n=22) of studies did not provide a justification for conducting an NI study; 54.3% (n=51) of studies did not provide a justification of the margin they used in their study. Only approximately 46% (n=43) of studies used both an intention-to-treat (ITT) and per-protocol (PP) analysis, and 37.3% (n=35) of studies used a one-sided alpha level of >.025. There is notable variation in key elements of the conduct and reporting of NI trials, including the NI margin, the alpha level, and the population analyzed. Furthermore, a high number of studies do not provide justification for conducting an NI study or the margin used for determining NI. Conclusions: These results suggest that there is room for improvement in the reporting and conduct of NI trials in oncology.


Noninferiority (NI) trials are being conducted more frequently, and oncology is numerically 1 of the top 3 medical disciplines for published NI trials.1 These trials can help to establish treatment options that are less expensive, less invasive, or more tolerable, but they do not show that an intervention is superior in terms of efficacy. In fact, sometimes tested interventions are less effective than a comparator (inferior) but are still considered noninferior if they are no worse than a prespecified margin. With this in mind, researchers, physicians, and patients are left to weigh an acceptable loss in efficacy against alternative advantages, such as lower cost, reduced physical burden, or ease of administration.2
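
The decision rule described above can be sketched in a few lines. This is a minimal illustration, assuming the effect measure is a hazard ratio (HR) in which values above 1 favor the control arm and that NI is declared when the upper bound of the confidence interval falls below the prespecified margin. The function name and the example numbers are hypothetical and are not drawn from any included trial.

```python
def is_noninferior(ci_upper: float, margin: float) -> bool:
    """Return True when the CI upper bound for the HR lies below the NI margin.

    Assumes HR > 1 means the experimental arm does worse than the control.
    """
    return ci_upper < margin


# Hypothetical trial: HR 1.10 (CI 0.95-1.27) against a margin of 1.30.
# The point estimate favors the control, yet NI is still declared.
print(is_noninferior(ci_upper=1.27, margin=1.30))  # True

# Same point estimate with a wider interval: NI is not shown.
print(is_noninferior(ci_upper=1.35, margin=1.30))  # False
```

The first example shows how an intervention whose point estimate is worse than the comparator can still be labeled noninferior, which is why the choice of margin carries so much weight.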

To determine NI, trials should use predetermined and acceptable NI margins, provide rigorous rationale for the use of intention-to-treat (ITT) and/or per-protocol (PP) populations, use an acceptable sample size and a valid comparator arm, and have a predetermined and valid reason for conducting such a study.3,4 Bias in these types of studies can have a notable influence on the interpretation of results because it often pushes results toward no difference, and it is difficult to determine whether an observed lack of difference reflects a true absence of difference or an effect size that bias has minimized. For instance, NI can easily be obtained if the control arm has very poor adherence. In such a case, an experimental arm may in practice be compared with doing nothing. NI studies in oncology have been anecdotally described,5,6 but in light of the increasing rate of publication of these types of studies, we seek to describe the characteristics of a systematic sampling of NI studies in oncology from an updated search of recent published oncology trials.


Methods

Search Strategy

We searched all original research articles published between 2014 and 2018 in the top 3 medical journals (New England Journal of Medicine, The Lancet, JAMA) and top 3 oncology journals (The Lancet Oncology, Journal of Clinical Oncology, JAMA Oncology), which were identified based on impact factor scores. This is similar to methods used by other groups.7 We included only articles that reported on studies that related to oncology, described NI trials, reported on a single trial, and used a randomized controlled trial design.

Data Abstraction

For each article, a set of data were abstracted, including article title, malignancy type, line of treatment, experimental and control arm treatments, type of intervention (drug, testing, radiation, behavior, or surgery), primary end point, phase of trial, randomized sample size, margin, margin justification, justification of NI design and whether the justification was inferred or explicitly stated, actual absolute or relative difference, α level, type of population used in analysis (ITT or PP), whether the NI margin was met, and study funders. Point estimates of odds ratio (OR) or hazard ratio (HR) margin that were <1 were reported as the inverse for comparison purposes. The margin justification was recoded to none given, explanation given, or value given but with no explanation of why that value was used (combined with none given for analysis). The NI justification was recoded to convenience, tolerability, expense, or none given. Whether the NI margin was met was also categorized according to the Consolidated Standards of Reporting Trials (CONSORT) results for NI trials, using the originally reported confidence intervals.8–10 In addition, for trials in which the absolute risk difference between intervention and control arms was zero and the results were noninferior, the CONSORT category was coded as 2. When multiple interventions or multiple control groups were analyzed, we treated them as separate results. When multiple primary outcomes were analyzed, we used the first outcome mentioned in the results. Alpha levels were transformed to an equivalent one-sided α value. Funding source was recategorized as pharmaceutical/industry, government/agency/nonprofit organization, or pharmaceutical plus another type of funding source. If the responses were not provided in the main manuscript, we searched the protocol (if available). We also checked all studies to see whether the analysis population was the same in the protocol and the actual study manuscript.
All articles were reviewed by 2 authors (Haslam and Gill).
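
Two of the recoding steps above, inverting ratio margins below 1 and converting α levels to a one-sided equivalent, can be sketched as follows. This is an illustrative reading of those steps and not the authors' abstraction code: the function names are hypothetical, and the α conversion assumes a symmetric two-sided test so that the two-sided value is simply halved.

```python
def normalize_margin(ratio_margin: float) -> float:
    """Express an OR/HR margin on the >1 scale by inverting values below 1."""
    if ratio_margin <= 0:
        raise ValueError("ratio margins must be positive")
    return 1.0 / ratio_margin if ratio_margin < 1.0 else ratio_margin


def one_sided_alpha(alpha: float, sides: int) -> float:
    """Convert a reported alpha to its one-sided equivalent.

    Assumes a symmetric two-sided test, so a two-sided alpha is halved.
    """
    if sides == 1:
        return alpha
    if sides == 2:
        return alpha / 2.0
    raise ValueError("sides must be 1 or 2")


print(normalize_margin(0.8))     # 1.25
print(one_sided_alpha(0.05, 2))  # 0.025
```

Putting every margin on the same side of 1 and every α on a one-sided scale is what allows medians and ranges to be compared across trials that reported these quantities differently.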

Statistical Analysis

Descriptive statistics were calculated using R version 3.5.2 (R Foundation for Statistical Computing). Because some studies had multiple interventions and therefore multiple conclusions, data were calculated per study except for the following variables, which were calculated per comparison: NI margin, NI met, NI CONSORT category, and intervention type. To test for differences in variables between the different funding categories, chi-square tests were used for categorical variables (Fisher exact test for small samples) and analysis of variance was used for continuous variables. This analysis used publicly available, nonidentifiable data and did not require Institutional Review Board approval.


Results

All Included Studies

A total of 4,739 original articles were found in the top 3 medical and oncology journals; of these, 100 studies used an NI design, but 6 were excluded because they either were nonrandomized, represented pharmacokinetic studies, did not report NI results, or were an analysis of multiple studies. Therefore, 94 studies met our inclusion criteria and were used for the analysis on study characteristics. The total number of comparisons for NI metrics was 104 because 10 studies had multiple interventions/control groups (Figure 1).

Figure 1. Flowchart of article selection process for noninferiority analysis.

Citation: Journal of the National Comprehensive Cancer Network J Natl Compr Canc Netw 18, 2; 10.6004/jnccn.2019.7349

Of the 104 comparisons, 53.8% involved a drug intervention (n=56), and 71.3% of studies were phase III trials (n=67; Table 1). Thirty-four percent of studies were reported in the Journal of Clinical Oncology (n=32) and 33.0% were reported in The Lancet Oncology (n=31). The cancers on which NI studies reported most commonly were as follows: breast (26.6%; n=25), colorectal (10.6%; n=10), and general/multiple (9.6%; n=9). A total of 54.3% of studies were funded by a government or nonprofit agency (n=51), whereas 44.6% were funded, fully or in part, by a pharmaceutical company or industry (n=42). Median sample size was 687 (range, 72–14,215).

Table 1. Characteristics of Oncology NI Studies

Overall, 59.6% (n=62) of comparisons declared NI: 70.4% (n=19) of industry-sponsored comparisons, 64.3% (n=36) of comparisons without industry sponsorship, and 30% (n=6) of comparisons with both industry and nonindustry sponsorship. One trial met NI but did not report funding.

Median NI margin of comparisons reporting an OR or HR was 1.30 (range, 1.05–3.20; n=64). For comparisons reporting an absolute difference (n=6), the median margin was 3.2 points (range, 0–5.0). For comparisons reporting a percent difference (n=33), the median was 10.0 (range, 2.5–15.0). One study used a z-score (1.96).

Of the 72.4% of studies that provided a justification for the NI design, 54.2% reported that the intervention was more tolerable (n=51). A total of 23.4% (n=22) did not provide a justification.

A total of 54.3% of studies (n=51) did not provide a justification for the margin they used in their study, whereas 31.9% (n=30) reported a clear justification in the main manuscript, and 13.8% (n=13) reported it in the protocol.

A total of 61.7% of studies (n=58) used a one-sided α level of ≤.025, whereas 29.8% (n=28) used a one-sided α level of .05, 6.4% (n=6) used a one-sided α level of .1, and 1.1% (n=1) did not report the α level they used.

Approximately 42% of studies reported use of an ITT analysis only (n=39), whereas 45.7% (n=43) of studies used ITT and PP analyses, 6.4% (n=6) used PP analysis only, and 3.2% (n=3) did not report the type of population analysis. Of the studies that did not report the type of analysis in the main manuscript, 3 had the analysis type indicated in the protocol (2 specified ITT and 1 specified PP). Of the 44 studies that reported the type of analysis in both the study and the protocol, 13 had protocols that specified both ITT and PP analyses but reported results of only PP (n=2) or only ITT (n=11); 5 performed ITT and PP analyses when the protocol specified only ITT, and 1 performed PP analysis when the protocol specified ITT.

The most common NI outcomes were as follows: overall survival (17.0%; n=16), progression-free survival (17.0%; n=16), and disease-free survival (11.7%; n=11).
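
Because some variables are reported per study (n=94) and others per comparison (n=104), the denominator matters when reading the percentages above. The short sketch below recomputes a few of the reported figures from their counts; the helper name `pct` is hypothetical, and the pairing of each count with a denominator follows the Statistical Analysis description.

```python
N_STUDIES = 94       # studies meeting inclusion criteria
N_COMPARISONS = 104  # comparisons; 10 studies had multiple interventions/control groups


def pct(count: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / denominator, 1)


# Per-comparison variable
print(pct(62, N_COMPARISONS))  # 59.6 (comparisons declaring NI)

# Per-study variables
print(pct(22, N_STUDIES))      # 23.4 (no justification for NI design)
print(pct(51, N_STUDIES))      # 54.3 (no justification for margin)
print(pct(58, N_STUDIES))      # 61.7 (one-sided alpha <= .025)
```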

Drug Intervention Studies

For comparisons that used a drug intervention (Table 2) and reported an OR or HR, the median ratio was 1.23 for pharmaceutical/industry funding, 1.31 for nonpharmaceutical/nonindustry funding, and 1.25 for funding by pharmaceutical/industry plus another organization. For studies using a drug intervention, 60.9% of those funded by pharmaceutical/industry, 29.4% funded by a nonpharmaceutical/nonindustry source, and 25% funded by pharmaceutical/industry plus another organization clearly indicated their margin justification in the text or protocol (P=.06). A one-sided alpha level of ≤.025 was used by 91.3% (n=21) of studies funded by pharmaceutical/industry, 64.7% (n=11) of those funded by nonpharmaceutical/nonindustry, and 50.0% (n=6) of those funded by pharmaceutical plus another organization (P=.03). Characteristics of oncology NI studies for other intervention types stratified by funding source are presented in supplemental eTables 1–3 (available with this article), and a list of included studies is presented in supplemental eAppendix 1.
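
The comparison of α levels across funding categories can be illustrated with a Pearson chi-square test on a 3 x 2 table. This is a sketch under stated assumptions and not a reproduction of the authors' analysis: the row totals (23, 17, and 12) are back-calculated from the reported percentages, and a plain chi-square test is used even though the Methods mention the Fisher exact test for small samples, so the resulting P value need not match the reported P=.03.

```python
import math

# Rows: funding category; columns: (alpha <= .025, any other alpha).
# Row totals (23, 17, 12) are back-calculated from the reported percentages.
observed = [
    [21, 23 - 21],  # pharmaceutical/industry
    [11, 17 - 11],  # nonpharmaceutical/nonindustry
    [6, 12 - 6],    # pharmaceutical plus another organization
]


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat


stat = chi_square_statistic(observed)
# With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-stat / 2.0)
print(f"chi-square = {stat:.2f}, P = {p_value:.3f}")
```

Under these assumptions the statistic is about 7.7 with a P value of about .02. Several expected cell counts are below 5, which is the situation in which an exact test would usually be preferred.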

Table 2. Characteristics of Oncology NI Studies Using Drug Intervention, Stratified by Funding Source (2014–2018)


Discussion

In our sample of high-impact journals, we found that although most studies declared that NI was met (59.6%), 23.4% of studies did not provide justification for performing the NI study, and only 45.7% of studies provided clear justification for the margin used in their analysis. There is also a wide range of NI margins and a considerable percentage of studies using α levels greater than the standard .05 (2-sided), suggesting the arbitrary and subjective nature of determining these values or conducting these types of studies. Furthermore, only approximately 46% of studies corroborated NI results in both ITT and PP analyses. PP analysis is important for NI trials because it provides an additional check on study results. In superiority trials in which patients cross over, ITT is the preferred population to analyze because it provides a more conservative estimate of the effect.11 In an NI trial, however, this bias toward no difference increases the likelihood of NI being met. When both ITT and PP analyses are used and agree, one can be more confident that the intervention is noninferior. Furthermore, analysis with both ITT and PP has been suggested by several authors and has been incorporated into several guidelines on the conduct and reporting of NI trials.4,12
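
The value of reporting both populations can be made concrete with a small sketch that classifies a trial by whether NI holds in the ITT analysis, the PP analysis, or both. The function name, the labels, and the example values are hypothetical; the sketch assumes an HR scale in which values above 1 favor the control and treats NI as met when the upper confidence bound is below the margin.

```python
from typing import Optional


def ni_conclusion(margin: float,
                  itt_ci_upper: Optional[float],
                  pp_ci_upper: Optional[float]) -> str:
    """Summarize NI across analysis populations (HR scale, >1 favors control)."""
    results = {
        name: upper < margin
        for name, upper in (("ITT", itt_ci_upper), ("PP", pp_ci_upper))
        if upper is not None
    }
    if not results:
        return "no analysis reported"
    if len(results) == 1:
        name, met = next(iter(results.items()))
        return f"{name} only: NI {'met' if met else 'not met'} (uncorroborated)"
    if all(results.values()):
        return "NI met in both ITT and PP"
    if not any(results.values()):
        return "NI not met in either population"
    return "discordant: NI met in one population only"


print(ni_conclusion(1.30, 1.22, 1.27))  # NI met in both ITT and PP
print(ni_conclusion(1.30, 1.22, 1.34))  # discordant: NI met in one population only
print(ni_conclusion(1.30, 1.22, None))  # ITT only: NI met (uncorroborated)
```

A trial that reports only one population falls into the uncorroborated category, which leaves readers unable to tell a concordant result from a discordant one.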

Our results are similar to other NI reviews that have found a low reporting of margin justification in both oncology studies and randomized controlled trials in general.4,6 Median margins were also comparable to a previous study,5 but the margins in our study had a very wide range, with 5 studies using an HR margin >2.0. Under the classic method of determining appropriate margin sizes for equivalence, response rate is factored into the margin size, but generally, margin size should be <20%.13 The median margin size for studies using an HR was >20%, suggesting widespread use of overly liberal margin sizes. These results of high HR margins, in light of nearly 28% of studies that met NI having a point estimate >1, are concerning because they indicate that practitioners are willing to take the risk of less efficacy and, in some cases, a rather significant risk to achieve some other sort of benefit. In the 5 studies with an HR margin >2.0, no clear justification was given for the margin chosen, but the justification for conducting the NI study was that either less treatment was given or there were fewer postoperative complications.
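
To make the comparison with the 20% threshold explicit, a ratio margin can be read as the relative increase in hazard (or odds) that the trial is willing to tolerate. The sketch below applies that reading, which appears to be the one used in the paragraph above, to the smallest, median, and largest ratio margins reported in the Results; the function name and the constant are hypothetical.

```python
def relative_margin_pct(ratio_margin: float) -> float:
    """Relative increase in hazard (or odds) tolerated by a ratio margin, in %."""
    return round((ratio_margin - 1.0) * 100.0, 1)


THRESHOLD_PCT = 20.0  # rule of thumb cited in the text

for margin in (1.05, 1.30, 3.20):  # smallest, median, largest reported margins
    pct = relative_margin_pct(margin)
    flag = "below" if pct < THRESHOLD_PCT else "not below"
    print(f"margin {margin:.2f} -> {pct:.0f}% ({flag} the 20% threshold)")
```

On this reading, the median margin of 1.30 tolerates a 30% relative increase and the largest margin of 3.20 tolerates a 220% increase.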

The high NI margins are especially concerning when considering that the most common outcomes were overall and progression-free survival. Justifying less expense or better convenience becomes much more difficult when the trade-off is reduced survival or poorer health.

Even among studies that do report justification for the margin used, the margin is sometimes uninformative because of other biases in the study. For example, the choice of what to use in the control arm of an NI trial needs to be based on validated reasoning; otherwise, it is difficult to determine whether both drugs were either similarly effective or similarly ineffective.14 Use of controls that are not validated or controls approved on prior NI trials weakens the validity of the effectiveness of future drugs that may be approved with these types of studies because the combined difference in multiple trials (ie, bio-creep) may be undesirably high.15
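
Bio-creep can be illustrated with simple arithmetic. The sketch below assumes, as a simplification, that hazard ratios multiply along a chain of comparisons (C vs A equals C vs B times B vs A) and that each new drug is as bad as its margin allows; the function name and the two-step chain are hypothetical.

```python
def worst_case_cumulative_hr(margins):
    """Product of successive NI margins: the worst case each step could allow."""
    total = 1.0
    for m in margins:
        total *= m
    return total


# Hypothetical chain: drug B accepted vs A, then C vs B, each at margin 1.30.
print(round(worst_case_cumulative_hr([1.30, 1.30]), 2))  # 1.69
```

Under these assumptions, two successive trials that each tolerate a 30% increase in hazard could together tolerate a 69% increase relative to the original standard.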

Our findings show a slightly higher prevalence of oncology trials not reporting a justification for performing an NI study compared with a previous report (23% vs 17%).5 This prior study included trials that were published between 2001 and 2011, whereas ours were published between 2014 and 2018. Whether this difference is due to study search methods and the methodology of data extraction or fewer studies reporting justification in more recent years is unknown. We did see a higher percentage of NI studies not reporting justification in 2017 than in 2014, but the percentage was lowest in 2018.

Interestingly, for several of the reporting variables we examined, studies funded by industry tended to do better, possibly because it is common for these studies to be written in part by professional industry-paid writers who are familiar with elements of study reporting.16

Our study has several limitations. One is that the results may not be generalizable to NI trials as a whole. Oncology NI trials may be different from NI trials in other medical disciplines, such as cardiology, partly because of the more frequent use of composite outcomes in cardiology, which include softer endpoints and affect the NI margin required.17 Another limitation is that we only used information reported in the study, and if authors did not provide information or justification, even if there was unreported justification, we may not have categorized these studies correctly. The larger issue with this is that certain elements of the methods are important for interpreting the context of the study’s results and are openly recommended as key elements in the conducting and reporting of studies.18,19 Another limitation is that publication bias may have influenced our findings. We limited included articles to those published in the top 3 medical and top 3 oncology journals, which, although reducing the number of articles we analyzed, theoretically should have resulted in higher-quality studies and, consequently, more favorable outcomes. Finally, we do not know what the journal editors and reviewers required for publication of the NI trial, which may have influenced how data were reported (eg, ITT or PP analysis). Our analysis is focused on the reporting of the trial, but we did review the protocol for information not provided in the main manuscript.


Conclusions

Among NI studies in oncology, our findings showed that 28.0% do not provide adequate justification for conducting an NI study, and 68.1% do not provide adequate justification for the NI margin. Most comparisons, however, conclude that NI is met (59.6%). There is variation in the margin used to determine NI and the α level used in these types of analyses. These results suggest that there is room for improvement in the reporting of NI trials and in the conduct of these trials.


References

1. Suda KJ, Hurley AM, McKibbin T, et al. Publication of noninferiority clinical trials: changes over a 20-year interval. Pharmacotherapy 2011;31:833–839.
2. Gyawali B, Kesselheim AS. US Food and Drug Administration approval of new drugs based on noninferiority trials in oncology: a dangerous precedent? JAMA Oncol 2019;5:607–608.
3. Mounsey A, Viera AJ, Dominik R. 7 questions to ask when evaluating a noninferiority trial. J Fam Pract 2014;63:E4–8.
4. Rehal S, Morris TP, Fielding K, et al. Non-inferiority trials: are they inferior? A systematic review of reporting in major medical journals. BMJ Open 2016;6:e012594.
5. Riechelmann RP, Alex A, Cruz L, et al. Non-inferiority cancer clinical trials: scope and purposes underlying their design. Ann Oncol 2013;24:1942–1947.
6. Tanaka S, Kinjo Y, Kataoka Y, et al. Statistical issues and recommendations for noninferiority trials in oncology: a systematic review. Clin Cancer Res 2012;18:1837–1847.
7. Hwang TJ, Gyawali B. Association between progression-free survival and patients’ quality of life in cancer clinical trials. Int J Cancer 2019;144:1746–1751.
8. Moher D, Hopewell S, Schulz KF, et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ 2010;340:c869.
9. Aberegg SK, Hersh AM, Samore MH. Empirical consequences of current recommendations for the design and interpretation of noninferiority trials. J Gen Intern Med 2018;33:88–96.
10. Piaggio G, Elbourne DR, Altman DG, et al. Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement. JAMA 2006;295:1152–1160.
11. Dunn DT, Copas AJ, Brocklehurst P. Superiority and non-inferiority: two sides of the same coin? Trials 2018;19:499.
12. Snapinn SM. Noninferiority trials. Curr Control Trials Cardiovasc Med 2000;1:19–21.
13. Chow S, Song F. On selection of margin in non-inferiority trials. J Biom Biostat 2016;7:301.
14. Burotto M, Prasad V, Fojo T. Non-inferiority trials: why oncologists must remain wary. Lancet Oncol 2015;16:364–366.
15. Murthy VL, Desai NR, Vora A, et al. Increasing proportion of clinical trials using noninferiority end points. Clin Cardiol 2012;35:522–523.
16. Gøtzsche PC, Hróbjartsson A, Johansen HK, et al. Ghost authorship in industry-initiated randomised trials. PLoS Med 2007;4:e19.
17. Head SJ, Kaul S, Bogers AJ, et al. Non-inferiority study design: lessons to be learned from cardiovascular trials. Eur Heart J 2012;33:1318–1324.
18. European Medicines Agency. Committee for Medicinal Products for Human Use. Guideline on the Environmental Risk Assessment of Medicinal Products for Human Use. Doc. Ref. EMEA/CHMP/SWP/4447/00 corr 2. Accessed November 3, 2019.
19. Schumi J, Wittes JT. Through the looking glass: understanding non-inferiority. Trials 2011;12:106.

Submitted July 5, 2019; accepted for publication August 27, 2019

Author contributions: Study concept: Haslam, Prasad. Data acquisition and analysis: Haslam, Gill. Manuscript preparation: All authors.

Disclosures: Dr. Prasad has disclosed that he receives royalties from his book Ending Medical Reversal; his work is funded by the Laura and John Arnold Foundation; he has received honoraria for grand rounds/lectures from several universities, medical centers, and professional societies, and payments for contributions to Medscape; and he hosts the podcast “Plenary Session,” which has Patreon backers. The remaining authors have disclosed that they have not received any financial consideration from any person or organization to support the preparation, analysis, results, or discussion of this article.

Correspondence: Alyson Haslam, PhD, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, OR 97239. Email:
