Background
In time-to-event analyses, participants are censored when information on the outcome of interest is not available because the participants are no longer seen in follow-up. Common causes are withdrawal of consent and loss to follow-up (collectively, WCLFU); both end outcome ascertainment and therefore affect the interpretation of results in the same way. An increasing number of FDA approvals in oncology are based on trials in which the primary endpoint is a surrogate such as progression-free survival.1 Surrogate endpoints can be used as substitutes for definitive endpoints such as overall survival (OS) because they generally require smaller sample sizes and/or shorter follow-up times due to higher event rates. Censoring may be more prevalent with surrogate endpoints because assessment of these endpoints is dependent on protocol-mandated follow-up, such as regular imaging. In contrast, OS can be determined from medical records and registry data, even if a patient elects not to attend further follow-up visits.
Censoring is referred to as “informative” when the reasons for censoring are related to the study intervention,2 and this can introduce postrandomization bias. Early drug discontinuation (EDD), WCLFU, or initiation of a new anticancer therapy before documenting an event of interest can all result in informative censoring.3 Differential censoring between the experimental and control arms may also introduce bias, especially if it results in differences in patient characteristics between those who remain in a study and those who do not. For example, if an investigational agent causes substantial toxicity, a trial participant may be unable to attend scheduled follow-up and instead may withdraw consent, at which point the participant is censored, and subsequent events will not be captured. Therefore, only patients who can tolerate therapy are assessed for the outcome, biasing the results in favor of a treatment effect from the investigational drug. Prior studies have shown that progression rates differ between those who are censored and those who remain in a trial.2,4
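The mechanism described above can be illustrated with a minimal sketch. The cohort below is entirely hypothetical (10 patients with invented progression times), and the Kaplan-Meier routine is a bare-bones implementation written for this illustration rather than the method used in any cited trial. It compares the estimate obtained when every progression is observed with the estimate obtained when two patients withdraw shortly before they would have progressed.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each event time.

    times  -- follow-up time for each patient
    events -- True if the event was observed at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]  # everyone leaving the risk set at t
        n_events = sum(tied)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= len(tied)
        i += len(tied)
    return curve


def survival_at(curve, t):
    """Estimated event-free probability at time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s


# Hypothetical arm: 10 patients who progress at these months (all observed).
true_times = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
complete = kaplan_meier(true_times, [True] * 10)

# Same patients, but the two who would progress at months 4 and 8 withdraw
# consent one month earlier and are censored instead (informative censoring).
observed_times = [2, 3, 6, 7, 10, 12, 14, 16, 18, 20]
observed_events = [True, False, True, False, True, True, True, True, True, True]
censored = kaplan_meier(observed_times, observed_events)

pfs_complete = survival_at(complete, 10)
pfs_censored = survival_at(censored, 10)
```

In this invented cohort, the estimated 10-month event-free probability rises from 0.50 with complete follow-up to about 0.66 once the two imminent progressions are replaced by censoring, which is the direction of bias the paragraph describes.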
Censoring rules vary between trials, and there are no defined standards for oncology trials. Although the FDA guidelines specify that censoring rules should be defined clearly and their impact explored through sensitivity analyses, there are no prescriptive guidelines on appropriate censoring rules or the required sensitivity analyses.5 Case studies have shown that by varying the censoring assumptions, the trial conclusions can change.6,7 Modeling has also shown that failure to account for informative censoring in phase II8 and III9 studies in oncology may bias results. Despite this, the magnitude of the problem of informative censoring, EDD, and WCLFU in oncology is not well quantified.
The aim of this study was to describe the planned handling and reporting of censoring in oncology trials supporting FDA approval and to quantify and compare WCLFU and EDD in experimental and control arms. We hypothesized that censoring rules vary across oncology trials and that differences exist in WCLFU and EDD between the experimental and control arms.
Methods
Study Inclusion Criteria and Data Abstraction
We searched the FDA archives10 to identify randomized controlled trials (RCTs) supporting drug approvals by the FDA for solid organ malignancies (excluding lymphoma) from January 2015 through December 2019. We then identified the primary publication associated with each approval by searching MEDLINE. For each included RCT, we extracted the following data: tumor site, year of approval, study phase, number of patients in each arm, randomization ratio, hazard ratio for the primary outcome, class of drug (grouped into immunotherapy, chemotherapy, monoclonal antibodies, tyrosine kinase inhibitors, androgen receptor blockers, targeted therapies [including PARP inhibitors, CDK 4/6 inhibitors, mTOR inhibitors, and antibody–drug conjugates], and others), and treatment intent (curative/adjuvant/neoadjuvant vs palliative). Trials examining biosimilars, noninferiority trials, and single-arm or noncomparative trials were excluded. In the event of multiple publications on the same study, the initial publication supporting FDA approval was used for data extraction.
Ascertainment of Censoring Plans and Reporting of Censoring Rates
We examined the protocol and supplemental appendices of each study to determine whether the censoring plan was reported. In the absence of any defined standards in oncology, we considered censoring rules sufficient if there was unequivocal description of the handling of participants starting new anticancer therapy, participants with ≥2 missed outcome assessments, whether clinical progression was considered a censoring event, and whether noncancer death was considered a censoring event in the case of surrogate endpoints. We then tabulated the total number of patients censored in the experimental and control groups for each reported outcome and calculated the difference in censoring rates between the 2 groups.
Ascertainment of WCLFU and EDD
We used 2 definitions of censoring in the context of EDD. First was a conservative estimate including only patients clearly reported as WCLFU. However, in oncology studies, if a patient is not treated, discontinues a drug early, or starts a different therapy before developing the outcome of interest, that patient is often censored.3 Therefore, we included a second, broader definition including any causes of EDD other than progression or death.
The total number of patients with WCLFU was extracted from the study CONSORT diagrams depicting causes of EDD. If this was not reported clearly in the CONSORT diagram, the article and supplemental tables were reviewed. Patients lost to follow-up because of death were not included in the WCLFU group. If WCLFU was not reported clearly for the group or subgroup on which the approval was based, the study was excluded. If the study had more than 2 arms, WCLFU was extracted for the arm supporting drug approval and the respective control group. Randomized participants with WCLFU before treatment initiation were included in the WCLFU group. For the studies that included separate information on WCLFU at the time of the final analysis, this information was also collected.
From the CONSORT diagram, we tabulated the causes of EDD as objective progression (as defined by the study’s primary endpoint), investigator-determined progression, clinical progression (without radiographic confirmation), death, adverse effects (AEs) of any cause, and other (including WCLFU, clinician or patient decision to cease therapy, protocol deviation, nonadherence, and any other reasons not meeting the other definitions). If the study included a combination treatment (eg, investigational agent combined with standard-of-care agent), and if data were presented separately for discontinuation of the 2 drugs, we extracted the higher of the 2 discontinuation numbers. Randomized participants who stopped therapy for any reason (apart from death or progression) were included in the EDD definition.
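The two definitions can be sketched as follows. The category labels, the function names, and the counts are hypothetical and chosen only for illustration; in particular, WCLFU is kept as its own key here, whereas the text groups it under "other" when tabulating causes of EDD.

```python
# Reasons that count as outcome events rather than early discontinuation.
# These labels are hypothetical stand-ins for the CONSORT groupings in the text.
PROGRESSION_OR_DEATH = {
    "objective_progression",
    "investigator_progression",
    "clinical_progression",
    "death",
}


def conservative_wclfu(reasons):
    """Conservative definition: only patients clearly reported as WCLFU."""
    return reasons.get("wclfu", 0)


def broad_edd(reasons):
    """Broad definition: any discontinuation other than progression or death."""
    return sum(n for reason, n in reasons.items()
               if reason not in PROGRESSION_OR_DEATH)


def combination_edd(drug_a, drug_b):
    """For a combination regimen reported per drug, take the higher count."""
    return max(broad_edd(drug_a), broad_edd(drug_b))


# Hypothetical discontinuation counts for one arm of 200 randomized patients.
arm = {
    "objective_progression": 90,
    "clinical_progression": 10,
    "death": 8,
    "adverse_event": 22,
    "wclfu": 6,
    "other": 9,
}
n_randomized = 200

wclfu_pct = 100 * conservative_wclfu(arm) / n_randomized  # conservative estimate
edd_pct = 100 * broad_edd(arm) / n_randomized             # broad estimate
```

With these invented counts, the conservative definition gives 3.0% and the broad definition gives 18.5%, which shows how far apart the two estimates can sit for the same arm.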
Statistical Analysis
We compared the proportions of patients with WCLFU and EDD between the experimental and control arms by using a generalized estimating equation, assuming an independent correlation matrix and using robust standard errors. This method accounted for correlated data within studies, because WCLFU and EDD for the experimental and control arms within any given trial are more similar than the WCLFU and EDD between trials due to trial-specific factors (eg, trial design, type of cancer, inclusion criteria). We also calculated the mean percentage differences between the 2 groups. We performed univariable logistic regression to determine trial factors associated with WCLFU in the control group greater than or equal to WCLFU in the experimental group, and EDD in the control group greater than or equal to EDD in the experimental group. Multivariable analyses were not planned, because there were insufficient outcome data to fit a multivariable model adequately. All analyses were performed using Stata, version 12.0 (StataCorp LP). Statistical significance was defined as P<.05. No corrections were applied for multiple significance testing.
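The idea behind clustering on trial can be sketched in a simplified setting. The example below assumes an identity link, one observation per arm per trial, and no small-sample correction; the link function and model specification actually used are not stated above, so this is an illustration of the principle rather than a reproduction of the analysis. Under these assumptions, the coefficient for arm equals the difference in mean percentages, and the cluster-robust (sandwich) variance reduces to the sum of squared deviations of the within-trial differences divided by the squared number of trials. The function name and all percentages are hypothetical.

```python
from math import sqrt


def paired_arm_effect(pct_experimental, pct_control):
    """Arm effect and trial-clustered robust SE, one observation per arm per trial.

    Assumes an identity link and an independence working correlation.
    beta      = mean of within-trial differences d_i (control minus experimental)
    robust SE = sqrt(sum((d_i - beta)^2)) / n, with no small-sample correction
    """
    diffs = [c - e for e, c in zip(pct_experimental, pct_control)]
    n = len(diffs)
    beta = sum(diffs) / n
    robust_se = sqrt(sum((d - beta) ** 2 for d in diffs)) / n
    return beta, robust_se


# Hypothetical WCLFU percentages for 4 trials (not data from this study).
wclfu_experimental = [2.0, 1.0, 3.0, 0.5]
wclfu_control = [4.0, 1.0, 5.0, 2.5]

beta, se = paired_arm_effect(wclfu_experimental, wclfu_control)
ci_low, ci_high = beta - 1.96 * se, beta + 1.96 * se  # Wald 95% CI
```

Because the standard error depends only on how much the within-trial differences vary, trial-level factors shared by both arms (design, tumor type, eligibility) cancel out, which is the rationale given above for accounting for within-study correlation.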
Results
Trial Selection and Characteristics
From January 2015 through December 2019, there were 125 unique FDA approvals for solid tumor malignancies based on 131 studies. Fifty studies were excluded, leaving 81 studies in the final analysis (Figure 1). Characteristics of the 81 included studies are reported in supplemental eTable 1 (available with this article at JNCCN.org).
Figure 1. Schema for study inclusion.
Abbreviation: WCLFU, withdrawal of consent or loss to follow-up.
Citation: Journal of the National Comprehensive Cancer Network 19, 12; 10.6004/jnccn.2021.7015
Censoring Rules
A summary of censoring rules, WCLFU, and EDD is shown in Table 1. All included studies were analyzed according to the intention-to-treat method. Censoring rules were defined adequately in 59% (n=48) of studies. Although 50% (n=41) of included studies censored patients upon initiation of a new anticancer therapy if their disease had not already progressed, 20% (n=16) did not, and 30% (n=24) provided insufficient information. Almost half (47%; n=38) of studies censored patients if they had missed ≥2 scheduled assessments, even if there was confirmed progression or death thereafter. All studies censored patients at final analysis if they remained in the study but had not yet experienced the outcome of interest. In one study in which the primary endpoint was metastasis-free survival (MFS), patients were censored if they died before developing metastasis. In all other studies with a surrogate endpoint, death of any cause was considered an event. Most studies (67%; n=54) described a planned sensitivity analysis regarding censoring rules. However, only a few studies (3.7%; n=3) presented a sensitivity analysis in the primary publication.
Information Reported on WCLFU, EDD, and Informative Censoring
Fourteen studies (17%) presented censoring rates over time (supplemental eTable 2). Detailed information regarding the causes of censoring was not provided, making further interpretation of these data difficult.
WCLFU and EDD
Among the 81 included studies, 69 provided sufficient information on causes for EDD (Table 2). Among these studies, the proportion of patients with EDD due to WCLFU was numerically higher in the control arm than in the experimental arm in 51% of studies (n=35), equal in 25% (n=17), and lower in the control arm in 25% (n=17) (mean proportion, 3.9% in control vs 2.5% in experimental; β-coefficient, −2.2; 95% CI, −3.1 to −1.3; P<.001). Although the mean difference in the proportion of patients with WCLFU between the experimental and control arms was small (1.4%), the range was broad (−29.2% to 4.3%), with a number of studies (n=8; 11.5%) having WCLFU >5% higher in the control group than in the experimental group (supplemental eFigure 1). The proportion of patients not treated after randomization was small in both arms (mean, 0.7% in experimental and 2.1% in control; β-coefficient, 1.8; 95% CI, 1.53–1.99; P<.0001), but the proportion was higher in the control arm in 51% of studies. The proportion with EDD for any reason was not statistically different between the experimental and control arms (mean, 21.6% vs 19.9%; β-coefficient, 0.27; 95% CI, −0.32 to 0.87; P=.37). The proportion of patients discontinuing treatment early due to death was similar between the control and experimental arms (P=.14). The proportion of patients discontinuing therapy early due to AEs was higher in the experimental arm (mean, 13.2% vs 8.5%; β-coefficient, 1.5; 95% CI, 0.57–2.45; P=.002).
Table 2. Comparison of WCLFU and Causes of EDD Between Experimental and Control Arms (n=69)
The odds of WCLFU in the control group being greater than or equal to WCLFU in the experimental group were significantly higher in studies with an active control group than in those with a placebo control group (odds ratio [OR], 10.1; P<.001) (Table 3). These odds were also numerically higher in open-label studies (OR, 3.00; P=.08), although this did not reach statistical significance. Studies with an active control group also had higher odds of EDD in the control group being greater than or equal to EDD in the experimental group (OR, 8.5; P=.007) (supplemental eTable 3), as did open-label studies (OR, 7.6; P<.001).
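For readers less familiar with this type of estimate: when a logistic regression has a single binary predictor, the fitted odds ratio equals the cross-product ratio of the corresponding 2×2 table (provided no cell is empty). The sketch below computes that ratio with a Wald 95% CI. The counts are hypothetical and are not taken from Table 3, and the function name is introduced only for this illustration.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI from a 2x2 table with no empty cells.

                        outcome yes   outcome no
    factor present           a             b
    factor absent            c             d
    """
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical trial counts. Outcome: WCLFU in control >= WCLFU in experimental.
# Active-control trials: 20 with the outcome, 5 without.
# Placebo-control trials: 10 with the outcome, 10 without.
or_active, ci_low, ci_high = odds_ratio(20, 5, 10, 10)
```

With these invented counts the odds ratio is 4.0. The interval is wide because each cell holds few trials, which is the same constraint that precluded multivariable modeling in this study.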
Table 3. Trial Characteristics Predicting WCLFU, Control ≥ Experimental
Thirteen studies (16%) provided data regarding WCLFU at the time of final analysis (supplemental eTable 4). On average, the proportion of patients no longer in study follow-up due to WCLFU at the time of final analysis was 5.8% (range, 0.7%–11.8%) in the experimental group and 8.5% (range, 1.1%–19.2%) in the control group (mean difference, −2.7%; range, −13.2% to 6.4%). The proportion in the control group was greater than or equal to the proportion in the experimental group in 12 (92%) of 13 studies.
Discussion
WCLFU and EDD for AEs differed significantly between the experimental and control arms in oncology trials supporting FDA drug approvals from 2015 through 2019. Although the mean absolute difference in the proportion of patients with WCLFU between the experimental and control arms was small (1.4%), the range was broad, with >10% of studies having WCLFU in the control arm that was >5% higher than in the experimental arm. The larger the difference in WCLFU between the experimental and control arms, the greater the potential for postrandomization bias. In such studies, sensitivity analysis that varies the outcomes among censored patients should be presented to test the robustness of the results.
The proportion of patients discontinuing therapy early for any cause (other than progression or death) was higher in the experimental group in 61% of studies, driven by the higher proportion of patients stopping treatment in the experimental arm due to AEs. This is in keeping with prior research11 examining FDA approvals between 2000 and 2010, which demonstrated higher rates of EDD for AEs in the experimental group. These differential rates of EDD for AEs between the experimental and control groups are also potential sources of bias if subsequent events are not captured by the trial.
Whether censoring results in overestimation or underestimation of benefit of investigational therapy depends on the magnitude of differential censoring and its causes. For example, if a patient treated with an investigational agent experiences an AE resulting in EDD, an alternative drug may be initiated. In most studies, initiating nonprotocol therapies resulted in censoring, and subsequent events would not be captured. This could bias the results in favor of an investigational agent, especially because patients who are censored have higher rates of progression than those who remain in a trial.2,4 In contrast, a patient may enroll in a study in the hope of receiving treatment with a promising investigational agent. If the study is open label, participants randomized to the control group may withdraw consent, at which point they would be censored, and subsequent progression would not be captured. In our data, the odds of WCLFU in the control group being greater than or equal to WCLFU in the experimental group were 3-fold higher in open-label studies than in blinded studies, although this difference did not reach statistical significance (OR, 3.00; P=.08).
Of concern is that only 59% of studies provided sufficient information regarding the censoring rules used in their analysis. This leaves clinicians to infer the potential bias introduced by informative censoring. Among the minority of studies that clearly presented the proportions of patients censored in each arm over time, the proportion censored at final analysis was, on average, higher in the experimental group than in the control group for all analyzed endpoints. This is not surprising: the number of censored participants who remain event-free at the time of final analysis is expected to be higher in the experimental arms of trials supporting registration of effective drugs, and this form of censoring does not introduce bias. However, early censoring of patients due to WCLFU or other protocol-defined censoring criteria may introduce postrandomization bias, especially if this is not balanced between the experimental and control groups. This highlights the importance of reporting the reasons for censoring in clinical trials.
Although 67% of studies proposed a sensitivity analysis of censoring rules, few (n=3) reported the results of these analyses. The darolutamide trial illustrates why such analyses matter. Darolutamide gained approval based on improvement in MFS among men with nonmetastatic castration-resistant prostate cancer. In the experimental group, 23.1% of patients had an MFS event compared with 39% in the control group, and the remaining patients were censored at the time of final analysis. Early censoring (typically due to death before metastasis, WCLFU, or initiation of new anticancer therapy) occurred in 20.2% of patients in the control group compared with 6.4% in the experimental group, suggesting informative censoring. Although not all cases of informative censoring impact the interpretation of trial results, they can alter the magnitude of expected benefits and change the number needed to treat to observe benefit. This could then alter decision-making related to the balance between benefit and risk, and could also impact cost–benefit analyses required by healthcare payers.
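The effect on the number needed to treat (NNT) can be sketched with crude bounds built from the percentages quoted above. This is a rough illustration under strong assumptions, not a re-analysis of the trial: it treats the reported percentages as simple proportions, ignores the timing of events, and considers only the two extreme scenarios in which every early-censored patient in one arm had an MFS event and none in the other arm did. The function name is hypothetical.

```python
def nnt(event_pct_control, event_pct_experimental):
    """Number needed to treat from crude event percentages (ignores time)."""
    return 100 / (event_pct_control - event_pct_experimental)


# Percentages quoted in the text for the darolutamide trial.
events_exp, events_ctrl = 23.1, 39.0
early_censored_exp, early_censored_ctrl = 6.4, 20.2

# As reported: absolute difference of 15.9 percentage points.
nnt_observed = nnt(events_ctrl, events_exp)

# Scenario least favorable to the drug: all early-censored patients in the
# experimental arm had an event, and none in the control arm did.
nnt_least_favorable = nnt(events_ctrl, events_exp + early_censored_exp)

# Scenario most favorable to the drug: all early-censored patients in the
# control arm had an event, and none in the experimental arm did.
nnt_most_favorable = nnt(events_ctrl + early_censored_ctrl, events_exp)
```

Under these assumptions the crude NNT is about 6.3 as reported, about 10.5 in the least favorable scenario, and about 2.8 in the most favorable one. The direction of benefit is unchanged across the bounds, but its magnitude varies almost 4-fold.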
Several approaches to managing WCLFU and informative censoring have been proposed.3 These include using endpoints less susceptible to censoring bias, such as OS; using methods to improve retention in trials; applying the intention-to-treat principle even to patients who discontinue the study intervention; improving the transparency of reporting of censoring in trials; and performing sensitivity analyses for best case and worst case scenarios among censored patients. In light of the high proportion of included RCTs that provided insufficient information to determine causes and rates of censoring and their impact, we suggest several additional recommendations to improve transparency and data availability in oncology trials (Table 4).
Table 4. Goals and Recommendations to Improve Transparency and Reporting of WCLFU, EDD, and Censoring Information
This study has limitations. First, only 85% of trials presented data sufficient to be included in the analysis of EDD and WCLFU, and the exclusion of 15% of trials could introduce bias in our results. However, the finding that 15% of studies presented information on causes of EDD insufficient to be included is itself a concern. Even among trials that did report causes of EDD, WCLFU was not always clearly reported and may have been included in the EDD category “other.” Most studies presented WCLFU as a cause of EDD in their CONSORT diagrams. However, if a patient first discontinued for another reason (eg, an AE) and was then WCLFU, this would not be captured in the CONSORT diagram. We present data for WCLFU at the time of final analysis; however, this information was only available for 13 studies. For these reasons, our estimates of WCLFU should be viewed as conservative. Second, because only 81% of studies included a readily available protocol or statistical plan for review, we could not determine the censoring rules used in 19% of the included studies, and this missing information may also introduce bias in our analysis. Third, because we only examined studies resulting in FDA approval, this study may not capture the full breadth of censoring issues that occur in oncology trials, particularly in studies that are not used to support drug registration. Fourth, in phase II studies, safety EDD may be included in the composite primary outcome measure. However, because our eligibility criteria included only studies with time-to-event primary endpoints, this did not apply to our small cohort (n=9) of phase II trials. Finally, our study is limited by the relatively small numbers of studies included, and, as a result, we were unable to perform multivariable analyses.
Conclusions
In oncology studies supporting FDA approval, there are significant differences in the rates of WCLFU and EDD for AEs between the experimental and control arms, which could introduce postrandomization bias. This study provides objective evidence of the need to report censoring in a more transparent manner and to report sensitivity analysis using alternative censoring rules. This will improve clarity for clinicians and patients when making treatment decisions and for payers making reimbursement decisions.
References
1. Beaver JA, Howie LJ, Pelosof L, et al. A 25-year experience of US Food and Drug Administration accelerated approval of malignant hematology and oncology drugs and biologics: a review. JAMA Oncol 2018;4:849–856.
2. Ranganathan P, Pramesh CS. Censoring in survival analysis: potential for bias [letter]. Perspect Clin Res 2012;3:40.
3. Templeton AJ, Amir E, Tannock IF. Informative censoring—a neglected cause of bias in oncology trials. Nat Rev Clin Oncol 2020;17:327–328.
4. Stone AM, Bushnell W, Denne J, et al. Research outcomes and recommendations for the assessment of progression in cancer clinical trials from a PhRMA working group. Eur J Cancer 2011;47:1763–1771.
5. U.S. Food and Drug Administration. Clinical trial endpoints for the approval of cancer drugs and biologics: guidance for industry. Accessed April 1, 2020. Available at: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/clinical-trial-endpoints-approval-cancer-drugs-and-biologics
6. Templeton AJ, Ace O, Amir E, et al. Influence of censoring on conclusions of trials for women with metastatic breast cancer. Eur J Cancer 2015;51:721–724.
7. Prasad V, Bilal U. The role of censoring on progression free survival: oncologist discretion advised. Eur J Cancer 2015;51:2269–2271.
8. Campigotto F, Weller E. Impact of informative censoring on the Kaplan-Meier estimate of progression-free survival in phase II clinical trials. J Clin Oncol 2014;32:3068–3074.
9. Denne JS, Stone AM, Bailey-Iacona R, et al. Missing data and censoring in the analysis of progression-free survival in oncology clinical trials. J Biopharm Stat 2013;23:951–970.
10. U.S. Food and Drug Administration. Hematology/Oncology (cancer) approvals & safety notifications [updated January 9, 2020]. Accessed April 2020. Available at: https://www.fda.gov/drugs/resources-information-approved-drugs/hematologyoncology-cancer-approvals-safety-notifications
11. Niraula S, Seruga B, Ocana A, et al. The price we pay for progress: a meta-analysis of harms of newly approved anticancer drugs. J Clin Oncol 2012;30:3012–3019.