Introduction
The paper describes the proposed design of a Phase I trial concurrently evaluating the safety of a hematopoietic progenitor kinase-1 inhibitor (Agent A) as a single agent and in combination with an anti-PD-1 agent in patients with advanced malignancies. This work was developed through collaboration and mentoring between the authors at the American Association for Cancer Research (AACR) and the American Society of Clinical Oncology (ASCO) annual AACR/ASCO Methods in Clinical Cancer Research Workshop. The study’s primary objective was to concurrently determine the maximum tolerated dose (MTD) of Agent A with and without anti-PD-1 therapy, among the possible study dose levels provided in Table 1. The starting dose level was 60 mg once daily. The MTD in each row of Table 1 was defined as the study dose level with a dose-limiting toxicity (DLT) rate closest to the prespecified target DLT rate of 30%. In this trial, several assumptions drive the design considerations. First, the DLT probability increases as Agent A’s dose increases (i.e., across rows of Table 1). Second, the addition of anti-PD-1 is accompanied by additional DLT-qualifying adverse events to those related to Agent A which increases the likelihood of DLT of the combination therapy relative to the monotherapy when holding the dose of Agent A constant (i.e., up columns of Table 1). Therefore, before the study, clinical information would indicate that the estimated MTD of Agent A without anti-PD1 therapy should not be below the estimated DLT with anti-PD-1 treatment.
A straightforward way to design this study would be to conduct parallel dose-finding trials to identify the MTD of Agent A both with and without anti-PD-1 therapy. In this case, a method intended for single-agent evaluation, such as the 3 + 3 algorithm, could be applied independently to each row of Table 1. Conducting independent trials for each row is a simple approach to the problem described, but it is a suboptimal way to proceed because doing so ignores the available clinical information between the two treatment strategies. This approach often leads to a reversal, a situation in which MTD recommendations violate the known order of toxicity risk by row [Reference Horton, O'Quigley and Conaway1]. Reversals are problematic because the final MTD selection for Agent A with anti-PD-1 therapy is a higher dose level than the MTD for Agent A without anti-PD-1 treatment, which is assumed to be less toxic. In this study, either the MTD in each row should be at the same dose of Agent A or the MTD with anti-PD-1 should be below the MTD without Agent A. A more efficient approach would involve adaptively borrowing information across rows of Table 1 to allow safety data from Agent A with ant-PD-1 therapy to inform dose assignments for Agent A without anti-PD-1 treatment and vice versa.
Several existing dose-finding methods for drug combinations could be applied to this problem. The sequential continual reassessment method (CRM [Reference O’Quigley, Pepe and Fisher2]) proposed by Yin and Yuan [Reference Yuan and Yin3] converts the two-dimensional dose-finding trial to a series of one-dimensional dose-finding subtrials by fixing the dose level of one drug, but they suggest conducting the subtrials sequentially. An adaptive approach to this problem was developed by Braun and Jia [Reference Braun and Jia4]. However, neither method has available software for simulation or implementation. Ivanova and Wang [Reference Ivanova and Wang5] applied bivariate isotonic regression to estimate multiple MTD in drug combination studies. An editorial in the Journal of Clinical Oncology by Mandrekar [Reference Mandrekar6] described the use of the method of Ivanova and Wang [Reference Ivanova and Wang5] in a real phase I study aiming to identify multiple MTD [Reference Gandhi, Bahleda and Tolaney7]. This method also does not have accompanying software for simulation or implementation. The product of independent beta priors design, proposed by Mander et al. [Reference Mander and Sweeting8], enables the estimation of a maximum tolerated contour, a set of dose combinations with acceptable toxicity profiles. The product of independent priors design has an R package (pipe.design) available for download on the Comprehensive R Archive Network and accompanying R shiny web application. However, this method is not explicitly geared toward finding an MTD combination in each row of a two-dimensional grid. It may recommend any number (> 2) MTD combinations for further testing in middle development. The two-dimensional design of Wages [Reference Wages9] has demonstrated good statistical properties compared to the Wang and Ivanova [Reference Ivanova and Wang5] method. The method has available R code for simulation upon request from the first author. We will use this method to illustrate the advantages of using a design that efficiently borrows safety data across the dose levels of Agent A with and without the addition of anti-PD1 therapy. The statistical properties of such an approach will be contrasted with a framework that conducts independent parallel trials using a design for single-agent evaluation, such as 3 + 3 decision rules or a Bayesian optimal interval (BOIN) method [Reference Liu and Yuan10], in each row.
Methods
A participant is classified as experiencing DLT based on protocol-specific adverse event definitions occurring during the first treatment cycle. As data accumulate, each participant is classified as to whether they experienced a DLT (yes/no). Adverse events were to be graded using the National Cancer Institute (NCI) Common Terminology Criteria (CTCAE) Version 5.0. The MTD of Agent A in each row of Table 1 is defined as the dose combination with a DLT rate closest to the target rate of 30%.
It is assumed that the MTD for Agent A in each row is either (i) at the same dose of the anti-PD-1 therapy, or (ii) the MTD for Agent A is one dose level lower with the addition of anti-PD-1 therapy than without the anti-PD-1 treatment. For instance, if the MTD dose of Agent A without anti-PD-1 therapy is estimated to be dose level 4 (480 mg). The estimated MTD of Agent A with anti-PD-1 treatment will also be dose level 4 (480 mg) or will be shifted one level lower to dose level 3 (240 mg). The truth could be either of these possibilities. We want to account for this uncertainty in the design by using the data to estimate the relative location of the MTD between rows. A similar strategy has been implemented in methods that account for patient heterogeneity [Reference O’Quigley and Iasonos11,Reference Horton, Wages and Conaway12]. The relative location of the MTD between rows is illustrated in Table 2.
MTD, maximum tolerated dose.
Estimation and Allocation
The design is based on utilizing a class of working models corresponding to relative locations (shifts) of the MTD in each row of the drug combination grid (Table 2). The uncertainty associated with the relationship between the MTD in each row is expressed through the specification of multiple one-parameter models in Table 3 that reflect different locations of the MTD of Agent A with and without anti-PD-1 therapy. Model selection techniques are used to choose the model most consistent with the observed data. A common model choice [Reference Paoletti and Kramar13] in the CRM is to raise a set of initial DLT probability estimates, also referred to as the “skeleton” of the model, to a power a that is a parameter to be estimated by the data. A different skeleton is specified for each possible MTD shift between the dose levels (i.e., shift = 0 or shift = −1). The skeleton values displayed in Table 3 for each model were generated using the algorithm of Lee and Cheung [Reference Lee and Cheung14]. Using all accumulated toxicity data, the design fits the CRM for each DLT probability working model, and the parameter a is estimated for each model by maximum likelihood estimation. The working model with the largest likelihood is chosen, and DLT probability estimates are updated for each dose level using the selected model. If there is a tie between the highest likelihood values of two or more models, then the selected model is randomly chosen from among those with tied likelihood values. The updated DLT probability estimates are used to identify the dose of Agent A in each row with a DLT rate closest to the target rate of 30%. The design would then randomize the next patient to one of the two recommended dose levels with equal probability.
MTD, maximum tolerated dose.
Getting the Trial Underway
Likelihood-based methods, like the two-stage CRM, fail to have a solution until at least one DLT and one non-DLT have been observed in the study [Reference O’Quigley and Shen15]. Thus, an initial dose escalation scheme must be defined before the study begins that will be used until one DLT and one non-DLT occur in the study. This initial design essentially amounts to prespecifying a path within the drug combination grid to adhere to get the trial underway. This process could, of course, be carried out in several ways and could include some randomization if more than one dose level could be considered for escalation, as is often the case in drug combination studies. In this design, the trial would start at the lowest dose level and escalate along the bottom row in cohorts of size one until the first DLT occurs. The design is flexible enough to accommodate other starting dose levels if necessitated in a specific trial situation. As long as no DLT is observed in participants not assigned to anti-PD-1, escalation would continue at the lowest dose level of Agent A with anti-PD-1 therapy. After the first DLT, the modeling stage begins, which proceeds according to the estimation and allocation procedure described above.
Results
Illustration of the Method
In this section, we illustrate the behavior of the method described in this article under a set of true DLT probabilities for the 2 × 7 example described in Section 2. The target DLT rate is 30%, and the total sample size is N = 39, which was chosen using the sample size calculator for CRM trials [Reference Cheung16] using the function getn in R package dfcrm [ Reference Cheung17] with the following specifications; (1) the desired average probability of correction selection (PCS) of 50%, (2) target DLT rate of 30%, (3) the number of test doses equal to 14 (i.e., 7 dose levels in 2 rows), and (4) effect size equal to 1.78. The set of true DLT probabilities for row 1 are {0.01, 0.12, 0.18, 0.21, 0.22, 0.31, 0.60}, indicating that dose level 6 (i.e., Agent A at 1200mg without a-PD-1 therapy) is the true MTD in row 1. For row 2, the true probabilities are {0.09, 0.14, 0.22, 0.25, 0.32, 0.40, 0.64}, indicating that dose level 5 (i.e., Agent A at 800 mg with a-PD-1 therapy) is the true MTD in row 2. The relative location of these true MTD combinations indicates a shift of one dose level between the two rows. The method embodies the characteristics of the CRM, so we appeal to its features in specifying design parameters. The skeleton values were chosen by the algorithm of Lee and Cheung [Reference Lee and Cheung14] and are provided in Table 3. We assumed that each working model was equally likely to represent the true shift between the MTD combinations at the beginning of the trial.
The data from the entire simulated trial are provided in Table 4. The first four eligible participants are administered escalating dose levels along row 1, and 0 DLT are observed on doses of 60, 120, 240, and 480 mg of Agent A without anti-PD-1 therapy. The first DLT occurs in participant 5 on a dose of 800 mg of Agent A without anti-PD-1, at which point the modeling stage begins. Even with these limited data, the true shift between MTD combinations is estimated to be one dose level apart. Combinations (480 mg with a-PD-1 therapy) and (800 mg without a-PD-1 therapy) are indicated to have an estimated DLT rate closest to the target rate. Patient 6 is randomized with probability 1/2 to one of these combinations, which yields a recommendation of (480 mg with anti-PD-1 therapy), on which they do not experience DLT. It is important to note that the dose-limiting probabilities are updated in row 2, even though we have yet to observe a patient in this row, illustrating the formal borrowing of information across rows afforded by the model. Overall, in this simulated trial (Table 4), N = 39 patients were treated, yielding final MTD combination recommendations of (800 mg with anti-PD-1 therapy) and (1200 mg without anti-PD-1 treatment).
DLT, dose-limiting toxicity; MTD, maximum tolerated dose.
Operating Characteristics Over a Small Set of Curves
We compared the operating characteristics of three competing approaches for identifying MTD of Agent A both with and without anti-PD-1 therapy. The first approach is to run independent 3 + 3 algorithms in each row. The second approach is to run independent BOIN designs in each row. The third approach was the CRM design for identifying multiple MTD combinations using a shift model described above.
The operating characteristics of the three methods were compared by simulating 1000 trials under six dose-toxicity cases of a 2 × 7 dose combination grid with varying positions of actual MTDs, as shown in Table 5. We randomly generated the dose–toxicity curves using the method of Conaway and Petroni [Reference Conaway and Petroni18], with the constraint that the MTD in each row are the same or one level apart. The target DLT rate that defines the MTD is set to 30% in all scenarios. The sample size is N = 39 for the CRM shift model design, and n = 20 in each row (total N = 40) for the independent BOIN designs. The sample size for the two separate 3 + 3 trials is estimated from the simulation studies. Throughout the simulation studies, participants are assigned to doses in cohorts of size one in the independent BOIN and shift model approaches. In the specification of our working models, we restrict the possible shift between MTD to be either 0 or −1, indicating that we do not allow for more than a one-dose-level shift between two adjacent rows of the matrix; these shifts represent the clinical setting in which we are operating. However, shifts of two or more are possible, and the CRM shift model can handle such possibilities by including more working models, as demonstrated in Wages [Reference Wages9]. In general, our goal is to evaluate (i) how well each method locates MTD at and around the target DLT rate in each row (percentage of correct recommendation; PCR) and (ii) how well each method allocates patients to combinations at and around the target DLT rate in each row (proportion of correct allocation; PCA), and (iii) how many times did the independent designs “reverse” the order of the MTD (i.e., chooses a higher MTD with anti-PD-1 than without anti-PD-1)? This percentage will be 0% for the CRM design since the shift model structure prevents it from happening, and thus it is not possible.
BOIN, Bayesian optimal interval; MTD, maximum tolerated dose.
Tables 5 and 6 show the operating characteristics of the three methods under the six dose-toxicity cases. Table 7 reports the proportion of trials correctly identifying neither, one, or both MTDs. In case I, the shift model method yields a higher PCR than the parallel BOIN and 3 + 3 approaches in both of the rows (40.5% vs. 33.1% vs. 14.1% in row 1; 40.0% vs. 20.3% vs. 9.2% in row 2) and an average PCR of 40.25% across both rows, compared with 11.65% for the independent 3 + 3 approach and 26.7% for the parallel BOIN approach. For patient allocation, the proposed method allocates the highest proportion of patients to the true MTD in both of the rows in case I (0.14 vs. 0.10 vs. 0.04 in row 1; 0.13 vs. 0.08 vs. 0.04 in row 2) and yields the highest overall PCA (0.27) than parallel 3 + 3 and BOIN algorithms (0.08 and 0.10, respectively). The average sample size is 31.8 for the parallel 3 + 3 design, 39.6 for the parallel BOIN designs, and 39.0 for the shift model approach in case I – the parallel 3 + 3 approach results in a reversal in 29.2% of simulated trials and the parallel BOIN approach does so in 29.3% of simulated trials.
BOIN, Bayesian optimal interval; MTD, maximum tolerated dose.
BOIN, Bayesian optimal interval.
In case 2, the shift model method yields a higher PCR than the parallel BOIN and 3 + 3 approaches in both of the rows (63.3% vs. 44.8% vs. 32.7% in row 1; 52.2% vs. 27.6% vs. 15.2% in row 2). The average PCR is 57.75% across all rows for the shift model approach, compared with 23.95% for the 3 + 3 approach and 36.2% for the BOIN approach. For patient allocation, the proposed method allocates the highest proportion of patients to the true MTD combinations in both of the rows in case 2 (0.22 vs. 0.17 vs. 0.14 in row 1; 0.16 vs. 0.12 vs. 0.07 in row 2) and yields a higher overall PCA (0.38) than parallel BOIN (0.29) and 3 + 3 algorithms (0.21). The average sample size is 26.9 for the parallel 3 + 3 design, 38.1 for the parallel BOIN design, and 39.0 for the shift model approach in case 2 – the parallel 3 + 3 approach results in a reversal in 6.5% of simulated trials and the BOIN approach does so in 10.7% of trials.
In case 3, the shift model method again produces a higher PCR than the parallel BOIN and 3 + 3 approaches in both of the rows (48.9% vs. 38.9% vs. 40.5% in row 1; 67.6% vs. 32.6% vs. 27.2% in row 2). The average PCR is 58.25% for the proposed approach across all rows, compared with 33.85% for the 3 + 3 approach and 38.9% for the BOIN approach. For patient allocation, the 3 + 3 and shift model methods allocate a similar proportion of patients to the true MTD combinations in both of the rows in case 3 (0.28 vs. 0.29 in row 1; 0.31 vs. 0.30 in row 2) and produce a similar overall PCA (0.59). The parallel BOIN approach allocates a slightly lower proportion of patients to true MTD combinations in each row. The average sample size is 14.5 for the parallel 3 + 3 design, 31.6 for the BOIN approach, and 36.7 for the shift model approach in case 1. This discrepancy is due to the aggressive nature in which the 3 + 3 stops early for safety concerns. Despite the safe doses in each row, the 3 + 3 stops early in 44.1% and 59.9% of simulated trials in rows 1 and 2, respectively. The parallel 3 + 3 approach results in a reversal in 22.1% of simulated trials and the parallel BOIN approach does so 28.3% of the time. Similar findings are reported for cases 4-6.
Operating Characteristics Over Many Curves
We also simulated the operating characteristics of the three competing approaches over numerous randomly generated dose–toxicity curves. We randomly generated 50 dose–toxicity curves using the method of Conaway and Petroni [Reference Conaway and Petroni18], with the constraint that the MTDs in each row are the same or one level apart. The curves have a variety of actual MTD locations and shapes (i.e., steep, flat). We then simulated 1000 trials under each curve and evaluated the methods on four performance metrics reported in the previous section. First, how many times did each design correctly recommend neither of the true MTDs in each row (i.e., zero MTDs correct)? Second, how many times did each design correctly recommend at least one of the true MTDs in either row (i.e., at least one MTD correct)? Third, how many times did each design correctly recommend both true MTDs in each row (i.e., both MTD right)? Fourth is the percentage of reversals generated by the independent 3 + 3 design. This percentage will be 0% for the CRM design because the shift model structure prevents it from happening. We summarize the results of the 50 curves with boxplots for each of the first three metrics and a bar plot for the fourth metric (Fig. 1).
The average sample size, estimated from simulations, was approximately 26.9 for the independent 3 + 3 designs. The maximum sample size for the CRM shift model design and the parallel BOIN design was 39 and 40 participants, respectively. The probability of correctly selecting neither of the two true MTDs was 60% for the parallel 3 + 3 design, 40% for the parallel BOIN approach, and 35% for the CRM shift model design [Figure 1(a)]. The probability of correctly selecting at least one true MTD is 40% for the parallel 3 + 3 approach, 60% for the parallel BOIN approach, and 64% for the CRM shift model approach [Figure 1(b)]. The probability of selecting both true MTDs is 5% for the independent 3 + 3 algorithms, 12.5% for the independent BOIN designs, and 38% for the CRM shift model [Figure 1(c)]. Finally, the probability that the parallel 3 + 3 approach reverses the MTD between each row is 31.5 and 0% for the CRM shift model [Figure 1(d)]. The reversal percentage is 30.3% for the parallel BOIN approach. These results are consistent with our findings in the previous section on a smaller number of dose–toxicity curves.
Conclusions
More complex research questions are being posed in early-phase oncology clinical trials, necessitating design strategies tailored to the contemporary study objectives. Rule-based methods intended for MTD-based dose-finding are inflexible and cannot account for additional complexity presented by contemporary early development trials [Reference Riviere, Le Tourneau, Paoletti, Dubois and Zohar19,Reference Iasonos and O’Quigley20]. We have described an adaptive design strategy for a proposed early-phase trial concurrently evaluating the safety of a hematopoietic progenitor kinase-1 inhibitor (Agent A) as a single agent and in combination with an anti-PD-1 agent in patients with advanced malignancies. Although the design is presented using a specific example, the approach described could be generalized to other drug combination structures. For instance, Wages [Reference Wages9] applied the methodology described in this paper in settings with four dose levels in each of two rows and six dose levels in each of three rows. These settings studied the use of multiple models representing all possible shifts between dose levels in each row. Furthermore, our approach can connect with the vast body of published designs for determining an optimal administration schedule for each agent, incorporating efficacy outcomes in the dose selection model, and considering any potential ordering of toxicity and/or efficacy in subgroups of participants (see Lin et al. [Reference Lin, Thall and Yuan21] for a recent example).
This manuscript highlights an example of a novel design application to augment future innovative design implementation and demonstrate adaptive designs’ flexibility in satisfying dynamic design conditions. The FDA and others encourage more innovative approaches [Reference Paoletti and Ezzalfani22,Reference Nie, Rubin and Mehrotra23]. The description of the general design strategy and thought process for implementation is the information that improves understanding, acceptance, and approval of novel designs [Reference Iasonos, Gönen and Bosl24,Reference Petroni, Wages, Paux and Dubois25]. Unfortunately, details of study designs often are not found on sites such as clinicaltrials.gov so that modern clinical trials lack the transparency needed to support the timely implementation of novel methods. Thus, displays of current trials that use novel methods are needed to overcome barriers of infrequent implementation of innovative design strategies in early-phase trials, so we believe this work can aid in the uptake of novel design use. In addition, given the often-lengthy timeline between study concepts and protocol completion, it is valuable to present design considerations with broad application. It is worth noting that even after study completion, journals do not require complete protocols as supplemental material for dose-finding trials, and final clinical trial publications do not have sufficient room to describe the details of novel designs. Therefore, we feel the message that novel methods are being used in clinical practice is timely and important. This support for adaptive strategies will augment efficient early-phase trial design in drug combination studies. Well-performing dose-finding methods can have a great impact on the drug development process [Reference Conaway and Petroni18].
Acknowledgments
This work is supported by the National Cancer Institute [R01CA247932 to NAW and TMB].
Competing Interests
The authors declare no potential conflicts of interest.