Summary: We develop reinforcement learning trials for discovering individualized treatment regimens for life-threatening diseases such as cancer. A temporal-difference learning method called Q-learning is utilized which involves learning an optimal policy from a single training set of finite longitudinal patient trajectories. Approximating the Q-function with time-indexed parameters can be achieved by using support vector regression or extremely randomized trees. Within this framework, we demonstrate that the procedure can extract optimal strategies directly from clinical data without relying on the identification of any accurate mathematical models, unlike approaches based on adaptive design. We show that reinforcement learning has tremendous potential in clinical research because it can select actions that improve outcomes by taking into account delayed effects even when the relationship between actions and outcomes is not fully known. To support our claims, the methodology's practical utility is illustrated in a simulation analysis. In the immediate future, we will apply this general strategy to studying and identifying new treatments for advanced metastatic stage IIIB/IV non-small cell lung cancer, which usually includes multiple lines of chemotherapy treatment. Moreover, there is significant potential of the proposed methodology for developing personalized treatment strategies in other cancers, in cystic fibrosis, and in other life-threatening diseases.
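The backward-induction idea behind Q-learning on a finite set of trajectories can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it swaps the support vector regression or extremely randomized trees mentioned above for a plain lookup-table average, and the function names, state labels, actions, and rewards are all hypothetical toy values.

```python
from collections import defaultdict


def fit_q_backward(trajectories, n_stages, actions):
    """Fit one Q-table per stage by backward induction over logged trajectories.

    Each trajectory is a list of (state, action, reward) tuples, one per stage.
    A table average stands in for the regression step (an assumption of this sketch).
    """
    q_tables = [dict() for _ in range(n_stages)]
    for t in reversed(range(n_stages)):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for traj in trajectories:
            state, action, reward = traj[t]
            target = reward
            if t + 1 < n_stages:
                next_state = traj[t + 1][0]
                # Add the value of acting optimally from the next stage onward.
                # Unseen (state, action) pairs default to 0.0 in this toy version.
                target += max(
                    q_tables[t + 1].get((next_state, a), 0.0) for a in actions
                )
            sums[(state, action)] += target
            counts[(state, action)] += 1
        q_tables[t] = {key: sums[key] / counts[key] for key in sums}
    return q_tables


def best_action(q_tables, t, state, actions):
    """Return the action with the highest estimated Q-value at stage t."""
    return max(actions, key=lambda a: q_tables[t].get((state, a), float("-inf")))


# Hypothetical two-stage data: action "A" pays nothing at first but leads to a
# state with larger later rewards, mimicking a delayed treatment effect.
ACTIONS = ["A", "B"]
toy_trajectories = [
    [("start", "A", 0.0), ("high", "A", 5.0)],
    [("start", "A", 0.0), ("high", "B", 3.0)],
    [("start", "B", 1.0), ("low", "A", 1.0)],
    [("start", "B", 1.0), ("low", "B", 2.0)],
]
q_tables = fit_q_backward(toy_trajectories, n_stages=2, actions=ACTIONS)
first_choice = best_action(q_tables, 0, "start", ACTIONS)
```

In this toy data the first-stage estimate favors "A" even though "B" has the larger immediate reward, which is the delayed-effect behavior the summary describes.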
Purpose: Tumors may evade immunosurveillance through upregulation of the indoleamine 2,3-dioxygenase 1 (IDO1) enzyme. Epacadostat is a potent and highly selective IDO1 enzyme inhibitor. The open-label phase I/II ECHO-202/KEYNOTE-037 trial evaluated epacadostat plus pembrolizumab, a programmed death protein 1 inhibitor, in patients with advanced solid tumors. Phase I results on maximum tolerated dose, safety, tolerability, preliminary antitumor activity, and pharmacokinetics are reported. Patients and Methods: Patients received escalating doses of oral epacadostat (25, 50, 100, or 300 mg) twice per day plus intravenous pembrolizumab 2 mg/kg or 200 mg every 3 weeks. During the safety expansion, patients received epacadostat (50, 100, or 300 mg) twice per day plus pembrolizumab 200 mg every 3 weeks. Results: Sixty-two patients were enrolled and received one or more doses of study treatment. The maximum tolerated dose of epacadostat in combination with pembrolizumab was not reached. Fifty-two patients (84%) experienced treatment-related adverse events (TRAEs), with fatigue (36%), rash (36%), arthralgia (24%), pruritus (23%), and nausea (21%) occurring in ≥ 20%. Grade 3/4 TRAEs were reported in 24% of patients. Seven patients (11%) discontinued study treatment because of TRAEs. No TRAEs led to death. Epacadostat 100 mg twice per day plus pembrolizumab 200 mg every 3 weeks was recommended for phase II evaluation. Objective responses (per Response Evaluation Criteria in Solid Tumors [RECIST] version 1.1) occurred in 12 (55%) of 22 patients with melanoma and in patients with non–small-cell lung cancer, renal cell carcinoma, endometrial adenocarcinoma, urothelial carcinoma, and squamous cell carcinoma of the head and neck. The pharmacokinetics of epacadostat and pembrolizumab and antidrug antibody rate were comparable to historical controls for monotherapies. Conclusion: Epacadostat in combination with pembrolizumab generally was well tolerated and had encouraging antitumor activity in multiple advanced solid tumors.
Summary: Typical regimens for advanced metastatic stage IIIB/IV non-small cell lung cancer (NSCLC) consist of multiple lines of treatment. We present an adaptive reinforcement learning approach to discover optimal individualized treatment regimens from a specially designed clinical trial (a "clinical reinforcement trial") of an experimental treatment for patients with advanced NSCLC who have not been treated previously with systemic therapy. In addition to the complexity of the problem of selecting optimal compounds for first and second-line treatments based on prognostic factors, another primary goal is to determine the optimal time to initiate second-line therapy, either immediately or delayed after induction therapy, yielding the longest overall survival time. A reinforcement learning method called Q-learning is utilized which involves learning an optimal regimen from patient data generated from the clinical reinforcement trial. Approximating the Q-function with time-indexed parameters can be achieved by using a modification of support vector regression which can utilize censored data. Within this framework, a simulation study shows that the procedure can extract optimal regimens for two lines of treatment directly from clinical data without prior knowledge of the treatment effect mechanism. In addition, we demonstrate that the design reliably selects the best initial time for second-line therapy while taking into account the heterogeneity of NSCLC across patients.
This first report of immunotherapy evaluation in biochemical-only relapse ovarian cancer and of IDO1 inhibitor monotherapy in ovarian cancer found no significant difference in efficacy between epacadostat and tamoxifen. Epacadostat was generally well tolerated.
Background: Epacadostat is a potent inhibitor of the immunosuppressive indoleamine 2,3-dioxygenase 1 (IDO1) enzyme. We present phase 1 results from a phase 1/2 clinical study of epacadostat in combination with ipilimumab, an anti-cytotoxic T-lymphocyte-associated protein 4 antibody, in advanced melanoma (NCT01604889). Methods: Only the phase 1, open-label portion of the study was conducted, per the sponsor’s decision to terminate the study early based on the changing melanoma treatment landscape favoring exploration of programmed cell death protein 1 (PD-1)/PD-ligand 1 inhibitor-based combination strategies. Such decision was not related to the safety of epacadostat plus ipilimumab. Patients received oral epacadostat (25, 50, 100, or 300 mg twice daily [BID]; 75 mg daily [50 mg am, 25 mg pm]; or 50 mg BID intermittent [2 weeks on/1 week off]) plus intravenous ipilimumab 3 mg/kg every 3 weeks. Results: Fifty patients received ≥1 dose of epacadostat. As of January 20, 2017, 2 patients completed treatment and 48 discontinued, primarily because of adverse events (AEs) and disease progression (n = 20 each). Dose-limiting toxicities occurred in 11 patients (n = 1 each with epacadostat 25 mg BID, 50 mg BID intermittent, 75 mg daily; n = 4 each with epacadostat 50 mg BID, 300 mg BID). The most common immune-related treatment-emergent AEs included rash (50%), alanine aminotransferase elevation (28%), pruritus (28%), aspartate aminotransferase elevation (24%), and hypothyroidism (10%). Among immunotherapy-naive patients (n = 39), the objective response rate was 26% by immune-related response criteria and 23% by Response Evaluation Criteria in Solid Tumors version 1.1. No objective response was seen in the 11 patients who received prior immunotherapy. Epacadostat exposure was dose proportional, with clinically significant IDO1 inhibition at doses ≥25 mg BID. Conclusions: When combined with ipilimumab, epacadostat ≤50 mg BID demonstrated clinical and pharmacologic activity and was generally well tolerated in patients with advanced melanoma. Trial registration: ClinicalTrials.gov identifier, NCT01604889. Registration date, May 9, 2012, retrospectively registered. Electronic supplementary material: The online version of this article (10.1186/s40425-019-0562-8) contains supplementary material, which is available to authorized users.
9014 Background: ECHO-202/KEYNOTE-037 is an open-label, phase 1/2 study of epacadostat (a potent and selective oral inhibitor of the immunosuppressive enzyme indoleamine 2,3-dioxygenase 1) plus pembrolizumab (E + P) in patients (pts) with advanced tumors. We report preliminary efficacy and safety outcomes for the phase 1/2 NSCLC cohort. Methods: Adult pts with prior platinum-based therapy (tx) and no prior checkpoint inhibitor tx were eligible. Phase 1 dose-escalation tx was E (25, 50, 100, 300 mg PO BID) + P (2 mg/kg or 200 mg IV Q3W); MTD was not exceeded. E (100 mg BID) + P (200 mg Q3W) tx doses were selected for phase 2 cohort expansion. Efficacy was evaluated by tumor proportion score (TPS [% viable tumor cells, PD-L1 staining]: < 50% and ≥50%) and by prior lines of tx in RECIST 1.1 evaluable pts. Safety was assessed in pts receiving ≥1 E + P dose. Results: As of 29OCT2016, 43 pts (phase 1, n = 12; phase 2, n = 31) were evaluated. Median age was 65 years, 58% of pts were women, 12% were EGFR-positive, and 23% were KRAS-positive. Most pts had a history of smoking (84%), ≤2 prior lines of tx (84%), and no prior TKI tx (93%). For the 40 efficacy-evaluable pts, ORR (CR+PR) and DCR (CR+PR+SD) were 35% (14/40; 14 PR) and 60% (24/40; 10 SD), respectively. PD-L1 TPS test results were available in 28/40 efficacy-evaluable pts. ORR and DCR for pts with TPS ≥50% and ≤2 prior tx were 43% (3/7; all PR) and 57% (4/7; 1 SD), respectively; for pts with TPS < 50% and ≤2 prior tx, ORR and DCR were 35% (6/17; all PR) and 53% (9/17; 3 SD). Among the 40 efficacy-evaluable pts, 12/14 responses were ongoing (range, 1+ to 519 days) at data cutoff. PFS and biomarker analyses are ongoing. Across all 43 pts, most frequent TRAEs were fatigue (19%), arthralgia (9%), and increased AST (9%); 16% of pts had grade ≥3 TRAEs, and increased lipase (asymptomatic) was the only grade ≥3 TRAE that occurred in > 1 pt (n = 2). Two pts discontinued due to TRAEs (grade 3 increased AST, grade 2 increased ALT [n = 1]; grade 2 brain edema [n = 1]). Conclusions: E + P was generally well tolerated and associated with promising responses in pts with NSCLC. A phase 3 NSCLC study is planned. Clinical trial information: NCT02178722.
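The response rates quoted in this abstract follow directly from the reported counts. As a small arithmetic check, the sketch below recomputes each percentage from its numerator and denominator; the helper name `percent` and the dictionary labels are illustrative choices, while the counts themselves are taken from the abstract.

```python
def percent(numerator, denominator):
    """Return a proportion as a whole-number percentage."""
    return round(100 * numerator / denominator)


# (patients meeting the endpoint, evaluable patients) as reported in the abstract.
reported_counts = {
    "ORR, all efficacy-evaluable": (14, 40),
    "DCR, all efficacy-evaluable": (24, 40),
    "ORR, TPS >=50% and <=2 prior tx": (3, 7),
    "DCR, TPS >=50% and <=2 prior tx": (4, 7),
    "ORR, TPS <50% and <=2 prior tx": (6, 17),
    "DCR, TPS <50% and <=2 prior tx": (9, 17),
}

for label, (n, d) in reported_counts.items():
    print(f"{label}: {n}/{d} = {percent(n, d)}%")
```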
Background: Achieving optimal results following deep brain stimulation (DBS) typically involves several months of programming sessions. The Graphical User Interface for DBS Evaluation (GUIDE) study explored whether a visual programming system could help clinicians accurately predetermine ideal stimulation settings in DBS patients with Parkinson's disease. Methods: A multicenter prospective, observational study was designed that utilized a blinded Unified Parkinson's Disease Rating Scale (UPDRS)-III examination to prospectively assess whether DBS settings derived using a neuroanatomically based computer model (Model) could provide comparable efficacy to those determined through traditional, monopolar review-based programming (Clinical). We retrospectively compared the neuroanatomical regions of stimulation, power consumption and time spent on programming using both methods. Results: The average improvement in UPDRS-III scores was 10.4 ± 7.8 for the Model settings and 11.7 ± 8.7 for the Clinical settings. The difference between the mean UPDRS-III scores with the Model versus the Clinical settings was 0.26 and not statistically significant (p = 0.9866). Power consumption for the Model settings was 48.7 ± 22 μW versus 76.1 ± 46.5 μW for the Clinical settings. The mean time spent programming using the Model approach was 31 ± 16 s versus 41.4 ± 29.1 min using the Clinical approach. Conclusion: The Model-based DBS settings provided similar benefit to the Clinical settings based on UPDRS-III scores and were often arrived at in less time and required less power than the Clinical settings.