June 2016

The multi-armed bandit problem and the design of Bayesian adaptive clinical trials: recent advances and remaining challenges

Sofia Villar (MRC Biostatistics Unit, Cambridge)

Venue: Room A54, Postgraduate Statistics Centre, Lancaster University

Date: Thursday 2 June 2016, 4-5pm

The "classic" multi-armed bandit problem (MABP) is an idealized mathematical framework for deciding how to optimally allocate a scarce resource (e.g. patients) among a number of competing uses (e.g. alternative treatments), given that the allocation must be done sequentially and under randomly evolving conditions. Its roots date back to work by Thompson in 1933, but it was only after Gittins and Jones first published the optimal solution of the classic MABP, a dynamic index rule, in the early 1970s that the bandit literature quickly diversified and emerged as an active research topic. Across this literature, the use of bandit models to optimally design clinical trials became a typical motivating application, yet little of the resulting theory has ever been used in the actual design and analysis of clinical trials.

In recent years, the potential of adaptive designs, which use accrued data to improve clinical trials, has attracted much attention from both methodologists and practitioners. Patient allocation rules based on the MABP can be regarded as Bayesian adaptive designs. In this talk I will present recent results that illustrate why, and in which circumstances, implementing this approach in practice could be desirable. I will describe some recent modifications to the original method that aim to overcome specific practical limitations to its use, and I will discuss the remaining methodological challenges that need to be addressed.
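
As a rough illustration of how a bandit-based rule allocates patients sequentially using accrued data, the sketch below simulates a trial with binary outcomes. It uses Thompson sampling with uniform Beta(1, 1) priors, a simpler Bayesian heuristic than the Gittins index rule mentioned above, and it is not the method presented in the talk. The function names, the priors, and the success rates in the usage note are all assumptions made for illustration.

```python
import random


def thompson_allocate(successes, failures, rng):
    """Choose an arm by drawing once from each arm's Beta posterior.

    Assumes a Beta(1, 1) prior on every arm (an illustrative choice).
    Returns the index of the arm with the largest posterior draw.
    """
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def simulate_trial(true_rates, n_patients, seed=0):
    """Allocate patients one at a time and record binary outcomes per arm.

    true_rates are hypothetical success probabilities, unknown to the rule.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_patients):
        arm = thompson_allocate(successes, failures, rng)
        # Observe the patient's outcome on the chosen treatment.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```

Calling `simulate_trial([0.3, 0.6], 200)` returns the per-arm success and failure counts for a hypothetical two-treatment trial. As data accrue, a rule of this kind tends to assign more patients to the arm that appears better, which is the basic appeal of bandit-based allocation.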