Multi-armed bandits

Online sequential decision-making via bandit algorithms: modeling considerations for better decisions (Invited Talk @ BMS-ANed)

The multi-armed bandit (MAB) framework holds great promise for optimizing sequential decisions online as new data arise. For example, it could be used to design adaptive experiments that can result in better participant outcomes and improved …
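
To make the adaptive-allocation idea behind these talks concrete, below is a minimal Beta-Bernoulli Thompson sampling sketch (not material from the talks themselves); the two arm means, the uniform priors, and the horizon are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-arm experiment with Bernoulli rewards (illustrative values).
true_means = np.array([0.45, 0.60])
successes = np.ones(2)  # Beta(1, 1) uniform prior on each arm's mean
failures = np.ones(2)

for _ in range(1000):
    # Thompson sampling: draw one posterior sample per arm,
    # then play the arm whose sample is largest.
    samples = rng.beta(successes, failures)
    arm = int(np.argmax(samples))
    reward = rng.random() < true_means[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

# Allocation skews toward the better arm as evidence accumulates.
print("pulls per arm:", successes + failures - 2)
```

Because each round samples from the posterior rather than picking the empirical best, the algorithm keeps exploring early on but increasingly favours the superior arm as data arrive.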

Online sequential decision-making via bandit algorithms: modeling considerations for better decisions (Seminar @ Department of Statistics, Padua University)

The multi-armed bandit (MAB) framework holds great promise for optimizing sequential decisions online as new data arise. For example, it could be used to design adaptive experiments that can result in better participant outcomes and improved …

Online sequential decision-making via bandit algorithms: modeling considerations for better decisions (Keynote Talk @ ALBECS-2024, 19th International Conference on Persuasive Technology 2024)

The multi-armed bandit (MAB) framework holds great promise for optimizing sequential decisions online as new data arise. For example, it could be used to design adaptive experiments that can result in better participant outcomes and improved …

Using Adaptive Bandit Experiments to Increase and Investigate Engagement in Mental Health

Digital mental health (DMH) interventions, such as text-message-based lessons and activities, offer immense potential for accessible mental health support. While these interventions can be effective, real-world experimental testing can further …

Modeling considerations when optimizing adaptive experiments under the reinforcement learning framework (Invited Talk @ ICSDS2023)

Artificial intelligence tools powered by machine learning have driven considerable improvements in a variety of experimental domains, from education to healthcare. In particular, the reinforcement learning (RL) and the multi-armed bandit (MAB) …

Multinomial Thompson sampling for rating scales and prior considerations for calibrating uncertainty

Bandit algorithms such as Thompson sampling (TS) have been put forth for decades as useful tools for conducting adaptively randomised experiments. By skewing the allocation toward superior arms, they can substantially improve particular outcomes of …
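
A minimal sketch of how Thompson sampling extends to multinomial (rating-scale) outcomes, assuming Dirichlet priors over the rating categories and ranking arms by their posterior-sampled mean rating; the 5-point scale, the flat priors, and the true distributions are illustrative, and the talk's prior-calibration considerations are not captured here:

```python
import numpy as np

rng = np.random.default_rng(1)

scale = np.arange(1, 6)  # 5-point rating scale (illustrative)
alpha = np.ones((2, 5))  # Dirichlet(1, ..., 1) prior per arm over categories

# Hypothetical true rating distributions for the two arms (rows sum to 1).
true_probs = np.array([[0.20, 0.25, 0.25, 0.20, 0.10],
                       [0.10, 0.15, 0.25, 0.30, 0.20]])

for _ in range(2000):
    # Draw one category-probability vector per arm from its Dirichlet
    # posterior and play the arm with the higher implied mean rating.
    theta = np.array([rng.dirichlet(a) for a in alpha])
    arm = int(np.argmax(theta @ scale))
    rating = rng.choice(5, p=true_probs[arm])  # observed category 0..4
    alpha[arm, rating] += 1                    # conjugate posterior update

print("posterior category counts:\n", alpha)
```

The Dirichlet-multinomial pairing keeps the updates conjugate, so the per-round cost stays as low as in the Beta-Bernoulli case while respecting the ordinal outcome.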

On the finite-sample and asymptotic validity of an allocation-probability test for adaptively collected data (Invited Talk @ StaTalk2023)

Response-adaptive designs, whether based on simple rules, urn models, or bandit problems, are of increasing interest to both theoretical and applied communities. In particular, regret-optimising bandit algorithms like Thompson sampling hold the …

Efficient Inference Without Trading Off Regret in Bandits: An Allocation Probability Test for Thompson Sampling (Invited Talk @ JSM2023)

Using bandit algorithms to conduct adaptively randomised experiments can minimise regret, but doing so poses major challenges for statistical inference. Recent attempts to address these challenges typically impose restrictions on the exploitative nature of …
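
As a rough illustration of the statistic underlying an allocation-probability test (a sketch under Beta-Bernoulli assumptions, not the calibrated procedure from the talk): the Thompson sampling allocation probability, i.e. the posterior probability that one arm beats the other, can be estimated by Monte Carlo and compared against a cutoff:

```python
import numpy as np

rng = np.random.default_rng(2)

def allocation_probability(succ, fail, draws=100_000):
    """Monte Carlo estimate of P(arm 1's mean > arm 0's mean | data) under
    independent Beta posteriors -- the probability that Thompson sampling
    would assign the next participant to arm 1."""
    p0 = rng.beta(succ[0], fail[0], size=draws)
    p1 = rng.beta(succ[1], fail[1], size=draws)
    return float(np.mean(p1 > p0))

# Hypothetical end-of-experiment counts, plus Beta(1, 1) priors.
succ = np.array([30, 48]) + 1.0
fail = np.array([70, 52]) + 1.0

pi = allocation_probability(succ, fail)
# Illustrative decision rule only: choosing a cutoff with finite-sample
# and asymptotic validity on adaptively collected data is the hard part.
print(f"allocation probability: {pi:.3f}",
      "-> evidence of a difference" if abs(pi - 0.5) > 0.45 else "-> inconclusive")
```

The appeal of such a statistic is that the algorithm already computes it implicitly while running, so the test adds no restriction on how exploitative the allocation is; the substance of the talk is how to calibrate the cutoff so the resulting test is valid.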

Multinomial Thompson Sampling for Online Sequential Decision Making with Rating Scales (Invited Seminar @ Federico II di Napoli)

Multi-armed bandit algorithms such as Thompson sampling (TS) have been put forth for decades as useful tools for optimizing sequential decision-making in online experiments. By skewing the allocation ratio towards superior arms, they can minimize …

Adaptive Experiments for Enhancing Digital Education: Benefits and Statistical Challenges (Talk @ ICNA-STA2023)

Adaptive digital field experiments are seeing increasingly broad use in fields such as mobile health and digital education. Using adaptive experimentation in education can help not only to explore and eventually compare various arms but …