János Divényi

Central European University (Budapest) - Emarsys

Bias in treatment effect estimates based on multi-armed bandit experiments

Venue

IBD Salle 16

Îlot Bernard du Bois - Salle 16

AMU - AMSE
5-9 boulevard Maurice Bourdet
13001 Marseille

Date(s)

Tuesday, June 4, 2019 | 2:30pm

Contact(s)

Ewen Gallic: ewen.gallic[at]univ-amu.fr
Pierre Michel: pierre.michel[at]univ-amu.fr

Abstract

Multi-armed bandit algorithms are becoming increasingly popular in the digital world, where collecting data adaptively is technically feasible. It has been demonstrated that balancing exploration and exploitation can achieve higher average outcomes than the "experiment first, exploit later" approach of the traditional treatment choice literature. However, there is little work on how data arising from bandits can be used to estimate treatment effects (instead of just determining the arm with the highest outcome). This paper contributes to this growing literature with a systematic simulation exercise that aims to characterize the behavior of the standard average treatment effect estimator on adaptively collected data. I show that the treatment effect estimate, which results from two negatively biased means, is biased away from zero, and illustrate how this bias depends on the magnitude of the treatment effect. I also provide intuitive explanations for these phenomena. I show that propensity score weighting can even exacerbate the bias. Finally, I suggest an easy-to-implement modification of propensity score weighting that improves the estimator.
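
The negative bias of the arm means mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's simulation design: the epsilon-greedy rule, the Gaussian outcomes with unit variance, the horizon, and the function name simulate_bias are all assumptions chosen for illustration. It runs a two-armed bandit many times and compares the average of the per-arm sample means with the true means.

```python
import random


def simulate_bias(effect=0.5, n_runs=4000, horizon=100, epsilon=0.1, seed=0):
    """Average bias of per-arm sample means under an epsilon-greedy bandit.

    Illustrative assumptions (not taken from the paper): two arms,
    outcomes ~ Normal(mean, 1), arm 0 is control with mean 0,
    arm 1 is treated with mean `effect`.
    """
    rng = random.Random(seed)
    true_means = (0.0, effect)
    sum_estimates = [0.0, 0.0]

    for _ in range(n_runs):
        counts = [0, 0]
        totals = [0.0, 0.0]
        for t in range(horizon):
            if t < 2:
                arm = t  # pull each arm once to initialise its mean
            elif rng.random() < epsilon:
                arm = rng.randrange(2)  # explore: pick an arm at random
            else:
                # exploit: pick the arm with the higher sample mean so far
                mean0 = totals[0] / counts[0]
                mean1 = totals[1] / counts[1]
                arm = 0 if mean0 >= mean1 else 1
            reward = rng.gauss(true_means[arm], 1.0)
            counts[arm] += 1
            totals[arm] += reward
        for arm in (0, 1):
            sum_estimates[arm] += totals[arm] / counts[arm]

    mean_estimates = [s / n_runs for s in sum_estimates]
    bias_control = mean_estimates[0] - true_means[0]
    bias_treated = mean_estimates[1] - true_means[1]
    # Bias of the naive difference-in-means treatment effect estimate
    bias_effect = bias_treated - bias_control
    return bias_control, bias_treated, bias_effect


if __name__ == "__main__":
    for effect in (0.0, 0.5):
        b0, b1, b_eff = simulate_bias(effect=effect)
        print(f"effect={effect}: control bias={b0:.3f}, "
              f"treated bias={b1:.3f}, effect bias={b_eff:.3f}")
```

Varying the effect argument gives a rough sense of how the bias of the difference-in-means estimate changes with the size of the treatment effect, under these assumptions only.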

More information

János Divényi's website