Ewen Gallic : ewen.gallic[at]univ-amu.fr
Pierre Michel : pierre.michel[at]univ-amu.fr
In this paper we study a treatment allocation problem with multiple treatments, in which the individuals to be treated arrive sequentially. The goal of the policy maker is to treat every individual as well as possible. Which treatment is "best" is allowed to depend on various characteristics (functionals) of the individual-specific outcome distribution of each treatment. For example measures of welfare, inequality, or poverty. We propose the Functional Sequential Allocation policy, and provide upper bounds on the regret it incurs compared to the oracle policy that knows the best treatment for each individual. These upper bounds increase sublinearly in the number of treatment assignments and we show that this regret guarantee is minimax optimal. In addition, the assumptions under which this regret guarantee is established are as weak as possible --- even a minimal weakening of them will imply non-existence of policies with regret increasing sub-linearly in the number of assignments. Furthermore, we provide an upper bound on the number of suboptimal assignments made by the FSA policy and show that every policy must make a number of suboptimal assignments at least of this order.