A conjecture on the Feldman bandit problem
Published online by Cambridge University Press: 28 March 2018
Abstract
We consider the Bernoulli bandit problem in which one arm has win probability α and every other arm has win probability β, with the identity of the α arm specified by initial probabilities. With u = max(α, β) and v = min(α, β), call an arm with win probability u a good arm. It is known that the strategy of always playing the arm with the largest probability of being a good arm maximizes the expected number of wins in the first n games for all n; we conjecture that it also stochastically maximizes the number of wins. That is, we conjecture that this strategy maximizes the probability of at least k wins in the first n games for all k and n. The conjecture is proven for k = 1, for k = n, and, when there are only two arms, for k = n - 1.
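The conjecture can be checked numerically on small instances. The following Python sketch is not taken from the article: it assumes 0 < α, β < 1 and a prior that sums to 1, and the names `update`, `greedy` and `at_least` are hypothetical helpers introduced only for illustration. It computes the probability of at least k wins in n games by exhaustive recursion over posteriors, either under the strategy described above or maximized over all strategies.

```python
def update(p, i, win, alpha, beta):
    """Posterior that each arm is the alpha arm, after playing arm i.

    If arm j is the alpha arm, arm i wins with probability alpha when
    j == i and with probability beta otherwise.
    """
    weights = []
    for j, pj in enumerate(p):
        q = alpha if j == i else beta
        weights.append(pj * (q if win else 1.0 - q))
    total = sum(weights)  # positive because 0 < alpha, beta < 1 (assumption)
    return tuple(w / total for w in weights)


def greedy(p, alpha, beta):
    """Arm most likely to be good (win probability max(alpha, beta))."""
    # If alpha > beta the alpha arm is the good one; otherwise every
    # other arm is good, so arm i is good with probability 1 - p[i].
    sign = 1.0 if alpha >= beta else -1.0
    return max(range(len(p)), key=lambda i: sign * p[i])


def at_least(p, n, k, alpha, beta, policy=None):
    """P(at least k wins in n games) from prior p.

    With policy=None the value is maximized over all arms at every
    step (brute force, exponential in n, so keep n small).
    """
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    arms = range(len(p)) if policy is None else [policy(p, alpha, beta)]
    best = 0.0
    for i in arms:
        w = p[i] * alpha + (1.0 - p[i]) * beta  # chance that arm i wins now
        value = (
            w * at_least(update(p, i, True, alpha, beta), n - 1, k - 1,
                         alpha, beta, policy)
            + (1.0 - w) * at_least(update(p, i, False, alpha, beta), n - 1, k,
                                   alpha, beta, policy)
        )
        best = max(best, value)
    return best


if __name__ == "__main__":
    prior, alpha, beta, n = (0.5, 0.3, 0.2), 0.7, 0.4, 4
    for k in range(1, n + 1):
        g = at_least(prior, n, k, alpha, beta, greedy)
        o = at_least(prior, n, k, alpha, beta)
        print(k, g, o)
```

If the conjecture holds, the two printed values agree for every k up to floating-point error. A numerical match on a handful of instances is only a sanity check and says nothing about the general case.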
Type: Short Communications
Copyright © Applied Probability Trust 2018