Two-armed bandits with a goal, I. One arm known
Published online by Cambridge University Press: 01 July 2016
Abstract
One of two random variables, X and Y, can be selected at each of a possibly infinite number of stages. Depending on the outcome, one's fortune is either increased or decreased by 1. The probability of increase may not be known for either X or Y. The objective is to increase one's fortune to G before it decreases to g, for some integral g and G; either may be infinite.
In the current part of the paper, the distribution of X is unknown and that of Y is known. We characterize the situations in which optimal strategies exist and, for certain kinds of information concerning X and Y, we characterize optimal sequential strategies for choosing to observe X and Y.
In Part II (Berry and Fristedt (1980)), one of X and Y has probability α of increasing the current fortune by 1 and the other has probability β of doing so, where α and β are known, but it is not known which of the two probabilities belongs to X.
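As a point of reference for the setting described in the abstract, the sketch below computes the chance of reaching G before g when only a single arm with a known probability p of increase is ever used, and both g and G are finite. This is the classical gambler's ruin formula, not the optimal strategy characterized in the paper; the function name `reach_goal_probability` and its parameter names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def reach_goal_probability(fortune, g, G, p):
    """Probability of hitting G before g when every stage uses one arm
    whose probability of a +1 step is the known value p.

    Assumptions (not from the paper): g and G are finite integers with
    g <= fortune <= G, g < G, and 0 < p < 1.
    """
    if not (g <= fortune <= G) or g >= G:
        raise ValueError("need g <= fortune <= G and g < G")
    p = Fraction(p)
    if not (0 < p < 1):
        raise ValueError("p must lie strictly between 0 and 1")
    if p == Fraction(1, 2):
        # Fair case: the probability is linear in the current fortune.
        return Fraction(fortune - g, G - g)
    # Unfair case: r is the odds ratio of a decrease to an increase.
    r = (1 - p) / p
    return (1 - r ** (fortune - g)) / (1 - r ** (G - g))


if __name__ == "__main__":
    # Fortune 5, lower barrier 0, goal 10, known arm with p = 3/5.
    print(reach_goal_probability(5, 0, 10, Fraction(3, 5)))
```

Exact rational arithmetic is used so that small cases can be checked by hand; for example, with g = 0, G = 2, a fortune of 1 and p = 2/3, the goal is reached exactly when the first step is an increase, so the result is 2/3.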
Type: Research Article
- Copyright © Applied Probability Trust 1980