Search

3 results

Randomised allocation of treatments in sequential trials
John Bather
Journal:

Advances in Applied Probability / Volume 12 / Issue 1 / March 1980

Published online by Cambridge University Press:

01 July 2016, pp. 174-182

Print publication:

March 1980
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Given a finite number of different experiments with unknown probabilities p1, p2, ···, pk of success, the multi-armed bandit problem is concerned with maximising the expected number of successes in a sequence of trials. There are many policies which ensure that the proportion of successes converges to p = max (p1, p2, ···, pk), in the long run. This property is established for a class of decision procedures which rely on randomisation, at each stage, in selecting the experiment for the next trial. Further, it is suggested that some of these procedures might perform well over any finite sequence of trials.

On a class of non-Markov decision processes
K. D. Glazebrook
Journal:

Journal of Applied Probability / Volume 15 / Issue 4 / December 1978

Published online by Cambridge University Press:

14 July 2016, pp. 689-698

Print publication:

December 1978
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
The optimal strategy for a class of non-Markov decision processes is characterised and has the property that changes of action may occur between successive transitions of the process. Results are given which enable the optimal strategy to be computed iteratively.

On Bayesian models in stochastic scheduling
J. C. Gittins, K. D. Glazebrook
Journal:

Journal of Applied Probability / Volume 14 / Issue 3 / September 1977

Published online by Cambridge University Press:

14 July 2016, pp. 556-565

Print publication:

September 1977
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
The D.A.I. theorem of Gittins and Jones has proved a powerful tool in solving sequential statistical problems. A generalisation of this theorem is presented. This generalisation enables us to solve certain stochastic scheduling problems where the items or jobs to be scheduled have random times to completion, the random times having distributions dependent upon parameters to which prior distributions are allocated. Such problems are of interest in many areas where scheduling is important.