Hostname: page-component-78c5997874-lj6df Total loading time: 0 Render date: 2024-11-10T13:39:37.747Z Has data issue: false hasContentIssue false

On a reduction principle in dynamic programming

Published online by Cambridge University Press:  01 July 2016

K. D. Glazebrook*
Affiliation:
University of Newcastle upon Tyne
*
Postal address: Department of Statistics, University of Newcastle upon Tyne NE1 7RU, UK

Abstract

Whittle enunciated an important reduction principle in dynamic programming when he showed that under certain conditions optimal strategies for Markov decision processes (MDPs) placed in parallel to one another take actions in a way which is consistent with the optimal strategies for the individual MDPs. However, the necessary and sufficient conditions given by Whittle are by no means always satisfied. We explore the status of this computationally attractive reduction principle when these conditions fail.

Type
Research Article
Copyright
Copyright © Applied Probability Trust 1988 

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Bergman, S. W. and Gittins, J. C. (1985) Statistical Methods for Pharmaceutical Research Planning. Marcel Dekker, New York.Google Scholar
Deshmukh, S. D. and Chikte, S. D. (1977) Dynamic investment strategies for a risky R and D project. J. Appl. Prob. 14, 144152.CrossRefGoogle Scholar
Gittins, J. C. (1979) Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41, 148177.Google Scholar
Gittins, J. C. and Glazebrook, K. D. (1977) On Bayesian models in stochastic scheduling. J. Appl. Prob. 14, 556565.CrossRefGoogle Scholar
Glazebrook, K. D. (1976) Stochastic scheduling with order constraints. Int. J. Syst. Sci. 7, 656666.CrossRefGoogle Scholar
Glazebrook, K. D. (1978) Some ranking formulae for alternative research projects. Omega 6, 193194.CrossRefGoogle Scholar
Glazebrook, K. D. (1979) Stoppable families of alternative bandit processes. J. Appl. Prob. 16, 843854.CrossRefGoogle Scholar
Glazebrook, K. D. (1982) On a sufficient condition for superprocesses due to Whittle. J. Appl. Prob. 19, 99110.CrossRefGoogle Scholar
Glazebrook, K. D. (1987) Evaluating strategies for Markov decision processes in parallel (submitted).Google Scholar
Glazebrook, K. D. and Fay, N. A. (1987) On the scheduling of alternative stochastic jobs on a single machine. Adv. Appl. Prob. 19, 955973.CrossRefGoogle Scholar
Glazebrook, K. D. and Fay, N. A. (1987a) Evaluating strategies for generalised bandit problems. Int. J. Syst. Sci. 19, 16051613.CrossRefGoogle Scholar
Nash, P. (1973) Optimal Allocation of Resources Between Research Projects. , Cambridge University.Google Scholar
Nash, P. (1980) A generalized bandit problem. J. R. Statist. Soc. B 42, 165169.Google Scholar
Ross, S. M. (1970) Applied Probability Models with Optimisation Applications. Holden–Day, San Francisco.Google Scholar
Whittle, P. (1980) Multi-armed bandits and the Gittins index. J. R. Statist. Soc. B 42, 143149.Google Scholar