Hostname: page-component-78c5997874-t5tsf Total loading time: 0 Render date: 2024-11-14T08:44:41.925Z Has data issue: false hasContentIssue false

A general markov decision method I: Model and techniques

Published online by Cambridge University Press:  01 July 2016

G. De Leve
Affiliation:
Mathematisch Centrum, Amsterdam
A. Federgruen
Affiliation:
Mathematisch Centrum, Amsterdam
H. C. Tijms
Affiliation:
Mathematisch Centrum, Amsterdam

Abstract

This paper provides a new approach for solving a wide class of Markov decision problems including problems in which the space is general and the system can be continuously controlled. The optimality criterion is the long-run average cost per unit time. We decompose the decision processes into a common underlying stochastic process and a sequence of interventions so that the decision processes can be embedded upon a reduced set of states. Consequently, in the policy-iteration algorithm resulting from this approach the number of equations to be solved in any iteration step can be substantially reduced. Further, by its flexibility, this algorithm allows us to exploit any structure of the particular problem to be solved.

Type
Research Article
Copyright
Copyright © Applied Probability Trust 1977 

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

[1] Breiman, L. (1968) Probability. Addison-Wesley, Reading, Massachusetts.Google Scholar
[2] De Leve, G. (1964) Generalized Markovian Decision Processes, Part I: Model and Method. Part II: Probabilistic background. Mathematical Centre Tracts No. 3 and 4, Mathematisch Centrum, Amsterdam.Google Scholar
[3] De Leve, G., Tijms, H. C. and Weeda, P. J. (1970) Generalized Markovian Decision Processes Applications. Mathematical Centre Tract No. 5, Mathematisch Centrum, Amsterdam.Google Scholar
[4] De Leve, G., Federgruen, A. and Tijms, H. C. (1977) A general Markov decision method II: Applications. Adv. Appl. Prob. 9.Google Scholar
[5] De Leve, G., Federgruen, A. and Tijms, H. C. (1977) Generalized Markovian Decision Processes, Revisited. Mathematical Centre Tract, Mathematisch Centrum, Amsterdam (to appear).Google Scholar
[6] Derman, C. and Veinott, A. F. Jr. (1967) A solution to a countable system of equations arising in Markovian decision processes. Ann. Math. Statist. 38, 582584.Google Scholar
[7] Derman, C. (1970) Finite State Markovian Decision Processes. Academic Press, New York.Google Scholar
[8] Feller, W. (1957) An Introduction to Probability Theory and its Applications. Vol. 1, 2nd edn. Wiley, New York.Google Scholar
[9] Feller, W. (1966) An Introduction to Probability Theory and its Applications. Vol. 2. Wiley, New York.Google Scholar
[10] Howard, R. A. (1960) Dynamic Programming and Markov Processes. M.I.T. Press, Cambridge, Mass. Google Scholar
[11] Jain, N. C. (1966) Some limit theorems for general Markov processes. Z. Wahrscheinlichkeitsth. 5, 206223.Google Scholar
[12] Jewell, W. S. (1963) Markov revewal programming, I: Formulation, finite return models. II: Infinite return models. Opns. Res. 6, 938972.Google Scholar
[13] Ross, S. M. (1970) Applied Probability Models with Optimization Applications. Holden-Day, San Francisco.Google Scholar
[14] Royden, H. L. (1968) Real Analysis. 2nd. edn. Macmillan, New York.Google Scholar
[15] Weeda, P. J. (1974) Some computational experiments with a special generalized Markov programming model. Report BW 37/74, Mathematisch Centrum, Amsterdam.Google Scholar
[16] Weeda, P. J. (1976) Generalized Markov Programming and Markov Renewal Programming. Mathematical Centre Tract, Mathematisch Centrum, Amsterdam. (To appear).Google Scholar