Discounted Cost Markov Decision Processes with a Constraint

Published online by Cambridge University Press: 27 July 2009

Kazuyoshi Wakuta
Affiliation:
Nagaoka Technical College, 888 Nishikatakai, Nagaoka, Niigata 940, Japan

Abstract

We consider a discounted cost Markov decision process with a constraint. Relating this to a vector-valued Markov decision process, we prove that a constrained optimal randomized semistationary policy exists whenever at least one policy satisfies the constraint. Moreover, we present an algorithm that either finds such a policy or establishes that no policy satisfies the given constraint.

Type
Research Article
Copyright
Copyright © Cambridge University Press 1998
