Risk-sensitive semi-Markov decision processes with general utilities and multiple criteria

Yonghui Huang; Zhaotong Lian; Xianping Guo

doi:10.1017/apr.2018.36

Part of: Stochastic systems and control

Published online by Cambridge University Press: 16 November 2018

Yonghui Huang
Affiliation: Sun Yat-Sen University
Postal address: School of Mathematics, Sun Yat-Sen University, Guangzhou, 510275, China.

Zhaotong Lian
Affiliation: University of Macau
Postal address: Faculty of Business Administration, University of Macau, Macau, China. Email address: lianzt@umac.mo

Xianping Guo
Affiliation: Sun Yat-Sen University
Postal address: School of Mathematics, Sun Yat-Sen University, Guangzhou, 510275, China.

Abstract

In this paper we investigate risk-sensitive semi-Markov decision processes with a Borel state space, unbounded cost rates, and general utility functions. The performance criteria are several expected utilities of the total cost in a finite horizon. Our analysis is based on a type of finite-horizon occupation measure. We express the distribution of the finite-horizon cost in terms of the occupation measure for each policy, wherein the discount is not needed. For unconstrained and constrained problems, we establish the existence and computation of optimal policies. In particular, we develop a linear program and its dual program for the constrained problem and, moreover, establish the strong duality between the two programs. Finally, we provide two special cases of our results, one of which concerns the discrete-time model, and the other the chance-constrained problem.

Keywords

Semi-Markov decision process; finite-horizon cost; occupation measure; expected utility; linear program

MSC classification

Primary: 90C40: Markov and semi-Markov decision processes

Secondary: 93E20: Optimal stochastic control

Information

Type: Original Article
Information: Advances in Applied Probability, Volume 50, Issue 3, September 2018, pp. 783–804

DOI: https://doi.org/10.1017/apr.2018.36
Copyright: Copyright © Applied Probability Trust 2018
