Markov Decision Processes (MDPs) are a classical framework for stochastic sequential decision problems, based on an enumerated state-space representation. More compact and structured representations have been proposed: factorization techniques rely on state-variable representations, while decomposition techniques partition the state space into sub-regions and exploit the resulting structure of the state transition graph. We use a family of probabilistic exploration-like planning problems to study the influence of the modeling structure on the MDP solution. We first discuss the advantages and drawbacks of a graph-based representation of the state space, then present our comparison of two decomposition techniques, and propose a global approach combining both state-space factorization and decomposition. On the exploration problem instance, we propose to take advantage of the natural topological structure of the navigation space, which is partitioned into regions. Local policies are optimized within each region; they become the macro-actions of the global abstract MDP resulting from the decomposition, while the regions become its macro-states. The global abstract MDP is obtained in factored form, combining all the initial MDP state variables with one macro-state “region” variable ranging over the macro-states that correspond to the regions. Further research is currently being conducted on efficient solution algorithms implementing the same hybrid approach to tackle large-size MDPs.
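To make the hybrid scheme concrete, the following is a minimal Python sketch of the decomposition step on a toy corridor-navigation problem. It is an illustration under simplifying assumptions (deterministic transitions, a single exit per region), not the paper's implementation; all identifiers (step, local_policy, MACRO_ACTIONS, and so on) are hypothetical.

# A minimal sketch of the region-based decomposition, not the authors'
# implementation; every identifier below is hypothetical.

# Flat navigation MDP: a corridor of cells 0..7, actions left/right.
STATES = list(range(8))
ACTIONS = ("left", "right")

def step(state, action):
    # Deterministic transition of the flat MDP; a real exploration
    # problem would be stochastic, determinism keeps the sketch short.
    return max(state - 1, 0) if action == "left" else min(state + 1, 7)

# Decomposition: partition the state space into two regions.
REGIONS = {"R0": [0, 1, 2, 3], "R1": [4, 5, 6, 7]}

def local_policy(region_cells, exit_cell):
    # Value iteration restricted to one region, with the entry cell of
    # the neighboring region made absorbing; the resulting policy drives
    # any state of the region to that exit and is used as one
    # macro-action of the abstract MDP.
    cells = set(region_cells) | {exit_cell}
    value = {s: 0.0 for s in cells}
    for _ in range(2 * len(cells)):  # enough sweeps for this toy problem
        for s in region_cells:
            # Reward of -1 per step, i.e. shortest path to the exit;
            # transitions leaving the region are heavily penalized.
            value[s] = max(-1.0 + value.get(step(s, a), -1e9) for a in ACTIONS)
    return {s: max(ACTIONS, key=lambda a: value.get(step(s, a), -1e9))
            for s in region_cells}

# One macro-action per (region, exit) pair of the topological structure.
MACRO_ACTIONS = {
    ("R0", "to_R1"): local_policy(REGIONS["R0"], exit_cell=4),
    ("R1", "to_R0"): local_policy(REGIONS["R1"], exit_cell=3),
}

# Abstract MDP: regions are the macro-states, local policies the
# macro-actions. In the factored form discussed above, the abstract
# state would combine the original state variables with one extra
# "region" variable; in this toy it reduces to the region label alone.
ABSTRACT_TRANSITIONS = {("R0", "to_R1"): "R1", ("R1", "to_R0"): "R0"}

print(MACRO_ACTIONS[("R0", "to_R1")])  # {0: 'right', 1: 'right', 2: 'right', 3: 'right'}
print(ABSTRACT_TRANSITIONS)

In the full approach, each region would have several exits, and hence several macro-actions, the local optimizations would handle stochastic transitions, and the abstract MDP would keep the factored form described above, with the “region” variable added to the original state variables.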