
References

Published online by Cambridge University Press:  19 May 2025

Malik Ghallab (LAAS-CNRS, Toulouse)
Dana Nau (University of Maryland, College Park)
Paolo Traverso (Fondazione Bruno Kessler, Trento, Italy)
Michela Milano (Università degli Studi, Bologna, Italy)
Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2025



Aarup, M., et al. OPTIMUM-AIV: A knowledge-based planning and scheduling system for spacecraft AIV. In Intelligent Scheduling. Morgan Kaufmann, 1994.
Azar, N. Ab, et al. From inverse optimal control to inverse reinforcement learning: A historical review. Annual Reviews in Control, 2020.
Abbeel, P. and Ng, A. Y.. Apprenticeship learning via inverse reinforcement learning. In ICML, 2004.
Abbeel, P., et al. An application of reinforcement learning to aerobatic helicopter flight. In NeurIPS, 2006.
Abbeel, P., et al. Using inaccurate models in reinforcement learning. In ICML, 2006.
Abbeel, P., et al. Autonomous helicopter aerobatics through apprenticeship learning. IJRR, 2010.
Abdul-Razaq, T. and Potts, C.. Dynamic programming state-space relaxation for single-machine scheduling. Jour. of the Operational Research Society, 1988.
Abdulaziz, M. and Kurz, F.. Formally verified SAT-based AI planning. In AAAI, 2023.
Achiam, J., et al. Towards characterizing divergence in deep Q-learning. arXiv:1903.08894, 2019.
Adali, S., et al. Representing and reasoning with temporal constraints in multimedia presentations. In TIME, 2000.
Addison, U.. Human-inspired goal reasoning implementations: A survey. Cognitive Systems Research, 2024.
Aggarwal, C. C.. Neural Networks and Deep Learning: A Textbook. MIT Press, 2018.
Agosta, J. M.. Formulation and implementation of an equipment configuration problem with the SIPE-2 generative planner. In AAAI-95 Spring Symposium on Integrated Planning Applications, 1995.
Agravante, D. J., et al. Learning neurosymbolic world models with conversational proprioception. In Annual Meeting of the Association for Computational Linguistics, 2023.
Aguas, J. S., et al. Synthesis of procedural models for deterministic transition systems. arXiv:2307.14368, 2023.
Aha, D. W.. Goal reasoning: Foundations, emerging applications, and prospects. AI Magazine, 2018.
Ahn, M., et al. Do as I can, not as I say: Grounding language in robotic affordances. arXiv:2204.01691, 2022.
Aineto, D., et al. Learning STRIPS action models with classical planning. AIJ, 2019.
Bouhsain, S. Ait, et al. Simultaneous action and grasp feasibility prediction for task and motion planning through multitask learning. In IROS, 2023.
Bouhsain, S. Ait, et al. Extending task and motion planning with feasibility prediction: Towards multi-robot manipulation planning of realistic objects. In IROS, 2024.
Akkaya, I., et al. Solving Rubik’s cube with a robot hand. arXiv:1910.07113, 2019.
Alami, R., et al. A geometrical approach to planning manipulation tasks: The case of discrete placements and grasps. In ISRR, 1989.
Albore, A. and Bertoli, P.. Generating safe assumption-based plans for partially observable, nondeterministic domains. In AAAI, 2004.
Alford, R., et al. Translating HTNs to PDDL: A small amount of domain knowledge can go a long way. In IJCAI, 2009.
Alford, R., et al. Plan aggregation for strong cyclic planning in nondeterministic domains. AIJ, 2014.
Alford, R., et al. On the feasibility of planning graph style heuristics for HTN planning. In ICAPS, 2014.
Alford, R., et al. Tight bounds for HTN planning with task insertion. In IJCAI, 2015.
Alford, R., et al. Bound to plan: Exploiting classical heuristics via automatic translations of tail-recursive HTN problems. In ICAPS, 2016.
Allen, J.. Towards a general theory of action and time. AIJ, 1984.
Allen, J.. Temporal reasoning and planning. In Reasoning about Plans. Morgan Kaufmann, 1991.
Allen, J. F.. Maintaining knowledge about temporal intervals. CACM, 1983.
Allen, J. F.. Planning as temporal reasoning. In KR, 1991.
Allen, J. F. and Koomen, J. A.. Planning using a temporal world model. In IJCAI, 1983.
Altman, E.. Constrained Markov Decision Processes. CRC Press, 1999.
Aluru, S.. Lagged Fibonacci random number generators for distributed memory parallel computers. Jour. of Parallel and Distributed Computing, 1997.
Ambros-Ingerson, J. A. and Steel, S.. Integrating planning, execution and monitoring. In AAAI, 1988.
Amir, E. and Chang, A.. Learning partially observable deterministic action models. JAIR, 2008.
Anderson, G., et al. Neurosymbolic reinforcement learning with formally verified exploration. In NeurIPS, 2020.
Andre, D. and Russell, S. J.. State abstraction for programmable reinforcement learning agents. In AAAI, 2002.
Andre, D., et al. Generalized prioritized sweeping. In NeurIPS, 1997.
Andreas, J., et al. Modular multitask reinforcement learning with policy sketches. In ICML, 2017.
Andrychowicz, M., et al. Hindsight experience replay. In NeurIPS, 2017.
Angluin, D., et al. Learning regular languages via alternating automata. In IJCAI, 2015.
Ans, B., et al. Self-refreshing memory in artificial neural networks: Learning temporal sequences without catastrophic forgetting. Connection Science, 2004.
Anthropic AI. The Claude-3 model family: Opus, Sonnet, Haiku, 2024. Online report.
Antonelli, G., et al. Underwater robotics. In Handbook of Robotics. Springer, 2008.
Araya-Lopez, M., et al. A closer look at MOMDPs. In ICTAI, 2010.
Arfaee, S. J., et al. Learning heuristic functions for large state spaces. AIJ, 2011.
Argall, B. D., et al. A survey of robot learning from demonstration. RAS, 2009.
Arjona-Medina, J. A., et al. RUDDER: Return decomposition for delayed rewards. In NeurIPS, 2019.
Arora, S. and Doshi, P.. A survey of inverse reinforcement learning: Challenges, methods and progress. AIJ, 2021.
Arulkumaran, K., et al. A brief survey of deep reinforcement learning. arXiv:1708.05866, 2017.
Arumugam, D., et al. Deep reinforcement learning from policy-dependent human feedback. arXiv:1902.04257, 2019.
Arumugam, D., et al. An information-theoretic perspective on credit assignment in reinforcement learning. arXiv:2103.06224, 2021.
Cruz, C. Arzate and Igarashi, T.. A survey on interactive reinforcement learning: Design principles and open challenges. In ACM Designing Interactive Systems Conf., 2020.
Asai, M.. Unsupervised grounding of plannable first-order logic representation from images. In ICAPS, 2019.
Asai, M. and Fukunaga, A.. Classical planning in deep latent space: Bridging the subsymbolic-symbolic boundary. In AAAI, 2018.
Åström, K. J.. Optimal control of Markov decision processes with incomplete state estimation. J. of Mathematical Analysis and Applications, 1965.
Atramentov, A. and LaValle, S. M.. Efficient nearest neighbor searching for motion planning. In ICRA, 2002.
Attia, A. and Dayan, S.. Global overview of imitation learning. arXiv:1801.06503, 2018.
Au, T.-C. and Nau, D. S.. The incompleteness of planning with volatile external information. In ECAI, 2006.
Au, T.-C., et al. On the complexity of plan adaptation by derivational analogy in a universal classical planning framework. In European Conf. on Case-Based Reasoning, 2002.
Ayan, N. F., et al. HOTRiDE: Hierarchical ordered task replanning in dynamic environments. In ICAPS Wksh. on Planning and Plan Execution for Real-World Systems, 2007.
Baader, F., et al., editors. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge Univ. Press, 2003.
BAAI, P.. Plan4MC: Skill reinforcement learning and planning for open-world Minecraft tasks. arXiv:2303.16563, 2023.
Bacchus, F. and Kabanza, F.. Using temporal logics to express search control knowledge for planning. AIJ, 2000.
Bacchus, F. and Yang, Q.. The downward refinement property. In IJCAI, 1991.
Bäckström, C.. Planning in polynomial time: The SAS-PUB class. CI, 1991.
Bäckström, C. and Nebel, B.. Complexity results for SAS+ planning. In IJCAI, 1993.
Bäckström, C. and Nebel, B.. Complexity results for SAS+ planning. CI, 1995.
Badouel, E., et al. Petri Net Synthesis. Springer, 2015.
Badreddine, S. and Spranger, M.. Injecting prior knowledge for transfer learning into reinforcement learning algorithms using logic tensor networks. arXiv:1906.06576, 2019.
Badreddine, S., et al. Logic tensor networks. AIJ, 2022.
Bai, A. and Russell, S.. Efficient reinforcement learning with hierarchies of machines by leveraging internal transitions. In IJCAI, 2017.
Bai, T., et al. Temporal graph neural networks for social recommendation. In IEEE Internat. Conf. on Big Data, 2020.
Baier, J. A., et al. A heuristic search approach to planning with temporally extended preferences. AIJ, 2009.
Baier, J. A., et al. Diagnostic problem solving: A planning perspective. In KR, 2014.
Ball, M. and Holte, R. C.. The compression power of symbolic pattern databases. In ICAPS, 2008.
Bansod, Y., et al. HTN replanning from the middle. In FLAIRS, May 2022.
Baptiste, P., et al. Constraint-based scheduling and planning. In Handbook of Constraint Programming. Elsevier, 2006.
Barbier, M., et al. Implementation and flight testing of an onboard architecture for mission supervision. In Int. Unmanned Air Vehicle Systems Conf., 2006.
Barraquand, J., et al. Numerical potential field techniques for robot path planning. SMC, 1992.
Barraquand, J., et al. A random sampling scheme for path planning. IJRR, 1997.
Barrett, A. and Weld, D. S.. Characterizing subgoal interactions for planning. In IJCAI, 1993.
Barrett, A. and Weld, D. S.. Partial order planning: Evaluating possible efficiency gains. AIJ, 1994.
Barrett, A., et al. UCPOP user’s manual. Technical Report TR-93-09-06, Univ. of Washington, 1993.
Barrett, C., et al. The SMT-LIB standard: Version 2.0. In Int. Wksh. on Satisfiability Modulo Theories, 2010.
Barrett, S., et al. Transfer learning for reinforcement learning on a physical robot. In AAMAS Adaptive Learning Agents Wksh., 2010.
Barry, J., et al. DetH*: Approximate hierarchical solution of large Markov decision processes. In IJCAI, 2011.
Barták, R. and Toropila, D.. Reformulating constraint models for classical planning. In FLAIRS, 2008.
Barták, R. and Toropila, D.. Enhancing constraint models for planning problems. In FLAIRS, 2009.
Barták, R., et al. An Introduction to Constraint-Based Temporal Reasoning. Morgan & Claypool, 2014.
Barták, R., et al. Validation of hierarchical plans via parsing of attribute grammars. In ICAPS, 2018.
Barták, R., et al. A novel parsing-based approach for verification of hierarchical plans. In ICTAI, 2020.
Barták, R., et al. Correcting hierarchical plans by action deletion. In KR, 2021.
Barto, A. and Duff, M.. Monte Carlo matrix inversion and reinforcement learning. In NeurIPS, 1993.
Barto, A. G., et al. Associative search network: A reinforcement learning associative memory. Biological Cybernetics, 1981.
Barto, A. G., et al. Learning to act using real-time dynamic programming. AIJ, 1995.
Bauters, K., et al. CAN(PLAN)+: Extending the operational semantics of the BDI architecture to deal with uncertain information. In UAI, 2014.
Bauters, K., et al. Anytime algorithms for solving possibilistic MDPs and hybrid MDPs. In FoIKS, 2016.
Beetz, M.. Structured reactive controllers: Controlling robots that perform everyday activity. In Int. Conf. on Autonomous Agents, 1999.
Beetz, M. and McDermott, D.. Declarative goals in reactive plans. In AIPS, 1992.
Beetz, M. and McDermott, D.. Improving robot plans during their execution. In AIPS, 1994.
Behnke, G.. Hierarchical planning through propositional logic. PhD thesis, Ulm Univ., 2019.
Behnke, G., et al. This is a solution! (... but is it though?): Verifying solutions of hierarchical planning problems. In ICAPS, 2017.
Behnke, G., et al. totSAT – totally-ordered hierarchical planning through SAT. In AAAI, 2018.
Behnke, G., et al. Finding optimal solutions in HTN planning – a SAT-based approach. In IJCAI, 2019.
Bellman, R.. Dynamic Programming. Princeton Univ. Press, 1957.
Lamine, K. Ben and Kabanza, F.. Reasoning about robot actions: A model checking approach. In Advances in Plan-Based Control of Robotic Agents, 2002.
Bengio, Y., et al. A neural probabilistic language model. JMLR, 2003.
Bengio, Y., et al. Curriculum learning. In ICML, 2009.
Bercher, P., et al. Hybrid planning heuristics based on task decomposition graphs. In SOCS, 2014.
Bercher, P., et al. An admissible HTN planning heuristic. In IJCAI, 2017.
Berghammer, R. and Zierer, H.. Relational algebraic semantics of deterministic and nondeterministic programs. TCS, 1986.
Bernard, D., et al. Remote agent experiment DS1 technology validation report. Technical report, NASA, 2000.
Bernardinello, L. and Petrucci, L., editors. Int. Conf. on Applications and Theory of Petri Nets and Concurrency. Springer, 2022.
Bernardini, S. and Smith, D.. Finding mutual exclusion invariants in temporal planning domains. In IWPSS, 2011.
Bernardini, S. and Smith, D. E.. Developing domain-independent search control for Europa2. In HDIP, 2007.
Bernardini, S. and Smith, D. E.. Automatically generated heuristic guidance for Europa2. In i-SAIRAS, 2008.
Bernardini, S. and Smith, D. E.. Towards search control via dependency graphs in Europa2. In HDIP, 2009.
Berner, C., et al. Dota 2 with large scale deep reinforcement learning. arXiv:1912.06680, 2019.
Berthomieu, B., et al. The tool TINA – construction of abstract state spaces for Petri nets and time Petri nets. Int. Jour. of Production Research, 2004.
Bertoli, P., et al. MBP: A model based planner. In IJCAI Wksh. on Planning under Uncertainty and Incomplete Information, 2001.
Bertoli, P., et al. Planning in nondeterministic domains under partial observability via symbolic model checking. In IJCAI, 2001.
Bertoli, P., et al. A framework for planning with extended goals under partial observability. In ICAPS, 2003.
Bertoli, P., et al. Interleaving execution and planning for nondeterministic, partially observable domains. In ECAI, 2004.
Bertoli, P., et al. Strong planning under partial observability. AIJ, 2006.
Bertoli, P., et al. Automated composition of Web services via planning in asynchronous domains. AIJ, 2010.
Bertsekas, D.. Dynamic Programming and Optimal Control. Athena Scientific, 2001.
Bertsekas, D. and Tsitsiklis, J.. Neuro-Dynamic Programming. Athena Scientific, 1996.
Bertsekas, D. P. and Tsitsiklis, J. N.. An analysis of stochastic shortest path problems. Mathematics of Operations Research, 1991.
Betz, C. and Helmert, M.. Planning with h+ in theory and practice. In KI, 2009.
Beutner, R. and Finkbeiner, B.. Nondeterministic planning for hyperproperty verification. In ICAPS, 2024.
Bhandari, J. and Russo, D.. Global optimality guarantees for policy gradient methods. arXiv:1906.01786, 2019.
Bhatnagar, S., et al. Incremental natural actor-critic algorithms. In NeurIPS, 2007.
Bianchi, F., et al. Monte Carlo tree search planning for continuous action and state space. In Italian Wksh. on AI Robotics, 2022.
Bianchi, R. A., et al. Heuristic selection of actions in multiagent reinforcement learning. In IJCAI, 2007.
Bianchi, R. A., et al. Accelerating autonomous learning by using heuristic selection of actions. Jour. of Heuristics, 2008.
Bianchi, R. A., et al. Transferring knowledge as heuristics in reinforcement learning: A case-based approach. AIJ, 2015.
Bidot, J., et al. Plan repair in hybrid planning. In KI, 2008.
Bidot, J., et al. Geometric backtracking for combined task and motion planning in robotic systems. AIJ, 2015.
Bit-Monnot, A., et al. FAPE: A constraint-based planner for generative and hierarchical temporal planning. arXiv:2010.13121, 2020.
Biundo, S. and Schattenberg, B.. From abstract crisis to concrete relief – A preliminary report on combining state abstraction and HTN planning. In ECP, 2001.
Blum, A. and Langford, J.. Probabilistic planning in the Graphplan framework. In ECP, 1999.
Blum, A. L. and Furst, M. L.. Fast planning through planning graph analysis. AIJ, 1997.
Boddy, M. and Dean, T.. Solving time-dependent planning problems. In IJCAI, 1989.
Bodik, R., et al. Programming with angelic nondeterminism. In POPL, 2010.
Boeing, A. and Bräunl, T.. Evaluation of real-time physics simulation systems. In Int. Conf. on Computer Graphics and Interactive Techniques in Australia and Southeast Asia, 2007.
Bohren, J., et al. Towards autonomous robotic butlers: Lessons learned with the PR2. In ICRA, 2011.
Bommasani, R., et al. On the opportunities and risks of foundation models. arXiv:2108.07258, 2022.
Bonassi, L., et al. FOND planning for pure-past linear temporal logic goals. In ECAI, 2023.
Bonassi, L., et al. Planning for temporally extended goals in pure-past linear temporal logic. In ICAPS, 2023.
Bonet, B.. On the speed of convergence of value iteration on stochastic shortest-path problems. Mathematics of Operations Research, 2007.
Bonet, B. and Geffner, H.. Planning with incomplete information as heuristic search in belief space. In AIPS, 2000.
Bonet, B. and Geffner, H.. Planning as heuristic search. AIJ, 2001.
Bonet, B. and Geffner, H.. Faster heuristic search algorithms for planning with uncertainty and full feedback. In IJCAI, 2003.
Bonet, B. and Geffner, H.. Labeled RTDP: Improving the convergence of real-time dynamic programming. In ICAPS, 2003.
Bonet, B. and Geffner, H.. mGPT: A probabilistic planner based on heuristic search. JAIR, 2005.
Bonet, B. and Geffner, H.. Learning in depth-first search: A unified approach to heuristic search in deterministic, nondeterministic, probabilistic, and game tree settings. In ICAPS, 2006.
Bonet, B. and Geffner, H.. Solving POMDPs: RTDP-Bel vs. point-based algorithms. In IJCAI, 2009.
Bonet, B. and Geffner, H.. Action selection for MDPs: Anytime AO* versus UCT. In AAAI, 2012.
Bonet, B. and Geffner, H.. Belief tracking for planning with sensing: Width, complexity and approximations. JAIR, 2014.
Bonet, B. and Geffner, H.. Learning first-order symbolic planning representations from plain graphs. arXiv:1909.05546, 2019.
Bonet, B. and Helmert, M.. Strengthening landmark heuristics via hitting sets. In ECAI, 2010.
Bonet, B., et al. Directed unfolding of Petri nets. Trans. Petri Nets Other Model. Concurr., 2008.
Borst, C., et al. Rollin’ Justin – Mobile platform with variable base. In ICRA, 2009.
Bouguerra, A., et al. Semantic knowledge-based execution monitoring for mobile robots. In ICRA, 2007.
Boutilier, C., et al. Structured reachability analysis for Markov decision processes. In UAI, 1998.
Boutilier, C., et al. Decision-theoretic planning: Structural assumptions and computational leverage. JAIR, May 1999.
Boutilier, C., et al. Stochastic dynamic programming with factored representations. AIJ, 2000.
Boyan, J. and Littman, M.. Exact solutions to time-dependent MDPs. In NeurIPS, 2000.
Boyan, J. A. and Moore, A. W.. Learning evaluation functions to improve optimization by local search. JMLR, 2000.
Brafman, R. and Hoffmann, J.. Conformant planning via heuristic forward search: A new approach. In ICAPS, 2004.
Braunschweig, B. and Ghallab, M., editors. Reflections on Artificial Intelligence for Humanity. Springer, 2021.
Brenner, M. and Nebel, B.. Continual planning and acting in dynamic multiagent environments. JAAMAS, 2009.
Bridson, R.. Fluid Simulation for Computer Graphics. CRC Press, 2015.
Brohan, M. A. A., et al. Do as I can, not as I say: Grounding language in robotic affordances. arXiv:2204.01691, 2022.
Brooks, R. and Lozano-Pérez, T.. A subdivision algorithm in configuration space for findpath with rotation. In IJCAI, 1983.
Brown, N. and Sandholm, T.. Superhuman AI for multiplayer poker. Science, 2019.
Browne, C. B., et al. A survey of Monte Carlo tree search methods. IEEE Trans. on Computational Intelligence and AI in Games, 2012.
Brusoni, V., et al. A spectrum of definitions for temporal model-based diagnosis. AIJ, 1998.
Brusoni, V., et al. Qualitative and quantitative temporal constraints and relational databases: Theory, architecture, and applications. TDKE, 1999.
Bruyère, V., et al. Active learning of Mealy machines with timers. arXiv:2403.02019, 2024.
Brys, T., et al. Reinforcement learning from demonstration through shaping. In IJCAI, 2015.
Bubeck, S., et al. Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv:2303.12712, 2023.
Bucchiarone, A., et al. Domain objects for continuous context-aware adaptation of service-based systems. In ICWS, 2013.
Büchner, C., et al. Exploiting cyclic dependencies in landmark heuristics. In ICAPS, 2021.
Buffet, O. and Sigaud, O., editors. Markov Decision Processes in AI. Wiley, 2010.
Buro, M.. From simple features to sophisticated evaluation functions. In CG, 1998.
Busoniu, L., et al. Optimistic planning for sparsely stochastic systems. In IEEE Symp. on Adaptive Dynamic Programming and Reinforcement Learning, 2011.
Bylander, T.. Complexity results for extended planning. In AAAI, 1992.
Bylander, T.. The computational complexity of propositional STRIPS planning. AIJ, 1994.
Caldera, S., et al. Review of deep learning methods in robotic grasp detection. Multimodal Technologies and Interaction, 2018.
Camacho, A., et al. Finite LTL synthesis with environment assumptions and quality measures. In KR, 2018.
Cambon, S., et al. A hybrid approach to intricate motion, manipulation and task planning. IJRR, Jan. 2009.
Campari, T., et al. Online learning of reusable abstract models for object goal navigation. In CVPR, 2022.
Canny, J.. The Complexity of Robot Motion Planning. MIT Press, 1988.
Cao, Y. and Lee, C.. Robot behavior-tree-based task generation with large language models. arXiv:2302.12927, 2023.
Cao, Z., et al. Temporal video-language alignment network for reward shaping in reinforcement learning. arXiv:2302.03954, 2023.
Capitanelli, A. and Mastrogiovanni, F.. A framework to generate neurosymbolic PDDL-compliant planners. arXiv:2303.00438, 2023.
Carbonell, J., et al. PRODIGY: An integrated architecture for planning and learning. In Architectures for Intelligence. L. Erlbaum, 1990.
Cardoso, J. and Valette, R.. Petri Nets. Open Science, DOI:10.34849/zkrr-sn28, 2024.
Carta, T., et al. Grounding large language models in interactive environments with online reinforcement learning. In ICML, 2023.
Castano, L. and Xu, H.. Safe decision making for risk mitigation of UAS. In Int. Conf. on Unmanned Aircraft Systems, 2019.
Castellini, C., et al. Improvements to SAT-based conformant planning. In ECP, 2001.
Castellini, C., et al. SAT-based planning in complex domains: Concurrency, constraints and nondeterminism. AIJ, 2003.
Castillo, L., et al. Efficiently handling temporal knowledge in an HTN planner. In ICAPS, 2006.
Castillo, L., et al. Automatic generation of temporal planning domains for e-learning problems. Journal of Scheduling, 2010.
Certicky, M.. Real-time action model learning with online algorithm 3SG. Applied Artificial Intelligence, 2014.
Cesta, A. and Oddi, A.. Gaining efficiency and flexibility in the simple temporal problem. In TIME, 1996.
Cesta, A., et al. A constraint-based method for project scheduling with time windows. Jour. of Heuristics, 2002.
Champandard, A., et al. The AI for Killzone 2’s multiplayer bots. In GDC, 2009.
Chang, X., et al. A comprehensive survey of scene graphs: Generation and application. PAMI, 2021.
Charlesworth, H. J. and Montana, G.. Solving challenging dexterous manipulation tasks with trajectory optimisation and reinforcement learning. ML, 2021.
Charniak, E.. Introduction to Deep Learning. MIT Press, 2018.
Chaslot, G. M. J., et al. Progressive strategies for Monte-Carlo tree search. New Mathematics and Natural Computation, 2008.
Chatila, R., et al. Integrated planning and execution control of autonomous robot actions. In ICRA, 1992.
Chaumette, F. and Hutchinson, S.. Visual servoing and visual tracking. In Handbook of Robotics. Springer, 2008.
Chen, D. and Bercher, P.. Fully observable nondeterministic HTN planning – formalisation and complexity results. In ICAPS, 2021.
Chen, D. Z., et al. Learning domain-independent heuristics for grounded and lifted planning. In AAAI, 2024.
Chen, J., et al. Benchmarking large language models in retrieval-augmented generation. In AAAI, 2024.
Chen, L., et al. Decision transformer: Reinforcement learning via sequence modeling. In NeurIPS, 2021.
Chen, P. C. and Hwang, Y. K.. SANDROS: A dynamic graph search algorithm for motion planning. In ICRA, 1998.
Chen, Y., et al. AutoTAMP: Autoregressive task and motion planning with LLMs as translators and checkers. arXiv:2306.06531, 2023.
Chen, Z., et al. Graph neural network-based fault diagnosis: A review. arXiv:2111.08185, 2021.
Cheng, C.-A., et al. Heuristic-guided reinforcement learning. In NeurIPS, 2021.
Chignoli, M., et al. The MIT humanoid robot: Design, motion planning, and control for acrobatic behaviors. In Humanoids, 2021.
Chitnis, R., et al. Guided search for task and motion plans using learned heuristics. In ICRA, 2016.
Choset, H., et al. Principles of Robot Motion: Theory, Algorithms, and Implementations. MIT Press, 2005.
Christian, B.. The Alignment Problem: How Can Machines Learn Human Values? Atlantic Books, 2021.
Christiano, P. F., et al. Deep reinforcement learning from human preferences. In NeurIPS, 2017.
Cimatti, A., et al. A provably correct embedded verifier for the certification of safety critical software. In CAV, 1997.
Cimatti, A., et al. Automatic OBDD-based generation of universal plans in nondeterministic domains. In AAAI, 1998.
Cimatti, A., et al. Strong planning in non-deterministic domains via model checking. In AIPS, 1998.
Cimatti, A., et al. Weak, strong, and strong cyclic planning via symbolic model checking. AIJ, 2003.
Cimatti, A., et al. Solving temporal problems using SMT: Strong controllability. In CP, 2012.
Cimatti, A., et al. Solving temporal problems using SMT: Weak controllability. In AAAI, 2012.
Cimatti, A., et al. Strong temporal planning with uncontrollable durations: A state-space approach. In AAAI, Jan. 2015.
Claßen, J., et al. Platas – integrating planning and the action language Golog. In KI, 2012.
Claussmann, L., et al. A review of motion planning for highway autonomous driving. TITS, 2019.
Co-Reyes, J. D., et al. Self-consistent trajectory autoencoder: Hierarchical reinforcement learning with trajectory embeddings. In ICML, 2018.
Coates, A., et al. Apprenticeship learning for helicopter control. CACM, 2009.CrossRefGoogle Scholar
Cobo, L. C., et al. Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains. AIJ, 2014.CrossRefGoogle Scholar
Cohen, J.. Non-deterministic algorithms. CSUR, 1979.CrossRefGoogle Scholar
Coles, A. and Smith, A.. Marvin: A heuristic search planner with online macro-action learning. JAIR, 2007.CrossRefGoogle Scholar
Coles, A. I., et al. Planning with problems requiring temporal coordination. In AAAI, 2008.Google Scholar
Coles, A. J., et al. COLIN: Planning with continuous linear numeric change. JAIR, 2012.CrossRefGoogle Scholar
Colledanchise, M. and Natale, L.. Improving the parallel execution of behavior trees. In IROS, 2018.CrossRefGoogle Scholar
Colledanchise, M. and Natale, L.. Handling concurrency in behavior trees. TRO, 2022.CrossRefGoogle Scholar
Colledanchise, M. and Ögren, P.. Behavior Trees in Robotics and AI: An Introduction. CRC Press, 2018.CrossRefGoogle Scholar
Colledanchise, M., et al. Learning of behavior trees for autonomous agents. TG, 2019.CrossRefGoogle Scholar
Colledanchise, M., et al. Formalizing the execution context of behavior trees for runtime verification of deliberative policies. In IROS, 2021.
Conrad, P., et al. Flexible execution of plans with choice. In ICAPS, 2009.
Coradeschi, S. and Saffiotti, A.. Perceptual anchoring: A key concept for plan execution in embedded systems. In Advances in Plan-Based Control of Robotic Agents. Springer-Verlag, 2002.
Corke, P. I., et al. Mining robotics. In Handbook of Robotics. Springer, 2008.
Corrêa, R. C., et al. Insertion and sorting in a sequence of numbers minimizing the maximum sum of a contiguous subsequence. Jour. of Discrete Algorithms, 2013.
Couetoux, A.. Monte Carlo tree search for continuous and stochastic sequential decision making problems. PhD thesis, Université Paris Sud, 2013.
Coulom, R.. Efficient selectivity and backup operators in Monte-Carlo tree search. In CG, 2006.
Cram, D., et al. A complete chronicle discovery approach: Application to activity analysis. Expert Systems, 2012.
Cresswell, S., et al. Acquiring planning domain models using LOCM. KER, 2013.
Culberson, J. C. and Schaeffer, J.. Pattern databases. CI, 1998.
Currie, K. and Tate, A.. O-Plan: The open planning architecture. AIJ, 1991.
Cushing, W. and Kambhampati, S.. Replanning: A new perspective. In ICAPS, 2005. Poster.
Dagan, G., et al. Dynamic planning with a LLM. arXiv:2308.06391, 2023.
Dai, P. and Hansen, E. A.. Prioritizing Bellman backups without a priority queue. In ICAPS, 2007.
Dal Lago, U., et al. Planning with a language for extended goals. In AAAI, 2002.
Dalal, M., et al. Imitating task and motion planning with visuomotor transformers. arXiv:2305.16309, 2023.
Daniele, M., et al. Strong cyclic planning revisited. In ECP, Sept. 1999.
Dantam, N. T., et al. An incremental constraint-based framework for task and motion planning. IJRR, 2018.
Dasgupta, I., et al. Collaborating with language models for embodied reasoning. arXiv:2302.00763, 2023.
De Giacomo, G., et al. Automatic behavior composition synthesis. AIJ, 2013.
de Silva, L.. HTN acting: A formalism and an algorithm. In AAMAS, 2018.
de Silva, L. and Padgham, L.. A comparison of BDI based real-time reasoning and HTN based planning. In Australian Joint Conf. on AI, 2004.
de Silva, L., et al. The HATP hierarchical planner: Formalisation and an initial study of its usability and practicality. In IROS, 2015.
de Silva, L. D., et al. Towards combining HTN planning and geometric task planning. In RSS Wksh. on Combined Robot Motion Planning and AI Planning for Practical Applications, 2013.
de Souza, P. E. U., et al. MOMDP-based target search mission taking into account the human operator’s cognitive state. In ICTAI, 2015.
Dean, T. and Kanazawa, K.. A model for reasoning about persistence and causation. CI, 1989.
Dean, T. and Lin, S.-H.. Decomposition techniques for planning in stochastic domains. In IJCAI, 1995.
Dean, T. and McDermott, D.. Temporal data base management. AIJ, 1987.
Dean, T., et al. Hierarchical planning involving deadlines, travel time and resources. CI, 1988.
Dean, T., et al. Model reduction techniques for computing approximately optimal solutions for Markov decision processes. In UAI, 1997.
Dean, T. L. and Wellman, M.. Planning and Control. Morgan Kaufmann, 1991.
Dechter, R., et al. Temporal constraint networks. AIJ, 1991.
Degris, T., et al. Model-free reinforcement learning with continuous action in practice. In American Control Conf., 2012.
Deisenroth, M. P., et al. A survey on policy search for robotics. Foundations and Trends in Robotics, 2013.
Del Moral, P.. Nonlinear filtering: Interacting particle resolution. CRAS1-Mathematics, 1997.
Delamer, J.-A., et al. Safe path planning for UAV urban operation under GNSS signal occlusion risk. RAS, 2021.
den Hengst, F., et al. Planning for potential: Efficient safe reinforcement learning. ML, 2022.
den Hengst, F., et al. Reinforcement learning with option machines. In IJCAI, 2022.
D’Epenoux, F.. A probabilistic production and inventory problem. Management Science, 1963.
Derman, C.. Finite State Markovian Decision Processes. Academic Press, 1970.
Despouys, O. and Ingrand, F.. Propice-Plan: Toward a unified framework for planning and execution. In ECP, 1999.
Diaz, M., editor. Petri Nets: Fundamental Models, Verification and Applications. Wiley, 2009.
Dietterich, T. G.. Hierarchical reinforcement learning with the MaxQ value function decomposition. JAIR, 2000.
Do, M. B. and Kambhampati, S.. Planning as constraint satisfaction: Solving the planning graph by compiling it into CSP. AIJ, 2001.
Do, M. B. and Kambhampati, S.. Sapa: A domain-independent heuristic metric temporal planner. In ECP, 2001.
Doherty, P. and Kvarnström, J.. TALplanner: A temporal logic based planner. AIMag, 2001.
Doherty, P., et al. A temporal logic-based planning and execution monitoring framework for unmanned aircraft systems. JAAMAS, 2009.
Donald, B. R. and Xavier, P.. Provably good approximation algorithms for optimal kinodynamic planning (1 & 2). Algorithmica, 1995.
Dong, H., et al. Neural logic machines. arXiv:1904.11694, 2019.
Doran, J. E. and Michie, D.. Experiments with the graph traverser program. Proc. of the Royal Society A: Mathematical, Physical and Engineering Sciences, 1966.
Dorf, R. C. and Bishop, R. H.. Modern Control Systems. Prentice Hall, 2010.
Dornhege, C., et al. Semantic attachments for domain-independent planning systems. In ICAPS, 2009.
Dousson, C. and Le Maigat, P.. Chronicle recognition improvement using temporal focusing and hierarchization. In IJCAI, 2007.
Dousson, C., et al. Situation recognition: Representation and algorithms. In IJCAI, 1993.
Drakengren, T. and Jonsson, P.. Eight maximal tractable subclasses of Allen’s algebra with metric time. JAIR, 1997.
Driess, D., et al. Deep visual heuristics: Learning feasibility of mixed-integer programs for manipulation planning. In ICRA, 2020.
Driess, D., et al. PaLM-E: An embodied multimodal language model. arXiv:2303.03378, 2023.
Drougard, N., et al. Qualitative possibilistic mixed-observable MDPs. arXiv:1309.6826, 2013.
Du, Y., et al. Guiding pretraining in reinforcement learning with large language models. In ICML, 2023.
Duan, H., et al. Sim-to-real learning of footstep-constrained bipedal dynamic walking. arXiv:2203.07589, 2022.
Dutta, S., et al. Frugal LMs trained to invoke symbolic solvers achieve parameter-efficient arithmetic reasoning. In AAAI, 2024.
Dvorak, F., et al. A flexible ANML actor and planner in robotics. In ICAPS Wksh. on Planning and Robotics, 2014.
Eaton, J. H. and Zadeh, L. A.. Optimal pursuit strategies in discrete state probabilistic systems. Trans. of the ASME, 1962.
Ecoffet, A., et al. Go-Explore: A new approach for hard-exploration problems. arXiv:1901.10995, 2021.
Edelkamp, S.. Planning with pattern databases. In ECP, 2001.
Edelkamp, S.. Symbolic pattern databases in heuristic search planning. In AIPS, 2002.
Edelkamp, S.. Taming numbers and durations in the model checking integrated planning system. JAIR, 2003.
Effinger, R., et al. Dynamic execution of temporally and spatially flexible reactive programs. In AAAI Wksh. on Bridging the Gap between Task and Motion Planning, 2010.
El-Kholy, A. and Richard, B.. Temporal and resource reasoning in planning: The ParcPlan approach. In ECAI, 1996.
Elkawkagy, M., et al. Improving hierarchical planning performance by the use of landmarks. In AAAI, 2021.
Emerson, E. A.. Temporal and modal logic. In Handbook of Theoretical Computer Science. Elsevier, 1990.
Erdem, E., et al. Answer set programming for collaborative housekeeping robotics: Representation, reasoning, and execution. Intelligent Service Robotics, Oct. 2012.
Erol, K., et al. Semantics for hierarchical task-network planning. Technical Report CS TR-3239, Univ. of Maryland, 1994.
Erol, K., et al. UMCP: A sound and complete procedure for hierarchical task-network planning. In AIPS, June 1994.
Erol, K., et al. Complexity, decidability and undecidability results for domain-independent planning. AIJ, 1995.
Erol, K., et al. Complexity results for HTN planning. AMAI, 1996.
Escalada-Imaz, G. and Ghallab, M.. A practically efficient and almost linear unification algorithm. AIJ, 1988.
Estlin, T. A. and Mooney, R. J.. Learning to improve both efficiency and quality of planning. In IJCAI, 1997.
Estlin, T. A., et al. An argument for a hybrid HTN/operator-based approach to planning. In ECP, 1997.
Estrada, C., et al. Hierarchical SLAM: Real-time accurate mapping of large environments. TRA, 2005.
Etzioni, O., et al. An approach to planning with incomplete information. In KR, 1992.
European Commission High-Level Expert Group on Artificial Intelligence. Ethics Guidelines for Trustworthy AI, 2019. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
Eyerich, P., et al. Using the context-enhanced additive heuristic for temporal and numeric planning. In ICAPS, 2009.
Fakoor, R., et al. Meta-Q-learning. arXiv:1910.00125, 2019.
Fan, J., et al. A theoretical analysis of deep Q-learning. arXiv:1901.00137, 2020.
Fan, Y., et al. Heterogeneous temporal graph neural network. In SIAM Int. Conf. on Data Mining (SDM), 2022.
Farahbod, R., et al. Specification and validation of the business process execution language for web services. In Int. Wksh. on Abstract State Machines, 2004.
Fargier, H., et al. Using temporal constraint networks to manage temporal scenarios of multimedia documents. In ECAI Wksh. on Spatial and Temporal Reasoning, 1998.
Faure, F., et al. SOFA: A multi-model framework for interactive physical simulation. In Soft Tissue Biomechanical Modeling for Computer Assisted Surgery. Springer, 2012.
Feng, Z. and Hansen, E. A.. Symbolic heuristic search for factored Markov decision processes. In AAAI, 2002.
Feng, Z., et al. Symbolic generalization for on-line planning. In UAI, 2002.
Feng, Z., et al. Dynamic programming for structured continuous Markov decision problems. In AAAI, 2004.
Ferber, P., et al. Neural network heuristics for classical planning: A study of hyperparameter space. In ECAI, 2020.
Ferber, P., et al. Neural network heuristic functions for classical planning: Bootstrapping and comparison to other methods. In ICAPS, 2022.
Ferguson, D., et al. Motion planning in urban environments. JFR, 2008.
Ferguson, D. I. and Stentz, A.. Focussed propagation of MDPs for path planning. In ICTAI, 2004.
Fernández, F. and Veloso, M. M.. Probabilistic policy reuse in a reinforcement learning agent. In AAMAS, 2006.
Feron, E. and Johnson, E. N.. Aerial robotics. In Handbook of Robotics. Springer, 2008.
Ferraris, P. and Giunchiglia, E.. Planning as satisfiability in nondeterministic domains. In AAAI, 2000.
Ferrein, A. and Lakemeyer, G.. Logic-based robot control in highly dynamic domains. RAS, 2008.
Ferrer-Mestres, J., et al. Combined task and motion planning as classical AI planning. arXiv:1706.06927, 2017.
Ferret, J., et al. Credit assignment as a proxy for transfer in reinforcement learning. arXiv:1907.08027, 2019.
Ferrucci, D. A., et al. Building Watson: An overview of the DeepQA project. AIMag, 2010.
Fichtner, M., et al. Intelligent execution monitoring in dynamic environments. Fundamenta Informaticae, 2003.
Fikes, R. E.. Monitored execution of robot plans produced by STRIPS. In IFIP Congress, 1971.
Fikes, R. E. and Nilsson, N. J.. STRIPS: A new approach to the application of theorem proving to problem solving. AIJ, 1971.
Fikes, R. E., et al. Learning and executing generalized robot plans. AIJ, 1972.
Finn, C., et al. Guided cost learning: Deep inverse optimal control via policy optimization. In ICML, 2016.
Finzi, A., et al. Open world planning in the situation calculus. In AAAI, 2000.
Firby, R. J.. An investigation into reactive planning in complex domains. In AAAI, 1987.
Fisher, M., et al., editors. Handbook of Temporal Reasoning in Artificial Intelligence. Elsevier, 2005.
Flórez-Puga, G., et al. Query-enabled behavior trees. IEEE Trans. on Computational Intelligence and AI in Games, 2009.
Floyd, R. W.. Nondeterministic algorithms. JACM, 1967.
Foka, A. and Trahanias, P.. Real-time hierarchical POMDPs for autonomous robot navigation. RAS, 2007.
Forestier, J. P. and Varaiya, P.. Multilayer control of large Markov chains. IEEE Trans. on Automatic Control, 1978.
Fox, M. and Long, D.. Utilizing automatically inferred invariants in graph construction and search. In AIPS, 2000.
Fox, M. and Long, D.. PDDL2.1: An extension to PDDL for expressing temporal planning domains. JAIR, 2003.
Fox, M., et al. Plan stability: Replanning versus plan repair. In ICAPS, 2006.
Frank, J. and Jónsson, A. K.. Constraint-based attribute and interval planning. Constraints, 2003.
Frank, J., et al. When gravity fails: Local search topology. JAIR, 1997.
Fraser, G., et al. Plan execution in dynamic environments. In Int. Cognitive Robotics Wksh., 2004.
Fratini, S., et al. APSI-based deliberation in goal oriented autonomous controllers. In Symposium on Advances in Space Technologies in Robotics and Automation, 2011.
Fu, J., et al. Simple and fast strong cyclic planning for fully-observable nondeterministic planning problems. In IJCAI, 2011.
Fusier, F., et al. Video understanding for complex activity recognition. Machine Vision and Applications, 2007.
Future of Life Institute. Lethal Autonomous Weapons Pledge, 2018.
Gao, Y., et al. Retrieval-augmented generation for large language models: A survey. arXiv:2312.10997, 2023.
Garcia, C. E., et al. Model predictive control: Theory and practice – a survey. Automatica, 1989.
Garcia, F. and Laborie, P.. Hierarchisation of the search space in temporal planning. In European Wksh. on Planning, 1995.
García-Martínez, R. and Borrajo, D.. An integrated approach of learning, planning, and execution. JIRS, 2000.
Garnelo, M., et al. Towards deep symbolic reinforcement learning. arXiv:1609.05518, 2016.
Garrett, C. R., et al. Backward-forward search for manipulation planning. In IROS, 2015.
Garrett, C. R., et al. Learning to rank for synthesizing planning heuristics. In IJCAI, 2016.
Garrett, C. R., et al. FFRob: Leveraging symbolic planning for efficient task and motion planning. arXiv:1608.01335, 2016.
Garrett, C. R., et al. Integrated task and motion planning. ARCRAS, 2021.
Garrido, A.. A temporal planning system for level 3 durative actions of PDDL2.1. In AIPS Wksh. on Planning for Temporal Domains, 2002.
Garrido, A. and Jiménez, S.. Learning temporal action models via constraint programming. In ECAI, 2020.
Gedicke, T., et al. FLAP for CAOS: Forward-looking active perception for clutter-aware object search. IFAC Symposium on Intelligent Autonomous Vehicles, 2016.
Geffner, H.. Functional STRIPS: A more flexible language for planning and problem solving. In Logic-Based Artificial Intelligence. Kluwer, 2000.
Geffner, H. and Bonet, B.. A Concise Introduction to Models and Methods for Automated Planning. Morgan & Claypool, 2013.
Geffner, T. and Geffner, H.. Compact policies for fully observable non-deterministic planning as SAT. In ICAPS, 2018.
Gehring, C., et al. Reinforcement learning for classical planning: Viewing heuristics as dense reward generators. In ICAPS, 2022.
Geib, C. and Goldman, R. P.. A probabilistic plan recognition algorithm based on plan tree grammars. AIJ, 2009.
Geier, T. and Bercher, P.. On the decidability of HTN planning with task insertion. In IJCAI, 2011.
Gerevini, A. and Schubert, L.. Accelerating partial-order planners: Some techniques for effective search control and pruning. JAIR, 1996.
Gerevini, A. and Schubert, L.. Discovering state constraints in DISCOPLAN: Some new results. In AAAI, Aug. 2000.
Gerevini, A. and Serina, I.. LPG: A planner based on local search for planning graphs. In AIPS, 2002.
Gerevini, A., et al. Planning through stochastic local search and temporal action graphs in LPG. JAIR, 2003.
Gerevini, A., et al. Integrating planning and temporal reasoning for domains with durations and time windows. In IJCAI, 2005.
Gerevini, A., et al. Combining domain-independent planning and HTN planning: The Duet planner. In ECAI, 2008.
Ghahramani, Z.. Learning dynamic Bayesian networks. In Summer School on Neural Networks, 1997.
Ghallab, M.. On chronicles: Representation, on-line recognition and learning. In KR, 1996.
Ghallab, M.. Responsible AI: Requirements and challenges. AI Perspectives, 2019.
Ghallab, M. and Laruelle, H.. Representation and control in IxTeT, a temporal planner. In AIPS, 1994.
Ghallab, M. and Mounir-Alaoui, A.. Managing efficiently temporal relations through indexed spanning trees. In IJCAI, 1989.
Ghallab, M., et al. Dealing with time in planning and execution monitoring. In ISRR, 1987.
Ghallab, M., et al. PDDL – the planning domain definition language. Technical Report TR-98-003/TR-1165, Yale Center for Computational Vision and Control, 1998.
Ghallab, M., et al. Automated Planning: Theory and Practice. Morgan Kaufmann, Oct. 2004.
Ghallab, M., et al. Automated Planning and Acting. Cambridge University Press, 2016.
Gharbi, M., et al. Combining symbolic and geometric planning to synthesize human-aware plans: Toward more efficient combined search. In IROS, 2015.
Ghiasi, G., et al. Scaling open-vocabulary image segmentation with image-level labels. In ECCV, 2022.
Ghosh, A., et al. Exploring the frontier of vision-language models: A survey of current methodologies and future directions. arXiv:2404.07214, 2024.
Ghosh, S., et al. ITS: An efficient limited-memory heuristic tree search algorithm. In AAAI, 1994.
Giacomo, G. D. and Favorito, M.. Compositional approach to translate LTLf/LDLf into deterministic finite automata. In ICAPS, 2021.
Giacomo, G. D. and Rubin, S.. Automata-theoretic foundations of FOND planning for LTLf and LDLf goals. In IJCAI, 2018.
Giacomo, G. D. and Vardi, M. Y.. Linear temporal logic and linear dynamic logic on finite traces. In IJCAI, 2013.
Giacomo, G. D., et al. Timed trace alignment with metric temporal logic over finite traces. In KR, 2021.
Giacomo, G. D., et al. LTLf synthesis as AND-OR graph search: Knowledge compilation at work. In IJCAI, 2022.
Gil, Y.. Learning by experimentation: Incremental refinement of incomplete planning domains. In ICML, 1994.
Gini, M. L., et al. Advances in autonomous robots for service and entertainment. RAS, 2010.
Giunchiglia, E.. Planning as satisfiability with expressive action languages: Concurrency, constraints and nondeterminism. In KR, 2000.
Giunchiglia, F.. Using ABSTRIPS abstractions – where do we stand? AI Review, 1999.
Giunchiglia, F. and Traverso, P.. Planning as model checking. In ECP, Sept. 1999.
Givan, R., et al. Equivalence notions and model minimization in Markov decision processes. AIJ, 2003.
Glaese, A., et al. Improving alignment of dialogue agents via targeted human judgements. arXiv:2209.14375, 2022.
Gold, E. M.. Complexity of automaton identification from given data. Inf. Control., 1978.
Golden, K., et al. Omnipotence without omniscience: Efficient sensor management for planning. In AAAI, 1994.
Goldman, R.. A semantics for HTN methods. In ICAPS, 2009.
Goldman, R., et al. Hard real-time mode logic synthesis for hybrid control: A CIRCA-based approach. In AAAI Spring Symposium on Hybrid Systems and AI, Mar. 1999. AAAI Tech. Report SS-99-05.
Goldman, R., et al. A comparative analysis of plan repair in HTN planning. In HPlan, 2024.
Goldman, R. P. and Kuter, U.. Hierarchical task network planning in Common Lisp: The case of SHOP3. In European Lisp Symposium, 2019.
Goldman, R. P., et al. Dynamic abstraction planning. In AAAI, 1997.
Goldman, R. P., et al. Using model checking to plan hard real-time controllers. In AIPS Wksh. on Model-Theoretic Approaches to Planning, 2000.
Goldman, R. P., et al. Stable plan repair for state-space HTN planning. In HPlan, 2020.
Golumbic, M. and Shamir, R.. Complexity and algorithms for reasoning about time: A graph-theoretic approach. JACM, 1993.
Goodfellow, I., et al. Deep Learning. MIT Press, 2016.
Goodfellow, I., et al. Generative adversarial networks. CACM, 2020.
Gopal, M.. Control Systems: Principles and Design. McGraw-Hill, 1963.
Goyal, N. and Steiner, D.. Graph neural networks for image classification and reinforcement learning using graph representations. arXiv:2203.03457, 2022.
Goyal, P., et al. Using natural language for reward shaping in reinforcement learning. arXiv:1903.02020, 2019.
Gragera, A., et al. A planning approach to repair domains with incomplete action effects. In ICAPS, 2023.
Grand, M., et al. TempAMLSI: Temporal action model learning based on STRIPS translation. In ICAPS, 2022.
Greenberg, I., et al. Train hard, fight easy: Robust meta reinforcement learning. In NeurIPS, 2023.
Gregory, P., et al. A meta-CSP model for optimal planning. In Abstraction, Reformulation, and Approximation. Springer, 2007.
Gregory, P., et al. Planning modulo theories: Extending the planning paradigm. In ICAPS, 2012.
Grondman, I., et al. A survey of actor-critic reinforcement learning: Standard and natural policy gradients. SMC, 2012.
Gu, S., et al. Continuous deep Q-learning with model-based acceleration. In ICML, 2016.
Gu, S., et al. TeaMs-RL: Teaching LLMs to teach themselves better instructions via reinforcement learning. arXiv:2403.08694, 2024.
Guan, L., et al. Leveraging pre-trained large language models to construct and utilize world models for model-based task planning. arXiv:2305.14909, 2023.
Guestrin, C., et al. Efficient solution algorithms for factored MDPs. JAIR, 2003.
Guestrin, C., et al. Solving factored MDPs with continuous and discrete variables. In UAI, 2004.
Guez, A. and Pineau, J.. Multi-tasking SLAM. In ICRA, 2010.
Guizzo, E.. Kiva Systems. IEEE Spectrum, July 2008.
Gupta, A., et al. Relay policy learning: Solving long-horizon tasks via imitation and reinforcement learning. arXiv:1910.11956, 2019.
Gutstein, S. and Stump, E.. Reduction of catastrophic forgetting with transfer learning and ternary output codes. In IJCNN, 2015.
Hafiz, A.. A survey of deep Q-networks used for reinforcement learning: State of the art. Intelligent Communication Technologies and Virtual Mobile Networks, 2023.
Hägele, M., et al. Industrial robotics. In Handbook of Robotics. Springer, 2008.
Hahn, M.. Theoretical limitations of self-attention in neural sequence models. arXiv:1906.06755, 2020.
Hähnel, D., et al. GOLEX – bridging the gap between logic (GOLOG) and a real robot. In KI, 1998.
Hamilton, W. L.. Graph Representation Learning. Morgan & Claypool, 2020.
Hammond, K. J.. Explaining and repairing plans that fail. AIJ, 1990.
Hanks, S. and Firby, R. J.. Issues and architectures for planning and execution. In Wksh. on Innovative Approaches to Planning, Scheduling and Control, 1990.
Hanks, S. and Weld, D. S.. A domain-independent algorithm for plan adaptation. JAIR, 1995.
Hansen, E. A.. Indefinite-horizon POMDPs with action-based termination. In AAAI, 2007.
Hansen, E. A.. Suboptimality bounds for stochastic shortest path problems. In UAI, 2011.
Hansen, E. A. and Zhou, R.. Anytime heuristic search. JAIR, 2007.
Hansen, E. A. and Zilberstein, S.. LAO*: A heuristic search algorithm that finds solutions with loops. AIJ, 2001.
Hart, P. and Knoll, A.. Graph neural networks and reinforcement learning for behavior generation in semantic environments. In IEEE Intelligent Vehicles Symposium, 2020.
Hart, P. E., et al. A formal basis for the heuristic determination of minimum cost paths. SMC, 1968.
Hart, P. E., et al. Correction to a formal basis for the heuristic determination of minimum cost paths. SIGART Bulletin, 1972.
Harutyunyan, A., et al. Hindsight credit assignment. In NeurIPS, 2019.
Hasanbeig, M., et al. DeepSynth: Automata synthesis for automatic task segmentation in deep reinforcement learning. In AAAI, 2021.
Haslum, P.. Admissible makespan estimates for PDDL2.1 temporal planning. In HDIP, 2009.
Haslum, P. and Geffner, H.. Admissible heuristics for optimal planning. In AIPS, 2000.
Haslum, P. and Geffner, H.. Heuristic planning with time and resources. In ECP, 2001.
Haslum, P., et al. New admissible heuristics for domain-independent planning. In AAAI, 2005.
Haslum, P., et al. Domain-independent construction of pattern database heuristics for cost-optimal planning. In AAAI, 2007.
Haslum, P., et al. Extending classical planning with state constraints: Heuristics and search for optimal planning. JAIR, 2018.
Haslum, P., et al. An Introduction to the Planning Domain Definition Language. Synthesis Lectures on AI and ML. Morgan & Claypool, 2019.
Hauser, K.. Task planning with continuous actions and nondeterministic motion planning queries. In AAAI Wksh. on Bridging the Gap between Task and Motion Planning, 2010.
Hauser, K. and Latombe, J.-C.. Integrating task and PRM motion planning: Dealing with many infeasible motion planning queries. In ICAPS, 2009.
Hauskrecht, M. and Kveton, B.. Linear program approximations for factored continuous-state Markov decision processes. In NeurIPS, 2003.
Hauskrecht, M., et al. Hierarchical solution of Markov decision processes using macro-actions. In UAI, 1998.
Hawes, N.. A survey of motivation frameworks for intelligent systems. AIJ, 2011.
Hayet, J.-B., et al. Motion planning for maintaining landmarks visibility with a differential drive robot. RAS, 2014.
Heintz, F., et al. Bridging the sense-reasoning gap: DyKnow – stream-based middleware for knowledge processing. Advanced Engineering Informatics, 2010.
Helmert, M.. Decidability and undecidability results for planning with numerical state variables. In AIPS, 2002.
Helmert, M.. The Fast Downward planning system. JAIR, 2006.
Helmert, M.. Concise finite-domain representations for PDDL planning tasks. AIJ, 2009.CrossRefGoogle Scholar
Helmert, M. and Domshlak, C.. Landmarks, critical paths and abstractions: What’s the difference anyway? In ICAPS, 2009.
Helmert, M. and Geffner, H.. Unifying the causal graph and additive heuristics. In ICAPS, 2008.
Helmert, M., et al. Flexible abstraction heuristics for optimal sequential planning. In ICAPS, 2007.
Helmert, M., et al. Explicit-state abstraction: A new method for generating heuristic functions. In AAAI, 2008.
Helmert, M., et al. Merge-and-shrink abstraction: A method for generating lower bounds in factored state spaces. JACM, 2014.
Helmert, M., et al. On the complexity of heuristic synthesis for satisficing classical planning: Potential heuristics and beyond. In ICAPS, 2022.
Hérail, P.. Learning hierarchical models from demonstrations for deliberate planning and acting. PhD thesis, Univ. of Toulouse, 2024.
Hester, T. and Stone, P.. TEXPLORE: Real-time sample-efficient reinforcement learning for robots. ML, 2013.
Hester, T., et al. Deep Q-learning from demonstrations. In AAAI, 2018.
Hitzler, P.. A review of the semantic web field. CACM, 2021.
Hitzler, P. and Wendt, M.. A uniform approach to logic programming semantics. Theory and Practice of Logic Programming, 2005.
Ho, J. and Ermon, S.. Generative adversarial imitation learning. In NeurIPS, 2016.
Hoang, H., et al. Hierarchical plan representations for encoding strategic game AI. In AAAI Conference on AI and Interactive Digital Entertainment, 2005.
Hoey, J., et al. SPUDD: Stochastic planning using decision diagrams. In UAI, 1999.
Hoffmann, J.. The metric-FF planning system: Translating “ignoring delete lists” to numeric state variables. JAIR, 2003.
Hoffmann, J.. Where “ignoring delete lists” works: Local search topology in planning benchmarks. JAIR, 2005.
Hoffmann, J. and Brafman, R.. Contingent planning via heuristic forward search with implicit belief states. In ICAPS, 2005.
Hoffmann, J. and Nebel, B.. The FF planning system: Fast plan generation through heuristic search. JAIR, 2001.
Hoffmann, J., et al. Ordered landmarks in planning. JAIR, 2004.
Hogg, C., et al. Learning hierarchical task models from input traces. CI, 2016.
Höller, D.. Translating totally ordered HTN planning problems to classical planning problems using regular approximation of context-free languages. In ICAPS, 2021.
Höller, D., et al. Assessing the expressivity of planning formalisms through the comparison to formal languages. In ICAPS, 2016.
Höller, D., et al. A generic method to guide HTN progression search with classical heuristics. In ICAPS, 2018.
Höller, D., et al. HDDL: An extension to PDDL for expressing hierarchical planning problems. In AAAI, 2020.
Höller, D., et al. HTN plan repair via model transformation. In KI, 2020.
Höller, D., et al. Compiling HTN plan verification problems into HTN planning problems. In ICAPS, 2022.
Hongeng, S., et al. Video-based event recognition: Activity representation and probabilistic recognition methods. Computer Vision and Image Understanding, 2004.
Hooker, J. N.. Operations research methods in constraint programming. In Handbook of Constraint Programming. Elsevier, 2006.
Horling, B., et al. Distributed sensor network for real time tracking. In AAMAS, 2001.
Horowitz, E., Sahni, S., and Rajasekaran, S.. Computer Algorithms. W. H. Freeman, 1996.
Howard, R. A.. Dynamic Probabilistic Systems. Wiley, 1971.
Hu, H. and Sadigh, D.. Language instructed reinforcement learning for human-AI coordination. In ICML, 2023.
Hu, W., et al. Bidirectional projection network for cross dimension scene understanding. In CVPR, 2021.
Hu, Y., et al. What can knowledge bring to machine learning?—a survey of low-shot learning for structured data. ACM Trans. on Intelligent Systems and Technology, 2022.
Huang, B., et al. AdaRL: What, where, and how to adapt in transfer reinforcement learning. In ICLR, 2022.
Huang, J. and Chang, K. C.-C.. Towards reasoning in large language models: A survey. arXiv:2212.10403, 2022.
Huang, R., et al. An optimal temporally expressive planner: Initial results and application to P2P network optimization. In ICAPS, 2009.
Huang, R., et al. SAS+ planning as satisfiability. JAIR, 2012.
Huang, W., et al. Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In ICML, 2022.
Hwang, I., et al. A survey of fault detection, isolation, and reconfiguration methods. IEEE Trans. on Control Systems Technology, 2010.
Ibaraki, T.. Theoretical comparison of search strategies in branch and bound. Int. Journal of Computer and Information Sciences, 1976.
Ichter, B., et al. Learning sampling distributions for robot motion planning. In ICRA, 2018.
Ilghami, O. and Nau, D. S.. A general approach to synthesize problem-specific planners. Technical report, Univ. of Maryland, CS-TR-4597, 2003.
Ingham, M. D., et al. A reactive model-based programming language for robotic space explorers. In i-SAIRAS, 2001.
Ingrand, F.. ProSkill: A formal skill language for acting in robotics. arXiv:2403.07770, 2024.
Ingrand, F. and Despouys, O.. Extending procedural reasoning toward robot actions planning. In ICRA, 2001.
Ingrand, F. and Ghallab, M.. Deliberation for autonomous robots: A survey. AIJ, 2017.
Ingrand, F., et al. PRS: A high level supervision and control language for autonomous mobile robots. In ICRA, 1996.
Ioffe, S. and Szegedy, C.. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
Iovino, M., et al. A survey of behavior trees in robotics and AI. RAS, 2022.
Iovino, M., et al. A framework for learning behavior trees in collaborative robotic applications. In IEEE Int. Conf. on Automation Science and Engineering, 2023.
Isla, D.. Handling complexity in the Halo 2 AI. In GDC, 2005.
Iwen, M. and Mali, A. D.. Distributed graphplan. In ICTAI, 2002.
Jahangirian, M., et al. Simulation in manufacturing and business: A review. European Jour. of Operational Research, 2010.
Jaillet, L. and Siméon, T.. A PRM-based motion planner for dynamically changing environments. In IROS, 2004.
Jensen, K.. Coloured Petri Nets. Springer, 1992.
Jensen, K., et al. Coloured Petri nets and CPN tools for modelling and validation of concurrent systems. Int. Jour. on Software Tools for Technology Transfer, 2007.
Jensen, R. and Veloso, M.. OBDD-based universal planning for synchronized agents in non-deterministic domains. JAIR, 2000.
Jensen, R., et al. Guided symbolic universal planning. In ICAPS, June 2003.
Jensen, R. M., et al. OBDD-based optimistic and strong cyclic adversarial planning. In ECP, 2001.
Ji, S., et al. 3D convolutional neural networks for human action recognition. PAMI, 2012.
Jiang, Y., et al. Language as an abstraction for hierarchical deep reinforcement learning. In NeurIPS, 2019.
Jiang, Z., et al. Active retrieval augmented generation. arXiv:2305.06983, 2023.
Jiménez, S., et al. A review of machine learning for automated planning. KER, 2012.
Jo, S. and Trummer, I.. SMART: Automatically scaling down language models with accuracy guarantees for reduced processing fees. arXiv:2403.13835, 2024.
Jónsson, A. K., et al. Planning in interplanetary space: Theory and practice. In AIPS, 2000.
Jonsson, P., et al. Computational complexity of relating time points and intervals. AIJ, 1999.
Jordan, M. and Perez, A.. Optimal bidirectional rapidly-exploring random trees. Technical report, MIT-CSAIL-TR-021, 2013.
Jovanović, M. and Voss, P.. Towards incremental learning in large language models: A critical review, 2024. Online report.
Juba, B. and Stern, R.. Learning probably approximately complete and safe action models for stochastic worlds. In AAAI, 2022.
Juba, B., et al. Safe learning of lifted action models. In KR, 2021.
Kabanza, F., et al. Planning control rules for reactive agents. AIJ, 1997.
Kaelbling, L. P.. Learning to achieve goals. In IJCAI, 1993.
Kaelbling, L. P. and Lozano-Perez, T.. Hierarchical task and motion planning in the now. In ICRA, 2011.
Kaelbling, L. P. and Lozano-Perez, T.. Integrated task and motion planning in belief space. IJRR, 2013.
Kaelbling, L. P. and Lozano-Perez, T.. Implicit belief-space pre-images for hierarchical planning and execution. In ICRA, 2016.
Kaelbling, L. P., et al. Reinforcement learning: A survey. JAIR, 1996.
Kaelbling, L. P., et al. Planning and acting in partially observable stochastic domains. AIJ, 1998.
Kakade, S. and Langford, J.. Approximately optimal approximate reinforcement learning. In ICML, 2002.
Kambhampati, S.. On the utility of systematicity: Understanding the trade-offs between redundancy and commitment in partial-order planning. In IJCAI, 1993.
Kambhampati, S.. Refinement planning as a unifying framework for plan synthesis. AIMag, 1997.
Kambhampati, S.. On the relations between intelligent backtracking and failure-driven explanation-based learning in constraint satisfaction and planning. AIJ, 1998.
Kambhampati, S.. Are we comparing Dana and Fahiem or SHOP and TLPlan? A critique of the knowledge-based planning track at ICP. In ICAPS Wksh. on the Competition, 2003.
Kambhampati, S.. Polanyi’s revenge and AI’s new romance with tacit knowledge. CACM, 2021.
Kambhampati, S. and Hendler, J. A.. A validation-structure-based theory of plan modification and reuse. AIJ, 1992.
Kambhampati, S. and Srivastava, B.. Universal classical planner: An algorithm for unifying state-space and plan-space planning. In ECP, 1995.
Kambhampati, S. and Yoon, S. W.. Explanation-based learning for planning. In Encyclopedia of Machine Learning. Springer, 2010.
Kambhampati, S., et al. Failure driven dynamic search control for partial order planners: An explanation based approach. AIJ, 1996.
Kambhampati, S., et al. Hybrid planning for partially hierarchical domains. In AAAI, 1998.
Kanoun, O., et al. Planning foot placements for a humanoid robot: A problem of inverse kinematics. IJRR, 2011.
Karabaev, E. and Skvortsova, O.. A heuristic search algorithm for solving first-order MDPs. In UAI, 2005.
Karaman, S. and Frazzoli, E.. Sampling-based algorithms for optimal motion planning. IJRR, 2011.
Karia, R. and Srivastava, S.. Learning generalized relational heuristic networks for model-agnostic planning. In AAAI, 2021.
Karlsson, L., et al. To secure an anchor – A recovery planning approach to ambiguity in perceptual anchoring. AI Communications, 2008.
Karpas, E., et al. Temporal landmarks: What must happen, and when. In ICAPS, 2015.
Katz, M. and Domshlak, C.. Optimal additive composition of abstraction-based admissible heuristics. In ICAPS, 2008.
Katz, M. and Domshlak, C.. Structural-pattern databases. In ICAPS, 2009.
Kautz, H. and Allen, J.. Generalized plan recognition. In AAAI, 1986.
Kautz, H. and Selman, B.. Pushing the envelope: Planning, propositional logic, and stochastic search. In AAAI, 1996.
Kautz, H. A., et al., editors. Synthesis and Planning, Dagstuhl Seminar Proceedings, 2006.
Kavraki, L. and Latombe, J.-C.. Randomized preprocessing of configuration for fast path planning. In ICRA, 1994.
Kazerooni, H.. Exoskeletons for human performance augmentation. In Handbook of Robotics. Springer, 2008.
Kearns, M., et al. A sparse sampling algorithm for near-optimal planning in large Markov decision processes. ML, 2002.
Kelleher, G. and Cohn, A. G.. Automatically synthesising domain constraints from operator descriptions. In ECAI, 1992.
Keller, T. and Eyerich, P.. PROST: Probabilistic planning based on UCT. In ICAPS, 2012.
Keller, T. and Helmert, M.. Trial-based heuristic tree search for finite horizon MDPs. In ICAPS, 2013.
Kerzner, H.. Project Management: A Systems Approach to Planning, Scheduling, and Controlling. Wiley, 2017.
Khan, M. A.-M., et al. A systematic review on reinforcement learning-based robotics within the last decade. IEEE Access, 2020.
Khatib, L., et al. Temporal constraint reasoning with preferences. In IJCAI, 2001.
Khatib, O.. The potential field approach and operational space formulation in robot control. In Adaptive and Learning Systems: Theory and Applications. Springer, 1986.
Khatib, O.. Real-time obstacle avoidance for manipulators and mobile robots. IJRR, 1986.
Kiesel, S. and Ruml, W.. Planning under temporal uncertainty using hindsight optimization. In ICAPS Wksh. on Planning and Robotics, 2014.
Kim, B. and Pineau, J.. Socially adaptive path planning in human environments using inverse reinforcement learning. Int. Jour. of Social Robotics, 2016.
Kingma, D. P. and Ba, J.. Adam: A method for stochastic optimization. arXiv:1412.6980, 2014.
Kingston, Z. K., et al. Sampling-based methods for motion planning with constraints. Annual Review of Control, Robotics, and Autonomous Systems, 2018.
Kiran, B. R., et al. Deep reinforcement learning for autonomous driving: A survey. TITS, 2021.
Kissmann, P. and Edelkamp, S.. Solving fully-observable non-deterministic planning problems via translation into a general game. In KI, 2009.
Klein, G. and Mouhoub, M.. Solving temporal constraints using neural networks. In Int. Conf. on Artificial Intelligence, 2002.
Klößner, T. and Hoffmann, J.. Pattern databases for stochastic shortest path problems. In SOCS, 2021.
Klößner, T., et al. Pattern databases for goal-probability maximization in probabilistic planning. In ICAPS, 2021.
Klößner, T., et al. Cost partitioning heuristics for stochastic shortest path problems. In ICAPS, 2022.
Klößner, T., et al. Cartesian abstractions and saturated cost partitioning in probabilistic planning. In ECAI, 2023.
Klößner, T., et al. A theory of merge-and-shrink for stochastic shortest path problems. In ICAPS, 2023.
Knight, R., et al. Casper: Space exploration through continuous planning. IEEE Intelligent Systems, 2001.
Knoblock, C. A.. Automatically generating abstractions for planning. AIJ, 1994.
Knoblock, C. A. and Yang, Q.. Relating the performance of partial-order planning algorithms to domain features. SIGART Bulletin, 1995.
Knuth, D. E. and Moore, R. W.. An analysis of alpha-beta pruning. AIJ, 1975.
Kober, J. and Peters, J.. Policy search for motor primitives in robotics. ML, 2011.
Kober, J., et al. Reinforcement learning to adjust robot movements to new situations. In RSS, 2010.
Kober, J., et al. Reinforcement learning in robotics: A survey. IJRR, 2013.
Kocsis, L. and Szepesvári, C.. Bandit based Monte-Carlo planning. In ECML, 2006.
Koehler, J.. Planning under resource constraints. In ECAI, 1998.
Koehler, J.. Handling of conditional effects and negative goals in IPP. Technical Report 128, Albert-Ludwigs-Universität Freiburg, 1999.
Koenig, S.. Minimax real-time heuristic search. AIJ, 2001.
Koenig, S. and Simmons, R.. Solving robot navigation problems with initial pose uncertainty using real-time heuristic search. In AIPS, 1998.
Koller, D. and Friedman, N.. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
Kolobov, A. and Weld, D.. ReTrASE: Integrating paradigms for approximate probabilistic planning. In IJCAI, 2009.
Kolobov, A., et al. SixthSense: Fast and reliable recognition of dead ends in MDPs. In AAAI, 2010.
Kolobov, A., et al. Heuristic search for generalized stochastic shortest path MDPs. In ICAPS, 2011.
Kolobov, A., et al. Reverse iterative deepening for finite-horizon MDPs with large branching factors. In ICAPS, 2012.
Kolobov, A., et al. Stochastic shortest path MDPs with dead ends. In ICAPS Wksh. on Heuristics and Search for Domain-independent Planning, 2012.
Konda, V. R. and Borkar, V. S.. Actor-critic-type learning algorithms for Markov decision processes. SIAM Jour. on Control and Optimization, 1999.
Kong, Y. and Fu, Y.. Human action recognition and prediction: A survey. IJCV, 2022.
Konidaris, G., et al. Robot learning from demonstration by constructing skill trees. IJRR, 2012.
Konidaris, G., et al. From skills to symbols: Learning symbolic representations for abstract high-level planning. JAIR, 2018.
Konolige, K., et al. Navigation in hybrid metric-topological maps. In ICRA, 2011.
Korf, R.. Real-time heuristic search. AIJ, 1990.
Korf, R. E.. Depth-first iterative-deepening: An optimal admissible tree search. AIJ, 1985.
Korf, R. E.. Planning as search: A quantitative approach. AIJ, 1987.
Korf, R. E.. Linear-space best-first search. AIJ, 1993.
Kormushev, P., et al. Reinforcement learning in robotics: Applications and real-world challenges. Robotics, 2013.
Koubarakis, M.. From local to global consistency in temporal constraint networks. TCS, 1997.
Krizhevsky, A., et al. ImageNet classification with deep convolutional neural networks. CACM, 2017.
Kroemer, O., et al. A review of robot learning for manipulation: Challenges, representations, and algorithms. JMLR, 2021.
Krüger, V., et al. The meaning of action: A review on action recognition and mapping. Advanced Robotics, 2007.
Kuffner, J. J. and LaValle, S. M.. RRT-Connect: An efficient approach to single-query path planning. In ICRA, 2000.
Kuindersma, S., et al. Optimization-based locomotion planning, estimation, and control design for the Atlas humanoid robot. Autonomous Robots, 2016.
Kuipers, B. and Byun, Y.-T.. A robot exploration and mapping strategy based on a semantic hierarchy of spatial representations. RAS, 1991.
Kuipers, B., et al. Local metrical and global topological maps in the hybrid spatial semantic hierarchy. In ICRA, 2004.
Kulkarni, T. D., et al. Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation. In NeurIPS, 2016.
Kumar, S.. Balancing a cartpole system with reinforcement learning – a tutorial. arXiv:2006.04938, 2020.
Kumar, V. and Kanal, L.. A general branch and bound formulation for understanding and synthesizing and/or tree search procedures. AIJ, Mar. 1983.
Kupferman, O. and Vardi, M. Y.. Synthesizing distributed systems. In IEEE Symposium on Logic in Computer Science, 2001.
Kupferman, O., et al. Open systems in reactive environments: Control and synthesis. In Int. Conf. on Concurrency Theory, 2000.
Kurutach, T., et al. Learning plannable representations with Causal InfoGAN. In NeurIPS, 2018.
Kuter, U. and Nau, D.. Using domain-configurable search control for probabilistic planning. In AAAI, 2005.
Kuter, U., et al. Using classical planners to solve nondeterministic planning problems. In ICAPS, Sept. 2008.
Kuter, U., et al. Task decomposition on abstract states, for planning under nondeterminism. AIJ, 2009.
Kvarnström, J. and Doherty, P.. TALplanner: A temporal logic based forward chaining planner. AMAI, 2001.
Kveton, B., et al. Solving factored MDPs with hybrid state and action variables. JAIR, 2006.
Kwon, M., et al. Reward design with language models. arXiv:2303.00001, 2023.
Laborie, P.. Algorithms for propagating resource constraints in AI planning and scheduling: Existing approaches and new results. AIJ, 2003.
Laborie, P. and Ghallab, M.. Planning with sharable resource constraints. In IJCAI, 1995.
Lagoudakis, M. G. and Parr, R.. Model-free least-squares policy iteration. In NeurIPS, 2001.
Lagoudakis, M. G. and Parr, R.. Reinforcement learning as classification: Leveraging modern classifiers. In ICML, 2003.
Lagriffoul, F. and Andres, B.. Combining task and motion planning: A culprit detection problem. IJRR, 2016.
Lagriffoul, F., et al. Combining task and motion planning is not always a good idea. In RSS Wksh. on Combined Robot Motion Planning and AI Planning for Practical Applications, 2013.
Lagriffoul, F., et al. Efficiently combining task and motion planning using geometric constraints. IJRR, 2014.
Laird, J., et al. Universal Subgoaling and Chunking: The Automatic Generation and Learning of Goal Hierarchies. Springer Science & Business Media, 2012.
Lamanna, L. and Serafini, L.. Action model learning from noisy traces: A probabilistic approach. In ICAPS, 2024.
Lamanna, L., et al. On-line learning of planning domains from sensor data in PAL: Scaling up to large state spaces. In AAAI, 2021.
Lamanna, L., et al. Online learning of action models for PDDL planning. In IJCAI, 2021.
Lamanna, L., et al. Online grounding of symbolic planning domains in unknown environments. In KR, 2022.
Lamanna, L., et al. Learning to act for perceiving in partially unknown environments. In IJCAI, 2023.
Lamanna, L., et al. Planning for learning object properties. In AAAI, 2023.
Lan, M., et al. A modular mission management system for micro aerial vehicles. In IEEE Int. Conf. on Control and Automation, 2018.
Lange, S., et al. Batch reinforcement learning. In Reinforcement Learning: State-of-the-Art. Springer, 2012.
Langley, P.. Learning hierarchical problem networks for knowledge-based planning. In Int. Conf. on Inductive Logic Programming, 2022.
Langley, P. and Choi, D.. Learning recursive control programs from problem solving. JMLR, 2006.
Langley, P., et al. Hierarchical problem networks for knowledge-based planning. In Annual Conf. on Advances in Cognitive Systems, 2021.
Laporte, C. and Arbel, T.. Efficient discriminant viewpoint selection for active Bayesian recognition. IJRR, 2006.
Latombe, J.-C.. Robot Motion Planning. Kluwer, Boston, MA, 1991.
LaValle, S. M.. Planning Algorithms. Cambridge University Press, 2006.
LaValle, S. M. and Kuffner, J. J. Jr. Randomized kinodynamic planning. IJRR, 2001.
Lazaric, A.. Transfer in reinforcement learning: A framework and a survey. In Reinforcement Learning: State-of-the-Art. Springer, 2012.
Lazaridis, A., et al. Deep reinforcement learning: A state-of-the-art walkthrough. JAIR, 2020.
Le Guillou, X., et al. Chronicles for on-line diagnosis of distributed systems. In ECAI, 2008.
Lee, H., et al. RLAIF: Scaling reinforcement learning from human feedback with AI feedback. arXiv:2309.00267, 2023.
Lee, J., et al. Learning quadrupedal locomotion over challenging terrain. Science Robotics, 2020.
Lee, J. B., et al. Temporal network representation learning. arXiv:1904.06449, 2019.
Lee, M. and Anderson, C. W.. Can a reinforcement learning agent practice before it starts learning? In IJCNN, 2017.
Lemai-Chenevier, S. and Ingrand, F.. Interleaving temporal planning and execution in robotics domains. In AAAI, 2004.
León, B., et al. OpenGRASP: A toolkit for robot grasping simulation. In Int. Conf. on Simulation, Modeling, and Programming for Autonomous Robots, 2010.
Lesire, C. and Pommereau, F.. ASPiC: An acting system based on skill Petri net composition. In IROS, 2018.
Lesser, V., et al. Evolution of the GPGP/TÆMS domain-independent coordination framework. JAAMAS, 2004.
Levesque, H., et al. GOLOG: A logic programming language for dynamic domains. Jour. of Logic Programming, 1997.
Levine, S.. Reinforcement learning and control as probabilistic inference: Tutorial and review. arXiv:1805.00909, 2018.
Levine, S. and Koltun, V.. Guided policy search: Deep RL with importance sampled policy gradient. In ICML, 2013.
Levine, S., et al. Offline reinforcement learning: Tutorial, review, and perspectives on open problems. arXiv:2005.01643, 2020.
Levine, S. J. and Williams, B. C.. Concurrent plan recognition and execution for human-robot teams. In ICAPS, Nov. 2014.
Li, B., et al. Language-driven semantic segmentation. arXiv:2201.03546, 2022.
Li, C., et al. Multimodal foundation models: From specialists to general-purpose assistants. arXiv:2309.10020, 2023.
Li, H. X. and Williams, B. C.. Generative planning for hybrid systems based on flow tubes. In ICAPS, 2008.
Li, J., et al. Scalable rail planning and replanning: Winning the 2020 Flatland Challenge. In ICAPS, 2021.
Li, K., et al. Emergent world representations: Exploring a sequence model trained on a synthetic task. arXiv:2210.13382, 2022.
Li, L. and Littman, M. L.. Lazy approximation for solving continuous finite-horizon MDPs. In AAAI, 2005.
Li, M., et al. API-Bank: A comprehensive benchmark for tool-augmented LLMs. arXiv:2304.08244, 2023.
Li, R.. Automating hierarchical task network learning. PhD thesis, Univ. of Maryland, June 2024.
Li, Z., et al. Reinforcement learning for robust parameterized locomotion control of bipedal robots. In ICRA, 2021.
Liang, J., et al. Code as policies: Language model programs for embodied control. In ICRA, 2023.
Liatsos, V. and Richard, B.. Scalability in planning. In ECP, 1999.
Liberman, A. O., et al. Learning first-order symbolic planning representations that are grounded. arXiv:2204.11902, 2022.
Lifschitz, V.. On the semantics of STRIPS. In Georgeff, M. P. and Lansky, A. L., editors, Reasoning about Actions and Plans. Morgan Kaufmann, 1987.
Ligozat, G.. On generalized interval calculi. In AAAI, 1991.
Likhachev, M., et al. Planning for Markov decision processes with sparse stochasticity. In NeurIPS, 2004.
Lillicrap, T. P., et al. Continuous control with deep reinforcement learning. arXiv:1509.02971, 2016.
Lim, M. H., et al. Sparse tree search optimality guarantees in POMDPs with continuous observation spaces. arXiv:1910.04332, 2019.
Lin, K., et al. Text2Motion: From natural language instructions to feasible plans. arXiv:2303.12153, 2023.
Lin, S.. Computer solutions of the traveling salesman problem. Bell System Technical Jour., 1965.
Lipovetzky, N. and Geffner, H.. Best-first width search: Exploration and exploitation in classical planning. In AAAI, 2017.
Little, I. and Thiébaux, S.. Probabilistic planning vs. replanning. In ICAPS Wksh. on the Int. Planning Competition, 2007.
Little, I., et al. Prottle: A probabilistic temporal planner. In AAAI, 2005.
Littman, M. L., et al. Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100). Technical report, Stanford Univ., 2021.
Liu, B., et al. LLM+P: Empowering large language models with optimal planning proficiency. arXiv:2304.11477, 2023.
Liu, H., et al. DARTS: Differentiable architecture search. arXiv:1806.09055, 2018.
Liu, Y. and Koenig, S.. Functional value iteration for decision-theoretic planning with general utility functions. In AAAI, 2006.
Liu, Y., et al. Learning search-space specific heuristics using neural networks. arXiv:2306.04019, 2023.
Lluvia, I., et al. Active mapping and robot exploration: A survey. Sensors, 2021.
Long, D. and Fox, M.. Efficient implementation of the plan graph in STAN. JAIR, 1999.
Long, D. and Fox, M.. Exploiting a graphplan framework in temporal planning. In ICAPS, 2003.
Long, D. and Fox, M.. The 3rd international planning competition: Results and analysis. JAIR, 2003.
Long, D., et al. The AIPS-98 planning competition. AIMag, 2000.
Lotem, A. and Nau, D. S.. New advances in GraphHTN: Identifying independent subproblems in large HTN domains. In AIPS, 2000.
Lotem, A., et al. Using planning graphs for solving HTN problems. In AAAI, 1999.
Lotinac, D. and Jonsson, A.. Constructing hierarchical task models using invariance analysis. In ECAI. IOS Press, 2016.
Lozano-Pérez, T. and Kaelbling, L. P.. A constraint-based method for solving sequential manipulation planning problems. In IROS, 2013.
Lozano-Perez, T. and Wesley, M. A.. An algorithm for planning collision-free paths among polyhedral obstacles. CACM, Oct. 1979.
Luketina, J., et al. A survey of reinforcement learning informed by natural language. arXiv:1906.03926, 2019.
Ly, L. and Tsai, Y.-H. R.. Autonomous exploration, reconstruction, and surveillance of 3D environments aided by deep learning. In ICRA, 2019.
Lynch, K. M. and Park, F. C.. Modern Robotics: Mechanics, Planning, and Control. Cambridge University Press, 2017.
Ma, Y. J., et al. Eureka: Human-level reward design via coding large language models. arXiv:2310.12931, 2023.
MacGlashan, J., et al. Interactive learning from policy-dependent human feedback. ML, 2017.
Madden, M. G. and Howley, T.. Transfer of experience between reinforcement learning environments with progressive difficulty. Artificial Intelligence Review, 2004.
Magnaguagno, M. C., et al. HyperTensioN and total-order forward decomposition optimizations. arXiv:2207.00345, 2022.
Mahadevan, S. and Connell, J.. Automatic programming of behavior-based robots using reinforcement learning. AIJ, 1992.
Maliah, S., et al. Partially observable online contingent planning using landmark heuristics. In ICAPS, 2014.
Malik, J. and Binford, T.. Reasoning in time and space. In IJCAI, 1983.
Mallick, P., et al. Reinforcement learning using expectation maximization based guided policy search for stochastic dynamics. Neurocomputing, 2022.
Mansouri, M. and Pecora, F.. A robot sets a table: A case for hybrid reasoning with different types of knowledge. JETAI, 2016.
Marecki, J., et al. A fast analytical algorithm for MDPs with continuous state spaces. In AAMAS Wksh. on Game Theoretic and Decision Theoretic Agents, 2006.
Marthi, B., et al. Angelic semantics for high-level actions. In ICAPS, 2007.
Marthi, B., et al. Angelic hierarchical planning: Optimal and online algorithms. In ICAPS, 2008.
Marthi, B. M., et al. Concurrent hierarchical reinforcement learning. In AAAI, 2005.
Marzinotto, A., et al. Towards a unified behavior trees framework for robot control. In ICRA, 2014.
Maslej, N., et al. The AI Index annual report. Technical report, Stanford Univ., 2024.
Mason, M. T.. Mechanics of Robotic Manipulation. MIT Press, 2001.
Mason, M. T.. Toward robotic manipulation. ARCRAS, 2018.
Mason, M. T. and Salisbury, J. K. Jr. Robot Hands and the Mechanics of Manipulation. MIT Press, 1985.
Mateas, M. and Stern, A.. A behavior language for story-based believable agents. IEEE Intelligent Systems, 2002.
Mattmüller, R., et al. Pattern database heuristics for fully observable nondeterministic planning. In ICAPS, 2010.
Mausam and A. Kolobov. Planning with Markov Decision Processes: An AI Perspective. Morgan & Claypool, 2012.Google Scholar
Mausam and D. Weld. Concurrent probabilistic temporal planning. In ICAPS, 2005.Google Scholar
Mausam and D. Weld. Probabilistic temporal planning with uncertain durations. In AAAI, 2006.Google Scholar
Mausam and D. Weld. Planning with durative actions in stochastic domains. JAIR, 2008.CrossRefGoogle Scholar
Mausam, et al. A hybridized planner for stochastic domains. In IJCAI, 2007.Google Scholar
McAleer, S., et al. Solving the Rubik’s cube without human knowledge. arXiv:1805.07470, 2018.Google Scholar
McAllester, D. and Rosenblitt, D.. Systematic nonlinear planning. In AAAI, July 1991.Google Scholar
McCarthy, J. and Hayes, P. J.. Some philosophical problems from the standpoint of artificial intelligence. In Machine Intelligence. Edinburgh Univ. Press, 1969.Google Scholar
McCulloch, W. S. and Pitts, W.. A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 1943.CrossRefGoogle Scholar
McDermott, D.. A temporal logic for reasoning about processes and plans. Cognitive Science, 1982.CrossRefGoogle Scholar
McDermott, D.. A reactive plan language. Technical report, Yale Univ., CSD/RR 864, 1991.Google Scholar
McDonald, J., et al. Great power, great responsibility: Recommendations for reducing energy for training language models. arXiv:2205.09646, 2022.Google Scholar
McGann, C., et al. A deliberative architecture for AUV control. In ICRA, 2008.CrossRefGoogle Scholar
McIlraith, S. A. and Son, T. C.. Adapting GOLOG for composition of semantic web services. In KR, 2002.Google Scholar
McMahan, H. B. and Gordon, G. J.. Fast exact planning in Markov decision processes. In ICAPS, 2005.Google Scholar
McMahon, J. and Plaku, E.. Robot motion planning with task specifications via regular languages. Robotica, 2017.Google Scholar
Med, J., et al. Weak and strong reversibility of non-deterministic actions: Universality and uniformity. In ICAPS, 2024.CrossRefGoogle Scholar
Mehta, N., et al. Automatic induction of maxq hierarchies. In NIPS Wksh.: Hierarchical Organization of Behavior, 2007.Google Scholar
Meiri, I.. Faster constraint satisfaction algorithms for temporal reasoning. Tech. report R-151, UC Los Angeles, 1990.Google Scholar
Mendonça, M. R., et al. Graph-based skill acquisition for reinforcement learning. CSUR, 2019.CrossRefGoogle Scholar
Menif, A., et al. SHPE: HTN planning for video games. In Wksh. on Computer Games, 2014.CrossRefGoogle Scholar
Merrill, W., et al. Provable limitations of acquiring meaning from ungrounded form: What will future language models understand? Trans. of the Association for Computational Linguistics, 2021.CrossRefGoogle Scholar
Meuleau, N. and Brafman, R. I.. Hierarchical heuristic forward search in stochastic domains. In IJCAI, 2007.Google Scholar
Meuleau, N., et al. A heuristic search approach to planning with continuous resources in stochastic domains. JAIR, 2009.CrossRefGoogle Scholar
Michel, O.. Cyberbotics Ltd. Webots: Professional mobile robot simulation. Int. Jour. of Advanced Robotic Systems, 2004.
Micheli, A. and Valentini, A.. Synthesis of search heuristics for temporal planning via reinforcement learning. In AAAI, 2021.
Miguel, I., et al. Flexible Graphplan. In ECAI, 2000.
Minguez, J., et al. Motion planning and obstacle avoidance. In Handbook of Robotics. Springer, 2008.
Minton, S., et al. Commitment strategies in planning: A comparative analysis. In IJCAI, 1991.
Minton, S., et al. Total order vs. partial order planning: Factors influencing performance. In KR, 1992.
Mirchandani, S., et al. ELLA: Exploration through learned language abstraction. In NeurIPS, 2021.
Miyamae, A., et al. Natural policy gradient methods with parameter-based exploration for control tasks. In NeurIPS, 2010.
Mnih, V., et al. Human-level control through deep reinforcement learning. Nature, 2015.
Moerland, T. M., et al. A framework for reinforcement learning and planning. arXiv:2006.15009, 2020.
Moerland, T. M., et al. Model-based reinforcement learning: A survey. Foundations and Trends in Machine Learning, 2023.
Moeslund, T. B., et al. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 2006.
Moffitt, M. D.. On the modelling and optimization of preferences in constraint-based temporal reasoning. AIJ, 2011.
Moffitt, M. D. and Pollack, M. E.. Partial constraint satisfaction of disjunctive temporal problems. In FLAIRS, 2005.
Mohanan, M. and Salgoankar, A.. A survey of robotic motion planning in dynamic environments. RAS, 2018.
Molineaux, M., et al. Goal-driven autonomy in a Navy strategy simulation. In AAAI, 2010.
Montemerlo, M., et al. FastSLAM: A factored solution to the simultaneous localization and mapping problem. In AAAI, 2002.
Moore, A. W. and Atkeson, C. G.. Prioritized sweeping: Reinforcement learning with less data and less time. ML, 1993.
Mordoch, A., et al. Collaborative multiagent planning with black-box agents by learning action models. In ICAPS Wksh. on Reliable Data-Driven Planning and Scheduling (RDDPS), 2022.
Mordoch, A., et al. Learning safe numeric action models. In AAAI, 2023.
Morisset, B. and Ghallab, M.. Learning how to combine sensory-motor functions into a robust behavior. AIJ, 2008.
Morris, P., et al. Dynamic control of plans with temporal uncertainty. In IJCAI, 2001.
Morris, P. H.. Dynamic controllability and dispatchability relationships. In Integration of AI and OR Techniques in Constraint Programming. Springer, 2014.
Morris, P. H. and Muscettola, N.. Temporal dynamic controllability revisited. In AAAI, 2005.
Morrison, D. R., et al. Branch-and-bound algorithms: A survey of recent advances in searching, branching, and pruning. Discrete Optimization, 2016.
Mourão, K., et al. Learning STRIPS operators from noisy and incomplete observations. In UAI, 2012.
Mourtzis, D., et al. Simulation in manufacturing: Review and challenges. Procedia CIRP, 2014.
Muise, C., et al. Non-deterministic planning with conditional effects. In ICAPS, 2014.
Muise, C., et al. PRP rebooted: Advancing the state of the art in FOND planning. In AAAI, 2024.
Muise, C. J., et al. Improved nondeterministic planning by exploiting state relevance. In ICAPS, 2012.
Munos, R. and Moore, A. W.. Variable resolution discretization in optimal control. ML, 2002.
Muñoz-Avila, H. and Cox, M. T.. Case-based plan adaptation: An analysis and review. IEEE Intelligent Systems, 2008.
Muñoz-Avila, H., et al. SiN: Integrating case-based reasoning with task decomposition. In IJCAI, 2001.
Muscettola, N., et al. Reformulating temporal plans for efficient execution. In KR, 1998.
Muscettola, N., et al. Remote Agent: To boldly go where no AI system has gone before. AIJ, 1998.
Muscettola, N., et al. IDEA: Planning at the core of autonomous reactive agents. In IWPSS, 2002.
Musliner, D. J., et al. The evolution of CIRCA, a theory-based AI architecture with real-time performance guarantees. In AAAI Spring Symposium: Emotion, Personality, and Social Behavior, 2008.
Myers, K. L.. CPEF: A continuous planning and execution framework. AIMag, 1999.
Nachum, O., et al. Data-efficient hierarchical reinforcement learning. In NeurIPS, 2018.
Najar, A. and Chetouani, M.. Reinforcement learning with human advice: A survey. Frontiers in Robotics and AI, 2021.
Nareyek, A., et al. Constraints and AI planning. IEEE Intelligent Systems, 2005.
Nau, D., et al. GTPyhop: A hierarchical goal+task planner implemented in Python. In HPlan, July 2021.
Nau, D. S., et al. General branch and bound, and its relation to A* and AO*. AIJ, 1984.
Nau, D. S., et al. SHOP: Simple hierarchical ordered planner. In IJCAI, 1999.
Nau, D. S., et al. Total-order planning with partially ordered subtasks. In IJCAI, 2001.
Naveed, H., et al. A comprehensive overview of large language models. arXiv:2307.06435, 2023.
Nebel, B. and Bürckert, H.-J.. Reasoning about temporal relations: A maximal tractable subclass of Allen’s interval algebra. JACM, 1995.
Nebel, B. and Koehler, J.. Plan reuse versus plan generation: A theoretical and empirical analysis. AIJ, July 1995.
Nedunuri, S., et al. SMT-based synthesis of integrated task and motion plans from plan outlines. In ICRA, 2014.
Neu, G. and Szepesvári, C.. Training parsers by inverse reinforcement learning. ML, 2009.
Neu, G. and Szepesvári, C.. Apprenticeship learning using inverse reinforcement learning and gradient methods. arXiv:1206.5264, 2012.
Neufeld, X., et al. Building a planner: A survey of planning systems used in commercial video games. TG, 2017.
Newcombe, R. A. and Davison, A. J.. Live dense reconstruction with a single moving camera. In CVPR, 2010.
Newell, A. and Ernst, G.. The search for generality. In IFIP Congress, 1965.
Newell, A. and Simon, H. A.. GPS, a program that simulates human thought. In Computers and Thought. McGraw-Hill, 1963.
Newton, M. A. H., et al. Learning macro-actions for arbitrary planners and domains. In ICAPS, 2007.
Martínez-Carranza, J. M. and Calway, A.. Unifying planar and point mapping in monocular SLAM. In British Machine Vision Conf., 2010.
Ng, A. and Jordan, M.. PEGASUS: A policy search method for large MDPs and POMDPs. In UAI, 2000.
Ng, A. Y., et al. Policy invariance under reward transformations: Theory and application to reward shaping. In ICML, 1999.
Ng, A. Y., et al. Algorithms for inverse reinforcement learning. In ICML, 2000.
Nguyen, N. and Kambhampati, S.. Reviving partial order planning. In IJCAI, 2001.
Nicolin, A., et al. Agimus: A new framework for mapping manipulation motion plans to sequences of hierarchical task-based controllers. In IEEE Int. Symposium on System Integration, 2020.
Niekum, S.. An integrated system for learning multi-step robotic tasks from unstructured demonstrations. In AAAI Spring Symposium, 2013.
Nieuwenhuis, R., et al. Solving SAT and SAT modulo theories: From an abstract Davis-Putnam-Logemann-Loveland procedure to DPLL(T). JACM, 2006.
Nikolova, E. and Karger, D. R.. Route planning under uncertainty: The Canadian traveller problem. In AAAI, 2008.
Nilsson, M., et al. EfficientIDC: A faster incremental dynamic controllability algorithm. In ICAPS, 2014.
Nilsson, M., et al. Incremental dynamic controllability in cubic worst-case time. In TIME, 2014.
Nilsson, N.. Principles of Artificial Intelligence. Morgan Kaufmann, 1980.
Niu, Y., et al. ATP: Enabling fast LLM serving via attention on top principal keys. arXiv:2403.02352, 2024.
Ong, S. C. W., et al. Planning under uncertainty for robotic tasks with mixed observability. IJRR, 2010.
OpenAI. GPT-4. Technical report, OpenAI, 2023.
Oswald, J., et al. Large language models as planning domain generators. In ICAPS, 2024.
Ouyang, L., et al. Training language models to follow instructions with human feedback. arXiv:2203.02155, 2022.
Özalp, R., et al. A review of deep reinforcement learning algorithms and comparative results on inverted pendulum system. Machine Learning Paradigms, 2020.
Paden, B., et al. A survey of motion planning and control techniques for self-driving urban vehicles. TIV, 2016.
Pallagani, V., et al. Understanding the capabilities of large language models for automated planning. arXiv:2305.16151, 2023.
Pallagani, V., et al. Plansformer tool: Demonstrating generation of symbolic plans using transformers. In IJCAI, 2023.
Pallagani, V., et al. On the prospects of incorporating large language models (LLMs) in automated planning and scheduling (APS). In ICAPS, 2024.
Pan, X., et al. Virtual to real reinforcement learning for autonomous driving. arXiv:1704.03952, 2017.
Parisi, A., et al. TALM: Tool augmented language models. arXiv:2205.12255, 2022.
Parisi, G. I., et al. Continual lifelong learning with neural networks: A review. Neural Networks, 2019.
Parr, R. and Russell, S. J.. Reinforcement learning with hierarchies of machines. In NeurIPS, 1998.
Pateria, S., et al. Hierarchical reinforcement learning: A comprehensive survey. CSUR, 2021.
Patra, S., et al. Deliberative acting, planning and learning with hierarchical operational models. AIJ, 2021.
Patra, S., et al. Using online planning and acting to recover from cyberattacks on software-defined networks. In AAAI, 2021.
Patterson, D., et al. Carbon emissions and large neural network training. arXiv:2104.10350, 2021.
Patterson, D., et al. The carbon footprint of machine learning training will plateau, then shrink. IEEE Computer, 2022.
Paxton, C., et al. CoSTAR: Instructing collaborative robots with behavior trees and vision. In ICRA, 2017.
Pearl, J.. Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley, 1984.
Pecora, F., et al. A constraint-based approach for proactive, context-aware human support. Jour. of Ambient Intelligence and Smart Environments, 2012.
Pednault, E.. Synthesizing plans that contain actions with context-dependent effects. CI, 1988.
Pednault, E. P.. ADL: Exploring the middle ground between STRIPS and the situation calculus. In KR, 1989.
Pellier, D., et al. HDDL 2.1: Towards defining a formalism and a semantics for temporal HTN planning. arXiv:2306.07353, 2023.
Penberthy, J. and Weld, D. S.. Temporal planning with continuous change. In AAAI, 1994.
Penberthy, J. S. and Weld, D.. UCPOP: A sound, complete, partial order planner for ADL. In KR, 1992.
Peng, S., et al. OpenScene: 3D scene understanding with open vocabularies. In CVPR, 2023.
Penna, G. D., et al. UPMurphi: A tool for universal planning on PDDL+ problems. In ICAPS, 2009.
Peot, M. and Smith, D.. Conditional nonlinear planning. In AIPS, 1992.
Pereira, R. F., et al. Iterative depth-first search for FOND planning. In ICAPS, 2022.
Péron, M., et al. Fast-tracking stationary MOMDPs for adaptive management problems. In AAAI, 2017.
Peters, J. and Schaal, S.. Reinforcement learning by reward-weighted regression for operational space control. In ICML, 2007.
Peters, J., et al. Natural actor-critic. In ECML, 2005.
Peters, J., et al. Towards robot skill learning: From simple skills to table tennis. In European Conf. Machine Learning and Knowledge Discovery in Databases, 2013.
Peterson, J. L.. Petri nets. CSUR, 1977.
Petri, C. A.. Communication with automata. PhD thesis, Institut für Instrumentelle Mathematik, Bonn, 1962.
Petrick, R. and Bacchus, F.. Extending the knowledge-based approach to planning with incomplete information and sensing. In ICAPS, 2004.
Pettersson, O.. Execution monitoring in robotics: A survey. RAS, 2005.
Pham, Q.-C., et al. Kinodynamic planning in the configuration space via admissible velocity propagation. In RSS, 2013.
Pineau, J., et al. Policy-contingent abstraction for robust robot control. In UAI, 2002.
Pineau, J., et al. Towards robotic assistants in nursing homes: Challenges and results. RAS, Mar. 2003.
Pistore, M. and Traverso, P.. Planning as model checking for extended goals in nondeterministic domains. In IJCAI, 2001.
Pistore, M. and Traverso, P.. Assumption-based composition and monitoring of web services. In Test and Analysis of Web Services, 2007.
Pistore, M., et al. Symbolic techniques for planning with extended goals in nondeterministic domains. In ECP, 2001.
Pistore, M., et al. Automated composition of web services by planning in asynchronous domains. In ICAPS, 2005.
Pistore, M., et al. A minimalist approach to semantic annotations for web processes compositions. In European Semantic Web Conf., 2006.
Planken, L. R.. Incrementally solving the STP by enforcing partial path consistency. In Wksh. of the UK Planning and Scheduling Special Interest Group, 2008.
Pnueli, A. and Rosner, R.. On the synthesis of a reactive module. In POPL, 1989.
Pnueli, A. and Rosner, R.. On the synthesis of an asynchronous reactive module. In Int. Colloquium on Automata, Languages and Programming, 1989.
Pnueli, A. and Rosner, R.. Distributed reactive systems are hard to synthesize. In Annual Symposium on Foundations of Computer Science, 1990.
Pohl, I.. Heuristic search viewed as path finding in a graph. AIJ, 1970.
Pollack, M. E. and Horty, J. F.. There’s more to life than making plans: Plan management in dynamic, multiagent environments. AIMag, 1999.
Polydoros, A. S. and Nalpantidis, L.. Survey of model-based reinforcement learning: Applications on robotics. JIRS, 2017.
Pommereau, F.. Algebras of Coloured Petri Nets. Lambert Academic Publishing, 2010.
Pommerening, F., et al. From non-negative to general operator cost partitioning. In AAAI, 2015.
Porteous, J., et al. On the extraction, ordering, and usage of landmarks in planning. In ECP, 2001.
Powell, J., et al. Active and interactive discovery of goal selection knowledge. In FLAIRS, 2011.
Prassler, E. and Kosuge, K.. Domestic robotics. In Handbook of Robotics. Springer, 2008.
Prentice, S. and Roy, N.. The belief roadmap: Efficient planning in belief space by factoring the covariance. IJRR, 2009.
Pryor, L. and Collins, G.. Planning for contingency: A decision-based approach. JAIR, 1996.
Pternea, M., et al. The RL/LLM taxonomy tree: Reviewing synergies between reinforcement learning and large language models. JAIR, 2024.
Puterman, M. L.. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, 1994.
Py, F., et al. A systematic agent framework for situated autonomous systems. In AAMAS, 2010.
Pynadath, D. V. and Wellman, M. P.. Probabilistic state-dependent grammars for plan recognition. In UAI, 2000.
Qi, C. R., et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In NeurIPS, 2017.
Qiu, W. and Zhu, H.. Programmatic reinforcement learning without oracles. In ICLR, 2021.
Quartey, B., et al. Exploiting contextual structure to generate useful auxiliary tasks. arXiv:2303.05038, 2023.
Quiniou, R., et al. Application of ILP to cardiac arrhythmia characterization for chronicle recognition. In ILP, 2001.
Rabideau, G., et al. Iterative repair planning for spacecraft operations in the ASPEN system. In i-SAIRAS, 1999.
Radford, A., et al. Learning transferable visual models from natural language supervision. In ICML, 2021.
Raghavan, A. N., et al. Bidirectional online probabilistic planning. In ICAPS, 2012.
Rajan, K. and Py, F.. T-REX: Partitioned inference for AUV mission control. In Further Advances in Unmanned Marine Vehicles, 2012.
Rajan, K. and Saffiotti, A., editors. Special Issue on AI and Robotics. AIJ, 2017.
Rajan, K., et al. Towards deliberative control in marine robotics. In Marine Robot Autonomy, 2012.
Rajvanshi, A., et al. SayNav: Grounding large language models for dynamic planning to navigation in new environments. In ICAPS, 2024.
Ramírez, M. and Geffner, H.. Probabilistic plan recognition using off-the-shelf classical planners. In AAAI, 2010.
Ramírez, M. and Sardina, S.. Directed fixed-point regression-based planning for non-deterministic domains. In ICAPS, 2014.
Ramírez, M., et al. Behavior composition as fully observable non-deterministic planning. In ICAPS, 2013.
Ramon, J., et al. Transfer learning in reinforcement learning problems through partial policy recycling. In ECML, 2007.
Rao, D., et al. Performance of the RDDL planners. In IEEE Int. Conf. of Online Analysis and Computing Science, 2016.
Ravichandar, H., et al. Recent advances in robot learning from demonstration. ARCRAS, 2020.
Reddy, S., et al. SQIL: Imitation learning via reinforcement learning with sparse rewards. arXiv:1905.11108, 2019.
Reif, J. H.. Complexity of the mover’s problem and generalizations. In IEEE Symposium on Foundations of Computer Science, 1979.
Richter, S. and Westphal, M.. The LAMA planner: Guiding cost-based anytime planning with landmarks. JAIR, 2010.
Richter, S., et al. Landmarks revisited. In AAAI, 2008.
Riedmiller, M., et al. Reinforcement learning for robot soccer. Autonomous Robots, 2009.
Rintanen, J.. Constructing conditional plans by a theorem-prover. JAIR, 1999.
Rintanen, J.. An iterative algorithm for synthesizing invariants. In AAAI, 2000.
Rintanen, J.. Backward plan construction for planning as search in belief space. In AIPS, 2002.
Rintanen, J.. Conditional planning in the discrete belief space. In IJCAI, 2005.
Rintanen, J.. Planning as satisfiability: Heuristics. AIJ, 2012.
Rintanen, J.. Madagascar: Scalable planning with SAT. Int. Planning Competition, 2014.
Robert, D., et al. Learning multi-view aggregation in the wild for large-scale 3D semantic segmentation. In CVPR, 2022.
Rodrigues, C., et al. Incremental learning of relational action models in noisy environments. In ILP, 2010.
Rodrigues, C., et al. Incremental learning of relational action rules. In IEEE Int. Conf. on Machine Learning and Applications, 2010.
Rodrigues, C., et al. Active learning of relational action models. In ILP, 2011.
Rodriguez, I. D., et al. Learning first-order representations for planning from black box states: New results. In KR, 2021.
Rodriguez, I. D., et al. Flexible FOND planning with explicit fairness assumptions. JAIR, 2022.
Rodriguez-Moreno, M. D., et al. IPSS: A hybrid approach to planning and scheduling integration. TDKE, 2006.
Röger, G. and Helmert, M.. The more, the merrier: Combining heuristic estimators for satisficing planning. In ICAPS, 2010.
Röger, G., et al. Optimal planning in the presence of conditional effects: Extending LM-Cut with context-splitting. In ECAI, 2014.
Roijers, D. M. and Whiteson, S.. Multi-Objective Decision Making. Morgan & Claypool, 2017.
Ross, S. and Pineau, J.. Model-based Bayesian reinforcement learning in large structured domains. In UAI, 2008.
Ross, S., et al. Online planning algorithms for POMDPs. JAIR, 2008.
Rossetti, N., et al. Learning general policies for planning through GPT models. In ICAPS, 2024.
Rudin, N., et al. Advanced skills by learning locomotion and local navigation end-to-end. arXiv:2209.12827, 2022.
Rummery, G. A. and Niranjan, M.. On-line Q-learning using connectionist systems. Technical report, Cambridge Univ., 1994.
Russell, S.. Learning agents for uncertain environments. In Annual Conf. on Computational Learning Theory, 1998.
Russell, S.. Human Compatible: AI and the Problem of Control. Penguin, 2019.
Russell, S. and Norvig, P.. Artificial Intelligence: A Modern Approach (4th Edition). Pearson, 2021.
Sabbadin, R.. A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments. arXiv:1301.6736, 2013.
Sacerdoti, E.. Planning in a hierarchy of abstraction spaces. AIJ, 1974.
Sacerdoti, E.. The nonlinear nature of plans. In IJCAI, 1975.
Sacerdoti, E.. A Structure for Plans and Behavior. American Elsevier, 1977.
Samadi, M., et al. Learning from multiple heuristics. In AAAI, 2008.
Samadi, M., et al. Using the Web to interactively learn to find objects. In AAAI, 2012.
Samet, H.. Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, 2006.
Samsi, S., et al. From words to watts: Benchmarking the energy costs of large language model inference. In IEEE High Performance Extreme Computing Conference, 2023.
Samuel, A. L.. Some studies in machine learning using the game of checkers: II. Recent progress. IBM Jour. of Research and Development, 1959.
Sandewall, E.. Features and Fluents: The Representation of Knowledge about Dynamical Systems. Oxford Univ. Press, 1994.
Sandewall, E. and Rönnquist, R.. A representation of action structures. In AAAI, 1986.
Sanner, S.. Relational dynamic influence diagram language (RDDL): Language description. Technical report, NICTA, 2010.
Santana, P. H. R. Q. A. and Williams, B. C.. Chance-constrained consistency for probabilistic temporal plan networks. In ICAPS, Nov. 2014.
Sardiña, S., et al. Hierarchical planning in BDI agent programming languages: A formal approach. In AAMAS, May 2006.
Scala, E. and Grastien, A.. Nondeterministic conformant planning using a counterexample-guided incremental compilation to classical planning. In ICAPS, 2021.CrossRefGoogle Scholar
Scala, E., et al. Landmarks for numeric planning problems. In IJCAI, 2017.CrossRefGoogle Scholar
Schaul, T., et al. Universal value function approximators. In ICML, 2015.Google Scholar
Scheck, S., et al. Knowledge compilation for nondeterministic action languages. In ICAPS, 2021.CrossRefGoogle Scholar
Scheide, E., et al. Behavior tree learning for robotic task planning through monte carlo DAG search over a formal grammar. In ICRA, 2021.CrossRefGoogle Scholar
Scherrer, B. and Lesner, B.. On the use of non-stationary policies for stationary infinite-horizon Markov decision processes. In NeurIPS, 2012.Google Scholar
Schick, T., et al. Toolformer: Language models can teach themselves to use tools. arXiv:2302.04761, 2023.Google Scholar
Schreiber, D.. Lilotane: A lifted SATbased approach to hierarchical planning. JAIR, 2021.CrossRefGoogle Scholar
Schulman, J., et al. Trust region policy optimization. In ICML, 2015.Google Scholar
Schulman, J., et al. Proximal policy optimization algorithms. arXiv:1707.06347, 2017.Google Scholar
Schultz, D. G. and Melsa, J. L.. State Functions and Linear Control Systems. McGraw-Hill, 1967.Google Scholar
Schwartz, J. T. and Sharir, M.. On the “piano movers” problem. General techniques for computing topological properties of real algebraic manifolds. Advances in Applied Mathematics, 1983.CrossRefGoogle Scholar
Seipp, J., et al. Saturated cost partitioning for optimal classical planning. JAIR, 2020.CrossRefGoogle Scholar
Serafini, L. and Garcez, A. A.. Logic tensor networks: Deep learning and logical reasoning from data and knowledge. arXiv:1606.04422, 2016.Google Scholar
Serafini, L. and Traverso, P.. Learning abstract planning domains and mappings to real world perceptions. In Int. Conf. of the Italian Association for Artificial Intelligence, 2019.CrossRefGoogle Scholar
Shah, S., et al. Airsim: High-fidelity visual and physical simulation for autonomous vehicles. In Field and Service Robotics, 2018.CrossRefGoogle Scholar
Shani, G., et al. A survey of point-based POMDP solvers. JAAMAS, 2012.CrossRefGoogle Scholar
Shannon, C. E.. Programming a computer for playing chess. Philosophical Magazine and Jour. of Science, 1950.CrossRefGoogle Scholar
Shannon, C. E.. Presentation of a mazesolving machine. In Cybernetics, Trans. of the Eighth Conf., 1951.Google Scholar
Shannon, C. E.. Prediction and entropy of printed english. Bell System Technical Jour., 1951.CrossRefGoogle Scholar
Shaparau, D., et al. Contingent planning with goal preferences. In AAAI, 2006.Google Scholar
Shaparau, D., et al. Fusing procedural and declarative planning goals for nondeterministic domains. In AAAI, 2008.Google Scholar
Sharma, A., et al. Dynamics-aware unsupervised discovery of skills. arXiv:1907.01657, 2019.Google Scholar
Sharma, P., et al. Skill induction and planning with latent language. arXiv:2110.01517, 2021.Google Scholar
Shen, W., et al. Learning domainindependent planning heuristics with hypergraph networks. In ICAPS, 2020.CrossRefGoogle Scholar
Shi, H., et al. Continual learning of large language models: A comprehensive survey. arXiv:2404.16789, 2024.Google Scholar
Shinn, N., et al. Reflexion: An autonomous agent with dynamic memory and selfreflection. arXiv:2303.11366, 2023.Google Scholar
Shivashankar, V., et al. A hierarchical goal-based formalism and algorithm for single-agent planning. In AAMAS, 2012.Google Scholar
Shivashankar, V., et al. The GoDeL planning system: A more perfect union of domain-independent and hierarchical planning. In IJCAI, 2013.Google Scholar
Shoahm, Y. and McDermott, D.. Problems in formal temporal reasoning. AIJ, 1988.CrossRefGoogle Scholar
Shoham, Y.. Temporal logic in AI: Semantical and ontological considerations. AIJ, 1987.CrossRefGoogle Scholar
Shtutland, L., et al. Unavoidable deadends in deterministic partially observable contingent planning. JAAMAS, 2023.CrossRefGoogle Scholar
Siciliano, B. and Khatib, O., editors. The Handbook of Robotics. Springer, 2008.Google Scholar
Silver, D. and Veness, J.. Monte-Carlo planning in large POMDPs. In NeurIPS, 2010.Google Scholar
Silver, D., et al. Learning from demonstration for autonomous navigation in complex unstructured terrain. IJRR, 2010.CrossRefGoogle Scholar
Silver, D., et al. Deterministic policy gradient algorithms. In ICML, 2014.Google Scholar
Silver, D., et al. Mastering the game of go with deep neural networks and tree search. Nature, 2016.CrossRefGoogle Scholar
Silver, D., et al. Mastering the game of go without human knowledge. Nature, 2017.CrossRefGoogle Scholar
Silver, D., et al. A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science, 2018.CrossRefGoogle Scholar
Silver, T., et al. Generalized planning in PDDL domains with pretrained large language models. arXiv:2305.11014, 2023.Google Scholar
Siméon, T., et al. Visibility-based probabilistic roadmaps for motion planning. Advanced Robotics, 2000.CrossRefGoogle Scholar
Siméon, T., et al. Manipulation planning with probabilistic roadmaps. IJRR, 2004.Google Scholar
Simmons, R.. Concurrent planning and execution for autonomous robots. IEEE Control Systems, 1992.Google Scholar
Simmons, R.. Structured control for autonomous robots. TRA, 1994.CrossRefGoogle Scholar
Simmons, R. and Apfelbaum, D.. A task description language for robot control. In IROS, 1998.Google Scholar
Simmons, R. and Davis, R.. Generate, test and debug: Combining associational rules and causal models. In IJCAI, 1987.Google Scholar
Simpkins, C., et al. Towards adaptive programming: Integrating reinforcement learning into a programming language. In ACM Object-Oriented Programming Systems Languages and Applications, 2008.CrossRefGoogle Scholar
Singh, I., et al. ProgPrompt: Program generation for situated robot task planning using large language models. Autonomous Robots, 2023.CrossRefGoogle Scholar
Singh, S. P. and Sutton, R. S.. Reinforcement learning with replacing eligibility traces. ML, 1996.CrossRefGoogle Scholar
Sirin, E., et al. HTN planning for Web service composition using SHOP2. Jour. of Web Semantics, 2004.CrossRefGoogle Scholar
Skinner, B. F.. The Behavior of Organisms: An Experimental Analysis. Appleton, 1938.Google Scholar
Smith, D. E. and Weld, D. S.. Conformant Graphplan. In AAAI, 1998.Google Scholar
Smith, D. E. and Weld, D. S.. Temporal planning with mutual exclusion reasoning. In IJCAI, 1999.Google Scholar
Smith, D. E., et al. Bridging the gap between planning and scheduling. KER, 2000.CrossRefGoogle Scholar
Smith, D. E., et al. The ANML language. In KEPS, 2008.Google Scholar
Smith, L., et al. A walk in the park: Learning to walk in 20 minutes with model-free reinforcement learning. arXiv:2208.07860, 2022.Google Scholar
Smith, S. J. J., et al. Integrating electrical and mechanical design and process planning. In Knowledge Intensive CAD. 1997.CrossRefGoogle Scholar
Smith, S. J. J., et al. Computer bridge: A big win for AI planning. AIMag, 1998.Google Scholar
Smith, T. and Simmons, R.. Heuristic search value iteration for POMDPs. In UAI, 2004.Google Scholar
Sohrabi, S. and McIlraith, S. A.. Preference-based web service composition: A middle ground between execution and search. In Int. Semantic Web Conf., 2010.CrossRefGoogle Scholar
Sohrabi, S., et al. HTN planning with preferences. In IJCAI, 2009.Google Scholar
Sohrabi, S., et al. HTN planning for the composition of stream processing applications. In ICAPS, 2013.CrossRefGoogle Scholar
Somani, A., et al. DESPOT: Online POMDP planning with regularization. In NeurIPS, 2013.Google Scholar
Song, J., et al. Self-refined large language model as automated reward function designer for deep reinforcement learning in robotics. arXiv:2309.06687, 2023.Google Scholar
Sprague, C. I., et al. Improving the modularity of AUV control systems using behaviour trees. In IEEE/OES Autonomous Underwater Vehicle Wksh., 2018.CrossRefGoogle Scholar
Sridharan, M., et al. HiPPo: Hierarchical POMDPs for planning information processing and sensing actions on a robot. In ICAPS, 2008.Google Scholar
Srivastava, B.. Realplan: Decoupling causal and resource reasoning in planning. In AAAI, 2000.Google Scholar
Srivastava, N., et al. Dropout: A simple way to prevent neural networks from overfitting. JMLR, 2014.Google Scholar
Stachniss, C. and Burgard, W.. Exploration with active loop-closing for FastSLAM. In IROS, 2004.Google Scholar
Ståhlberg, S., et al. Learning generalized policies without supervision using GNNs. In KR, 2022.CrossRefGoogle Scholar
Ståhlberg, S., et al. Learning general policies with policy gradient methods. In KR, 2023.CrossRefGoogle Scholar
Stasse, O., et al. TALOS: A new humanoid research platform targeted for industrial applications. In Humanoids, 2017.CrossRefGoogle Scholar
Stedl, J. and Williams, B.. A fast incremental dynamic controllability algorithm. In ICAPS Wksh. on Plan Execution, 2005.Google Scholar
Steinmetz, M., et al. Goal probability analysis in probabilistic planning: Exploring and enhancing the state of the art. JAIR, 2016.CrossRefGoogle Scholar
Stern, R. and Juba, B.. Efficient, safe, and probably approximately complete learning of action models. In IJCAI, 2017.CrossRefGoogle Scholar
Stock, S., et al. Hierarchical hybrid planning in a mobile service robot. In KI, 2015.CrossRefGoogle Scholar
Stock, S., et al. Online task merging with a hierarchical hybrid task planner for mobile service robots. In IROS, 2015.CrossRefGoogle Scholar
Strehl, A. L., et al. Efficient structure learning in factored-state MDPs. In AAAI, 2007.Google Scholar
Styrud, J., et al. Combining planning and learning of behavior trees for robotic assembly. In ICRA, 2022.CrossRefGoogle Scholar
Su, Z., et al. Learning manipulation graphs from demonstrations using multimodal sensory signals. In ICRA, 2018.CrossRefGoogle Scholar
Subramanian, S., et al. ReCLIP: A strong zero-shot baseline for referring expression comprehension. arXiv:2204.05991, 2022.Google Scholar
Sun, Z., et al. Human action recognition from various data modalities: A review. PAMI, 2022.CrossRefGoogle Scholar
Sunberg, Z. and Kochenderfer, M.. Online algorithms for POMDPs with continuous state, action, and observation spaces. In ICAPS, 2018.CrossRefGoogle Scholar
Suomalainen, M., et al. A survey of robot manipulation in contact. RAS, 2022.CrossRefGoogle Scholar
Surmann, H., et al. An autonomous mobile robot with a 3D laser range finder for 3D exploration and digitalization of indoor environments. RAS, 2003.CrossRefGoogle Scholar
Sutton, R. S.. Temporal credit assignment in reinforcement learning. PhD thesis, Univ. of Massachusetts Amherst, 1984.Google Scholar
Sutton, R. S.. Dyna, an integrated architecture for learning, planning, and reacting. SIGART Bulletin, 1991.CrossRefGoogle Scholar
Sutton, R. S. and Barto, A. G.. Reinforcement Learning: An Introduction. MIT Press, 2018.Google Scholar
Sutton, R. S., et al. Policy gradient methods for reinforcement learning with function approximation. In NeurIPS, 1999.Google Scholar
Szepesvári, C. and Smart, W. D.. Interpolation-based Q-learning. In ICML, 2004.CrossRefGoogle Scholar
Szita, I. and Lőrincz, A.. Learning Tetris using the noisy cross-entropy method. Neural Computation, 2006.CrossRefGoogle Scholar
Tadepalli, P., et al. Relational reinforcement learning: An overview. In ICML Wksh. on Relational Reinforcement Learning, 2004.Google Scholar
Taha, H. A.. Integer Programming: Theory, Applications, and Computations. Academic Press, 1975.Google Scholar
Tang, C., et al. GraspGPT: Leveraging semantic knowledge from a large language model for task-oriented grasping. arXiv:2307.13204, 2023.Google Scholar
Tanneberg, D. and Gienger, M.. Learning type-generalized actions for symbolic planning. In IROS, 2023.CrossRefGoogle Scholar
Tarjan, R. E.. Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1972.CrossRefGoogle Scholar
Tate, A.. Generating project networks. In IJCAI, 1977.Google Scholar
Tate, A., et al. O-Plan2: An Architecture for Command, Planning and Control. Morgan-Kaufmann, 1994.Google Scholar
Täubig, H., et al. Medical robotics and computer-integrated surgery. In Handbook of Robotics. Springer, 2008.Google Scholar
Taylor, M. E. and Stone, P.. Transfer learning for reinforcement learning domains: A survey. JMLR, 2009.CrossRefGoogle Scholar
Taylor, M. E., et al. Transfer learning via inter-task mappings for temporal difference learning. JMLR, 2007.CrossRefGoogle Scholar
Teh, Y., et al. Distral: Robust multitask reinforcement learning. In NeurIPS, 2017.Google Scholar
Teichteil-Königsbuch, F.. Fast incremental policy compilation from plans in hybrid probabilistic domains. In ICAPS, 2012.CrossRefGoogle Scholar
Teichteil-Königsbuch, F.. Stochastic safest and shortest path problems. In AAAI, 2012.Google Scholar
Teichteil-Königsbuch, F., et al. RFF: A robust, FF-based MDP planning algorithm for generating policies with low probability of failure. In ICAPS, 2008.Google Scholar
Teichteil-Königsbuch, F., et al. Incremental plan aggregation for generating policies in MDPs. In AAMAS, 2010.Google Scholar
Teichteil-Königsbuch, F., et al. Extending classical planning heuristics to probabilistic planning with dead-ends. In AAAI, 2011.CrossRefGoogle Scholar
Teng, S., et al. Motion planning for autonomous driving: The state of the art and future perspectives. TIV, 2023.CrossRefGoogle Scholar
Tesauro, G.. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 1994.CrossRefGoogle Scholar
Tesauro, G.. Temporal difference learning and TD-Gammon. CACM, 1995.CrossRefGoogle Scholar
Thayer, J. T., et al. Learning inadmissible heuristics during search. In ICAPS, 2011.CrossRefGoogle Scholar
Thompson, N. C., et al. The computational limits of deep learning. arXiv:2007.05558, 2022.Google Scholar
Thorndike, E. L.. Animal Intelligence. Macmillan, 1911.Google Scholar
Thrun, S.. Robotic mapping: A survey. In Lakemeyer, G. and Nebel, B., editors, Exploring Artificial Intelligence in the New Millennium. Morgan Kaufmann, 2002.Google Scholar
Thrun, S.. Stanley: The robot that won the DARPA grand challenge. JFR, 2006.CrossRefGoogle Scholar
Todorov, E., et al. Mujoco: A physics engine for model-based control. In IROS, 2012.CrossRefGoogle Scholar
Tola, D. and Corke, P.. Understanding URDF: A survey based on user experience. arXiv:2302.13442, 2023.Google Scholar
Torrey, L. and Shavlik, J.. Transfer learning. In Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques. IGI Global, 2010.Google Scholar
Toussaint, M.. Logic-geometric programming: An optimization-based approach to combined task and motion planning. In IJCAI, 2015.Google Scholar
Trevizan, F. W., et al. Heuristic search in dual space for constrained stochastic shortest path problems. In ICAPS, 2016.CrossRefGoogle Scholar
Trevizan, F. W., et al. Efficient solutions for stochastic shortest path problems with dead ends. In UAI, 2017.Google Scholar
Trevizan, F. W., et al. Occupation measure heuristics for probabilistic planning. In ICAPS, 2017.CrossRefGoogle Scholar
Triggs, B., et al. Bundle adjustment – a modern synthesis. In Int. Wksh. on Vision Algorithms, 1999.CrossRefGoogle Scholar
Trott, A., et al. Keeping your distance: Solving sparse reward tasks using self-balancing shaped rewards. In NeurIPS, 2019.Google Scholar
Tuffield, P. and Elias, H.. The Shadow robot mimics human actions. Industrial Robot: An Int. Jour., 2003.CrossRefGoogle Scholar
Turing, A.. Computing machinery and intelligence. Mind, 1950.CrossRefGoogle Scholar
UN AI Advisory Board. Governing AI for humanity, interim report, 2023.Google Scholar
Vaandrager, F. W.. Active automata learning: From L* to L#. In FMCAD, 2021.Google Scholar
Valmeekam, K., et al. On the planning abilities of large language models (a critical investigation with a proposed benchmark). arXiv:2302.06706, 2023.Google Scholar
Van de Ven, G. M., et al. Brain-inspired replay for continual learning with artificial neural networks. Nature Communications, 2020.CrossRefGoogle Scholar
van den Briel, M., et al. Loosely coupled formulations for automated planning: An integer programming perspective. JAIR, 2008.CrossRefGoogle Scholar
Van Der Krogt, R. and De Weerdt, M.. Plan repair as an extension of planning. In ICAPS, 2005.Google Scholar
Vardi, M. Y.. An automata-theoretic approach to fair realizability and synthesis. In CAV, 1995.CrossRefGoogle Scholar
Vardi, M. Y.. An automata-theoretic approach to linear temporal logic. In Banff Higher Order Wksh., 1995.CrossRefGoogle Scholar
Vardi, M. Y.. From verification to synthesis. In Int. Conf. on Verified Software: Theories, Tools, Experiments, 2008.Google Scholar
Vásquez, J. W., et al. Enhanced chronicle learning for process supervision. IFAC, 2017.CrossRefGoogle Scholar
Vaswani, A., et al. Attention is all you need. In NeurIPS, 2017.Google Scholar
Vattam, S., et al. Breadth of approaches to goal reasoning: A research survey. In ACS Wksh. on Goal Reasoning, 2013.Google Scholar
Vecerik, M., et al. Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards. Computing Research Repository, 2017.Google Scholar
Velez, J., et al. Planning to perceive: Exploiting mobility for robust object detection. In ICAPS, 2011.CrossRefGoogle Scholar
Veloso, M. and Stone, P.. FLECS: Planning with a flexible commitment strategy. JAIR, 1995.CrossRefGoogle Scholar
Veloso, M. M. and Carbonell, J.. Derivational analogy in PRODIGY: Automating case acquisition, storage and utilization. ML, 1993.CrossRefGoogle Scholar
Veloso, M. M. and Rizzo, P.. Mapping planning actions and partially-ordered plans into execution knowledge. In Wksh. on Integrating Planning, Scheduling and Execution in Dynamic and Uncertain Environments, 1998.Google Scholar
Veloso, M. M., et al. Integrating planning and learning: The PRODIGY architecture. JETAI, 1995.CrossRefGoogle Scholar
Vemprala, S., et al. ChatGPT for robotics: Design principles and model abilities. Microsoft Autonomous Systems and Robotics Research, 2023.CrossRefGoogle Scholar
Vere, S.. Planning in time: Windows and duration for activities and goals. PAMI, 1983.CrossRefGoogle Scholar
Verfaillie, G., et al. How to model planning and scheduling problems using constraint networks on timelines. KER, 2010.CrossRefGoogle Scholar
Verginis, C. K., et al. KDF: Kinodynamic motion planning via geometric sampling-based algorithms and funnel control. TRO, 2023.CrossRefGoogle Scholar
Verma, A., et al. Programmatically interpretable reinforcement learning. In ICML, 2018.Google Scholar
Verma, A., et al. Imitation-projected programmatic reinforcement learning. In NeurIPS, 2019.Google Scholar
Verma, A.. Programmatic reinforcement learning. PhD thesis, Univ. of Texas at Austin, 2021.Google Scholar
Verma, M., et al. Preference proxies: Evaluating large language models in capturing human preferences in human-AI tasks. In ICML Wksh. on The Many Facets of Preference-Based Learning, 2023.Google Scholar
Verma, P., et al. Automatic generation of behavior trees for the execution of robotic manipulation tasks. In IEEE Int. Conf. on Emerging Technologies and Factory Automation, 2021.CrossRefGoogle Scholar
Verma, V., et al. Plan execution interchange language (PLEXIL) for executable plans and command sequences. In i-SAIRAS, 2005.Google Scholar
Verweij, T.. A hierarchically-layered multiplayer bot system for a first-person shooter. Master’s thesis, Vrije Univ. of Amsterdam, 2007.Google Scholar
Vidal, T. and Fargier, H.. Handling contingency in temporal constraint networks: From consistency to controllabilities. JETAI, 1999.CrossRefGoogle Scholar
Vidal, T. and Ghallab, M.. Dealing with uncertain durations in temporal constraints networks dedicated to planning. In ECAI, 1996.Google Scholar
Vilain, M. and Kautz, H.. Constraint propagation algorithms for temporal reasoning. In AAAI, 1986.Google Scholar
Vilain, M., et al. Constraint propagation algorithms for temporal reasoning: A revised report. In Readings in Qualitative Reasoning about Physical Systems. 1989.CrossRefGoogle Scholar
Vinyals, O., et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 2019.CrossRefGoogle Scholar
Vodopivec, T., et al. On Monte Carlo tree search and reinforcement learning. JAIR, 2017.CrossRefGoogle Scholar
Vu, V.-T., et al. Automatic video interpretation: A novel algorithm for temporal scenario recognition. In IJCAI, 2003.CrossRefGoogle Scholar
Waldinger, R.. Achieving several goals simultaneously. In Machine Intelligence, 1977.Google Scholar
Walsh, T., et al. Integrating sample-based planning and model-based reinforcement learning. In AAAI, 2010.CrossRefGoogle Scholar
Wang, C., et al. Large language models for multi-modal human-robot interaction. arXiv:2401.15174, 2024.Google Scholar
Wang, F. Y., et al. A Petri-net coordination model for an intelligent mobile robot. SMC, 1991.CrossRefGoogle Scholar
Wang, G., et al. Voyager: An open-ended embodied agent with large language models. arXiv:2305.16291, 2023.Google Scholar
Wang, L., et al. A survey on large language model based autonomous agents. arXiv:2308.11432, 2023.Google Scholar
Wang, L., et al. Plan-and-solve prompting: Improving zero-shot chain-of-thought reasoning by large language models. arXiv:2305.04091, 2023.Google Scholar
Wang, T., et al. Nervenet: Learning structured policy with graph neural networks. In ICLR, 2018.Google Scholar
Wang, X.. Planning while learning operators. In AAAI, 1996.Google Scholar
Wang, Z., et al. Incremental reinforcement learning with prioritized sweeping for dynamic environments. IEEE/ASME Trans. on Mechatronics, 2019.CrossRefGoogle Scholar
Warfield, I., et al. Adaptation of hierarchical task network plans. In FLAIRS, 2007.Google Scholar
Warnell, G., et al. Deep TAMER: Interactive agent shaping in high-dimensional state spaces. In AAAI, 2018.CrossRefGoogle Scholar
Warren, D. H. D.. Generating conditional plans and programs. In Summer Conf. on Artificial Intelligence and Simulation of Behaviour, 1976.Google Scholar
Watkins, C. and Dayan, P.. Q-learning. ML, 1992.Google Scholar
Watkins, O., et al. Teachable reinforcement learning via advice distillation. In NeurIPS, 2021.Google Scholar
Wei, J., et al. Emergent abilities of large language models. arXiv:2206.07682, 2022.Google Scholar
Wiering, M. and van Otterlo, M.. Reinforcement Learning. Springer, 2012.CrossRefGoogle Scholar
Weld, D.. Recent advances in AI planning. AIMag, 1999.Google Scholar
Weld, D. S.. An introduction to least commitment planning. AIMag, 1994.Google Scholar
Weld, D. S. and Etzioni, O.. The first law of robotics (a call to arms). In AAAI, 1994.Google Scholar
Weld, D. S., et al. Extending Graphplan to handle uncertainty and sensing actions. In AAAI, 1998.Google Scholar
Wells, A. M., et al. Learning feasibility for task and motion planning in tabletop environments. IEEE Robotics and Automation Letters, 2019.CrossRefGoogle Scholar
Werbos, P.. Advanced forecasting methods for global crisis warning and models of intelligence. General System Yearbook, 1977.Google Scholar
White, D. J.. Multi-objective infinite-horizon discounted Markov decision processes. Journal of Mathematical Analysis and Applications, 1982.CrossRefGoogle Scholar
Wilkins, D.. Recovering from execution errors in SIPE. CI, 1985.CrossRefGoogle Scholar
Wilkins, D. and desJardins, M.. A call for knowledge-based planning. AIMag, 2001.Google Scholar
Wilkins, D. E.. Practical Planning: Extending the Classical AI Planning Paradigm. Morgan Kaufmann, 1988.Google Scholar
Wilkins, D. E.. Can AI planners solve practical problems? CI, 1990.CrossRefGoogle Scholar
Wilkins, D. E. and Myers, K. L.. A common knowledge representation for plan generation and reactive execution. Jour. of Logic and Computation, 1995.CrossRefGoogle Scholar
Williams, B. C. and Abramson, M.. Executing reactive, model-based programs through graph-based temporal planning. In IJCAI, 2001.Google Scholar
Williams, B. C. and Nayak, P. P.. A model-based approach to reactive self-configuring systems. In AAAI, 1996.Google Scholar
Williams, R. J.. Simple statistical gradient-following algorithms for connectionist reinforcement learning. ML, 1992.CrossRefGoogle Scholar
Wilson, A., et al. Multi-task reinforcement learning: A hierarchical bayesian approach. In ICML, 2007.CrossRefGoogle Scholar
Wilson, M. and Aha, D. W.. A goal reasoning model for autonomous underwater vehicles. In Proc. Goal Reasoning Wksh., 2021.Google Scholar
Wilt, C. M. and Ruml, W.. When does weighted A* fail? In SOCS, 2012.Google Scholar
Wilt, C. M. and Ruml, W.. Building a heuristic for greedy search. In SOCS, 2015.Google Scholar
Wingate, D. and Seppi, K. D.. Prioritization methods for accelerating MDP solvers. JMLR, 2005.Google Scholar
Wu, Y. and Huang, T. S.. Vision-based gesture recognition: A review. In Gesture-Based Communication in Human-Computer Interaction, 1999.CrossRefGoogle Scholar
Xi, K., et al. Neuro-symbolic learning of lifted action models from visual traces. In ICAPS, 2024.CrossRefGoogle Scholar
Xiao, S., et al. Model-guided synthesis for LTL over finite traces. In Int. Conf. on Verification, Model Checking, and Abstract Interpretation, 2024.CrossRefGoogle Scholar
Xiao, X., et al. Motion planning and control for mobile robot navigation using machine learning: A survey. Autonomous Robots, 2022.CrossRefGoogle Scholar
Xiao, Z., et al. Refining HTN methods via task insertion with preferences. In AAAI, 2020.CrossRefGoogle Scholar
Xie, F., et al. Understanding and improving local exploration for GBFS. In ICAPS, 2015.CrossRefGoogle Scholar
Xie, T., et al. Text2reward: Automated dense reward function generation for reinforcement learning. arXiv:2309.11489, 2023.Google Scholar
Xie, Y., et al. Translating natural language to planning goals with large-language models. arXiv:2302.05128, 2023.Google Scholar
Xiong, H., et al. Deterministic policy gradient: Convergence analysis. In UAI, 2022.Google Scholar
Xu, J. Z. and Laird, J. E.. Instance-based online learning of deterministic relational action models. In AAAI, 2010.CrossRefGoogle Scholar
Xu, J. Z. and Laird, J. E.. Combining learned discrete and continuous action models. In AAAI, 2011.CrossRefGoogle Scholar
Xu, J. Z. and Laird, J. E.. Learning integrated symbolic and continuous action models for continuous domains. In AAAI, 2013.CrossRefGoogle Scholar
Xu, L., et al. Accelerating integrated task and motion planning with neural feasibility checking. arXiv:2203.10568, 2022.Google Scholar
Xu, M., et al. A simple baseline for open-vocabulary semantic segmentation with pre-trained vision-language model. In ECCV, 2022.CrossRefGoogle Scholar
Xu, Y., et al. Discriminative learning of beam-search heuristics for planning. In IJCAI, 2007.Google Scholar
Yang, K., et al. If LLM is the wizard, then code is the wand: A survey on how code empowers large language models to serve as intelligent agents. arXiv:2401.00812, 2024.Google Scholar
Yang, Q.. Formalizing planning knowledge for hierarchical planning. CI, 1990.CrossRefGoogle Scholar
Yang, Q.. Intelligent Planning: A Decomposition and Abstraction Based Approach. Springer, 1997.CrossRefGoogle Scholar
Yang, Q., et al. Learning action models from plan examples using weighted MAX-SAT. AIJ, 2007.CrossRefGoogle Scholar
Yang, Z., et al. Hierarchical deep reinforcement learning for continuous action control. IEEE Trans. on Neural Networks and Learning Systems, 2018.CrossRefGoogle Scholar
Yang, Z., et al. Sequence-based plan feasibility prediction for efficient task and motion planning. arXiv:2211.01576, 2022.Google Scholar
Yao, S., et al. Tree of thoughts: Deliberate problem solving with large language models. arXiv:2305.10601, 2023.Google Scholar
Yao, Y., et al. Intention progression using quantitative summary information. In AAMAS, 2021.Google Scholar
Yi, Z., et al. The dynamic anchoring agent: A probabilistic object anchoring framework for semantic world modeling. In FLAIRS, 2024.CrossRefGoogle Scholar
Yoon, S. and Kambhampati, S.. Towards model-lite planning: A proposal for learning & planning with incomplete domain models. In ICAPS Wksh. on AI Planning and Learning, 2007.Google Scholar
Yoon, S., et al. Learning heuristic functions from relaxed plans. In ICAPS, 2006.Google Scholar
Yoon, S., et al. FF-Replan: A baseline for probabilistic planning. In ICAPS, 2007.Google Scholar
Yoon, S., et al. Probabilistic planning via determinization in hindsight. In AAAI, 2008.Google Scholar
Yoon, S. W., et al. Learning control knowledge for forward search planning. JMLR, 2008.Google Scholar
Yoshida, K. and Wilcox, B.. Space robots and systems. In Handbook of Robotics. Springer, 2008.Google Scholar
Younes, H. and Littman, M.. PPDDL: The probabilistic planning domain definition language. Technical report, Carnegie Mellon Univ., 2004.Google Scholar
Younes, H. and Simmons, R.. On the role of ground actions in refinement planning. In AIPS, 2002.Google Scholar
Younes, H. and Simmons, R.. VHPOP: Versatile heuristic partial order planner. JAIR, 2003.CrossRefGoogle Scholar
Younes, H. L. and Simmons, R. G.. Solving generalized semi-Markov decision processes using continuous phase-type distributions. In AAAI, 2004.Google Scholar
Yurtsever, E., et al. A survey of autonomous driving: Common practices and emerging technologies. IEEE Access, 2020.CrossRefGoogle Scholar
Zaidins, P., et al. Implicit dependency detection for HTN plan repair. In HPlan, 2023.Google Scholar
Zambaldi, V., et al. Relational deep reinforcement learning. arXiv:1806.01830, 2018.Google Scholar
Zela, A., et al. Understanding and robustifying differentiable architecture search. arXiv:1909.09656, 2019.Google Scholar
Zhang, C., et al. Large language models for human-robot interaction: A review. Biomimetic Intelligence and Robotics, 2023.CrossRefGoogle Scholar
Zhang, H., et al. Building cooperative embodied agents modularly with large language models. arXiv:2307.02485, 2023.Google Scholar
Zhang, J., et al. Vision-language models for vision tasks: A survey. arXiv:2304.00685, 2024.Google Scholar
Zhang, R., et al. Leveraging human guidance for deep reinforcement learning tasks. arXiv:1909.09906, 2019.Google Scholar
Zhang, W. and Dietterich, T. G.. A reinforcement learning approach to job-shop scheduling. In IJCAI, 1995.Google Scholar
Zhang, Y., et al. Efficient reinforcement learning from demonstration via Bayesian network-based knowledge extraction. Computational Intelligence and Neuroscience, 2021.CrossRefGoogle Scholar
Zhao, P., et al. An in-depth survey of large language model-based artificial intelligence agents. arXiv:2309.14365, 2023.Google Scholar
Zhao, W., et al. Sim-to-real transfer in deep reinforcement learning for robotics: a survey. In IEEE Symposium Series on Computational Intelligence, 2020.CrossRefGoogle Scholar
Zhao, W. X., et al. A survey of large language models. arXiv:2303.18223, 2023.Google Scholar
Zhu, S. and Giacomo, G. D.. Synthesis of maximally permissive strategies for LTLf specifications. In IJCAI, 2022.CrossRefGoogle Scholar
Zhu, Z., et al. Transfer learning in deep reinforcement learning: A survey. PAMI, 2023.CrossRefGoogle Scholar
Zhuo, H. H., et al. Learning complex action models with quantifiers and logical implications. AIJ, 2010.CrossRefGoogle Scholar
Zhuo, H. H., et al. Refining incomplete planning domain models through plan traces. In IJCAI, 2013.Google Scholar
Zhuo, H. H., et al. Learning hierarchical task network domains from partially observed plan traces. AIJ, 2014.CrossRefGoogle Scholar
Ziebart, B. D., et al. Modeling interaction via the principle of maximum causal entropy. In ICML, 2010.Google Scholar
Zimmerman, T. and Kambhampati, S.. Learning-assisted automated planning: Looking back, taking stock, going forward. AIMag, 2003.Google Scholar
Ziparo, V. A., et al. Petri net plans. JAAMAS, 2011.Google Scholar

  • References
  • Malik Ghallab, LAAS-CNRS, Toulouse, Dana Nau, University of Maryland, College Park, Paolo Traverso, Fondazione Bruno Kessler, Trento, Italy
  • Foreword by Michela Milano, Università degli Studi, Bologna, Italy
  • Book: Acting, Planning, and Learning
  • Online publication: 19 May 2025
  • Chapter DOI: https://doi.org/10.1017/9781009579346.040