1. Introduction
Multi-robot path planning (MRPP) (or alternatively, multi-agent path planning, MAPF) is a complex combinatorial optimization problem that plans collision-free paths for multi-robot systems. MRPP is important for many applications, such as warehouse automation, traffic management, and multi-robot autonomous exploration [Reference Li, Chen, Harabor, Stuckey and Koenig1, Reference Bi, Zhang, Wen, Pan, Zhang, Wang and Yuan2]. In comparison to single-robot path planning, MRPP exhibits the following characteristics: 1) high dimensionality. MRPP involves finding feasible paths for multiple robots simultaneously, leading to a higher dimensional problem; 2) NP-completeness. It has been proven to be an NP-complete problem [Reference Yu and LaValle3, Reference Huang, Li, Koenig and Dilkina4]; and 3) temporal and spatial entanglement. In MRPP, the coordination and synchronization of multiple robot movements in time and space must be considered simultaneously to avoid collisions and deadlocks.
Comprehensive research has been carried out on MRPP due to its numerous applications and challenges. Centralized methods have garnered substantial research attention due to their ability to guarantee high-quality solutions. Among these, search-based methods constitute the mainstream, wherein A* [Reference Ryan5], independence detection (ID) framework [Reference Standley6, Reference Standley and Korf7], M* [Reference Wagner and Choset8, Reference Ferner, Wagner and Choset9], and conflict-based search (CBS) [Reference Sharon, Stern, Felner and Sturtevant10] are recognized for their ability to ensure both optimality and completeness of solutions. CBS is a method based on increasing cost tree search [Reference Sharon, Stern, Goldenberg and Felner11], which has garnered significant attention from the research community due to its rapid search capabilities. There are many improvements based on CBS, such as improved CBS (ICBS) [Reference Boyarski, Felner, Stern, Sharon, Betzalel, Tolpin and Shimony12] and ICBS-h [Reference Felner, Li, Boyarski, Ma, Choren, Satish Kumar and Koenig13]. While these methods ensure optimality, they are often associated with lower efficiency. To expedite the search process, researchers have proposed bounded suboptimal search methods such as Enhanced Conflict-Based Search (ECBS) [Reference Barer, Sharon, Stern and Felner14] and Explicit Enhanced Conflict-Based Search (EECBS) [Reference Li, Ruml and Koenig15]. These methods have demonstrated improved search speeds while still providing reasonably good solutions. In addition to the above methods, another notable method in the search-based method is the use of Rubik Tables [Reference Szegedy and Yu16, Reference Guo and Yu17]. While suboptimal, it delivers $1.x$ guarantees on high-density grid environments.
Centralized methods rely on globally available information from all robots, but they may encounter challenges when applied to dynamic environments. Recognizing these drawbacks has spurred the development of distributed methods. Among these, velocity obstacles (VO) [Reference Fiorini and Shiller18], reciprocal velocity obstacles (RVO) [Reference Van den, Jur and Manocha19], and optimal reciprocal collision avoidance (ORCA) [Reference van den Berg, Guy, Lin and Manocha20] are prominent methods focusing on real-time collision avoidance for multiple robots in dynamic environments by interaction and perception. There are several improved variants and extensions of these methods, including non-holonomic ORCA [Reference Alonso-Mora, Breitenmose, Rufli, Beardsley and Siegwart21] and hybrid RVO [Reference Snape, van den Berg, Guy and Manocha22].
With the growing popularity of machine learning methods, researchers are also utilizing deep learning and reinforcement learning to address MRPP problems. For example, Yang Yang et al. have explored the application of deep Q-networks (DQN) to solve MRPP challenges [Reference Yang, Juntao and Lingling23]. Qingbiao Li et al. combine convolutional neural network and graph neural network for MRPP problem [Reference Li, Gama, Ribeiro and Porok24]. In addition, researchers have employed deep reinforcement learning methods to solve the MRPP problem [Reference Han, Chen, Wang, Zhang, Gao, Hao and Pan25, Reference Long, Fan, Liao, Liu, Zhang and Pan26].
In this study, we propose a novel regional heuristic framework over ECBS for MRPP, namely RH-ECBS. Before solving the MRPP problem, the map undergoes a regional partition. The objective of this region division is to alleviate the robot density within each region, achieving a more uniform distribution of robots. This approach significantly reduces the incidence of conflicts, enhancing the overall efficiency of the planning process. To summarize, the main contributions and novelties of this work are as follows:
-
1. A novel regional heuristic planning framework is proposed. The region of each robot is divided before planning so that the number of conflicts and deadlocks encountered during planning can be decreased.
-
2. In the planning stage, the number of robots in each subregion is considered when performing the A* search. Thus, the robot is guided to plan in the region with a small number of robots.
-
3. Formally proving the nonzero probability of obtaining a solution using the proposed method. Besides, the upper bound of the solution in a special case is given.
-
4. The comparative experiments were conducted with different numbers of robots, comparing RH-ECBS against the ECBS algorithm. The experimental results demonstrate that RH-ECBS outperforms ECBS in terms of efficiency while achieving comparable solution quality.
2. Problem formulation
The environment can be presented by the graph structure $G=\{V, E\}$ , where $V$ represents vertices and $E$ represents edge. Multi-robot systems are denoted by $\mathcal{R}=\{R, S, F\}$ , where $R=\{R_i\}$ represents the set of robots, $S=\{S_i\}$ represents the set of starting configurations and $F=\{F_i\}$ represents the set of final configurations. The solution of MRPP is defined as $\mathcal{P}=\{P_i\}$ , $P_i$ represents the path of robot $i$ .
In the process of multi-robot planning, time is discretized into equal time steps. During each time step, a robot can only move from the current vertex to the surrounding 4-connect or 8-connect vertices, or remain waiting at the current vertex. Additionally, two types of conflicts can occur during the planning process: vertex conflict and edge conflict. Fig 1 shows the two types of conflicts.
According to the above definition, the MRPP problem is defined as follows. Given the environment map $G=\{V, E\}$ , the multi-robot system $\mathcal{R}=\{R, S, F\}$ . The planner should find a path from $S_i$ to $F_i$ for each robot $R_i$ in $R$ while there should be no conflict between any two paths $P_i$ and $P_j$ in the path set $\mathcal{P}$ .
3. Method
The system consists of the following steps. First, the map is built as a two-layer map that consists of the high-resolution and low-resolution maps. The start and goal of the robot need to be mapped on the low-resolution map from the high-resolution map. Then, utilizing individual robot path planning to partition regions for each robot in a low-resolution map. The region needs to be mapped back to the high-resolution map. Finally, ECBS is used to plan the path for the multi-robot system in the high-resolution map. When in the low-level search of ECBS, the probability of the search method expanding to the respective regions of each robot is greater. And the probability is relative to the robot density. The diagram of the method is shown in Fig 2. The yellow parts are the novel partition. The detailed method is presented in the rest of this section.
3.1. Two-layer map and region division
In order to segment the grid map into regions, the original high-resolution grid map was downsampled to generate a low-resolution map. At the same time, the position information of the robot is mapped to the low-resolution map. The construction of the high-low-resolution double-layer map and the mapping relationship of the robot position are shown in Fig 3, where the left side is the original high-resolution map and the right side is the low-resolution map after downsampling.
Formula (1) describes the mapping relationship of the robot position coordinates between the high- and low-resolution maps,
where $\lfloor \bullet \rfloor$ represents rounding down, $(x_{high},y_{high})$ represents the coordinates of the robot under the high-resolution map, $(x_{low},y_{low})$ represents the coordinates of the robot under the low-resolution map, $(x_{o},y_{o})$ denotes the coordinates of the origin of the high-resolution map, $m$ represents the ratio of the resolution in the x-direction between the high-resolution map and the low-resolution map, and $n$ represents the ratio of the resolution in the y-direction between the high-resolution map and the low-resolution map. After building a two-layer map and mapping the robot position from the high-resolution map to the low-resolution one, the regions are segmented in the low-resolution map.
The novel priority cost map A* (PCM-A*) is used as the region partition method. The robot may avoid collisions with another robot in the same low-resolution grid as each low-resolution grid comprises multiple high-resolution grids. Consequently, when PCM-A* is employed to partition regions, it does not need to impede the robot with lower priority passing through the path of the robot with higher priority. However, to minimize conflicts among the robots, it is essential to decrease the overlap between the paths of each robot. Therefore, after a high-priority robot has a path, it is necessary to increase the cost of the path. This operation causes other robots to expand their paths to nodes with fewer occupancies during path search, thereby reducing overlap between regions. Lines 1 to line 7 of Algorithm 1 shows the flow. The input to the algorithm is the robot set $R$ and a low-resolution cost map $C_m$ . Besides, the initial value of the $C_m$ is 10. The number of robots in the robot set is $R_m$ .
Take Fig 4 as an example, to plan an optimal path from $S_1$ to $G_1$ for robot A, and the result of the planning is assumed to be the yellow grid. Next, the optimal path from $S_2$ to $G_2$ is planned for robot B. If the traditional A* is used, the planning result may be the green path in Fig 4(b); thus, there is a possibility of conflict between robot A and robot B. Whereas when PCM-A* is used to plan the path for robot B, the cost of the yellow grid has been added. In this case, the optimal path for robot B is the grid through which blue or green line segments pass. It is assumed that it takes the same time for the robot to pass through each grid. In this case, no conflict will occur when using PCM-A *, but conflicts will occur when using A*.
After dividing the region for each robot, there may be no solutions if each robot is only allowed to perform low-level path planning within its respective region. Therefore, a probabilistic heuristic method is designed to solve the problem, where each robot has a higher probability of expanding within its designated region while still having the possibility to expand into other regions. The method can ensure completeness, which is analyzed in the section completeness analysis and the maximum of solution. The probability is calculated by the formula (2). The physical meaning of formula (2) is that the more times a region is occupied by robots, the probability that a path node expands to this region when searching is lower.
where $p$ is the output of Algorithm 1, and $w$ is a weight used to calculate the probability $Dp$ . If $w$ is greater, it implies that during the path search process, the probability of path nodes only expanding to the partitioned areas is higher. In scenarios where there is a higher number of robots, this could potentially lead to an inability to find a path solution that meets the constraint conditions in a low-level search. When $w$ is smaller, the region division is meaningless. The algorithm performs better in terms of time consumption when $w$ is between 0.05 and 0.2 during the experiment.
3.2. Two-layer division and probability heuristic conflict search framework
ECBS is currently one of the most popular and high-performing algorithms for efficiently handling conflicts among multiple robots. It achieves this by generating bounded-suboptimal paths with fewer collisions with the paths of the other robots on the low level and expanding bounded-suboptimal CT nodes that contain fewer collisions on the high level [Reference Chan, Li, Gange, Harabor, Stuckey and Koenig27]. In the low-level path planning phase, the A* method is used to find the path for each robot on a known map, which satisfies constraints and avoids collisions with known obstacles. In the high-level conflict detection and constraint generation phase, the paths of all robots are considered and checked for conflicts. For existing conflicts, constraints are added to resolve the conflicts between robots. The low-level path planning is then called until a nonconflicting solution is found. To further improve the algorithm, probabilistic heuristic search is introduced in low-level path planning to reduce the conflicts and improve efficiency when planning the path, which is called Ph-A*. The probability is obtained by the PCM-A*. The Ph-A* algorithm is described as Algorithm 2. The function $CheckConstraints()$ in Line 8 checks whether the node satisfies the constraints of the high-level search build and whether it is occupied by obstacles.
3.3. Analysis of the maximum value of the solution and the probability of obtaining the solution
Theorem 1. If a solution exists, the probability of RH-CBS obtaining a solution is nonzero.
Proof. Consider the path nodes for the final solution of agent $i$ denoted as $N_1, N_2, \ldots, N_m$ . Each node in the high-resolution map is subject to expansion with a specific probability, denoted as $p_1, p_2, \ldots, p_m$ . Thus, the probability of adding node $N_m$ to the $OpenList$ of A* can be expressed as follows:
Since the $p_i \gt 0$ , the $P(N_m) \gt 0$ . In addition, the probability of the method entering the loop is as follows:
The $P_i$ is the probability of one of the paths with conflicts. And $0\lt P_i \lt 1$ , so when the $n \rightarrow \infty$ , the $P_{loop} \rightarrow 0$ . Therefore, RH-ECBS does not get stuck in a loop when solving MAPF problems.
Therefore, if a solution exists, the probability of RH-CBS obtaining a solution is nonzero.
Theorem 2. When each agent only finds a path in its region, traditional A* is used for regional division. The cost of the solution returned by the RH-ECBS is at most $2m/n \cdot C_{min}$ , and the $m$ and $n$ represent the number of the grid in high-resolution and low-resolution maps, respectively. $C_{min}$ represents the cost of the optimal solution.
Proof. The $L_i$ represents the cost of the solution returned by RH-ECBS and the $l_i$ represents the cost of each agent in the optimal solution. Besides, the relationship between $L_i$ and $l_i$ is that $L_i = k_i \cdot l_i$ , where $k_i$ is an indeterminate constant. The cost can be represented as follows:
Next, the $k_{max}$ can be calculated. Assuming that each agent only finds a path in its region. Because the region is returned by A*, the optimal path of each agent must be in the region. In addition, the worst path is to pass through all the grids in the region. Therefore, the $k_{max} = 2m/n$ . So, theorem 2 is proved.
4. Experiments and discussions
In order to verify the effectiveness of the proposed method, we compared RH-ECBS with the open-source method ECBS [Reference Barer, Sharon, Stern and Felner14, Reference Li, Ran and Xie28]. All experiments are performed on an Intel(R) Core(TM) i7-11800H CPU with 2.3 GHz and 16 G of RAM. The experiment scenarios are set as 24 $\times$ 18 resolution maps with no obstacles and random obstacles, respectively.
In order to facilitate a comprehensive comparison, each experimental scenario involves the configuration of four distinct low-resolution maps: 2 $\times$ 2, 4 $\times$ 3, 6 $\times$ 6, and 12 $\times$ 9. Moreover, varying numbers of robots are designated for each resolution namely, 118, 142, 166, 190, and 213 – representing 25%, 30%, 35%, 40%, and 45% of the map size, respectively. There are five different configurations for each number of robots. Given that the proposed method incorporates probability and lacks consistency, each experiment is conducted 20 times. Subsequently, the 10 trials with the smallest discrepancy in solution cost between RH-ECBS and ECBS are chosen. Finally, the average of these 10 trials is computed for comparison with the results obtained using ECBS.
Four distinct low-resolution maps are configured to assess the influence of map resolution on planning outcomes. Specifically, the low-resolution maps are defined as 2 $\times$ 2, 4 $\times$ 3, 6 $\times$ 6, and 12 $\times$ 9, respectively. Comparative experiments are conducted with varying numbers of robots. The average experimental results are presented in Table II and Fig 5. Comparison with ECBS indicates that RH-ECBS exhibits superior efficiency across different robot quantities. Furthermore, the experimental outcomes for the 6 $\times$ 6 low-resolution map are detailed in Table I to more effectively illustrate performance improvements with varying robot numbers.
The results show that compared with ECBS, the number of conflicts solved by the proposed method is reduced by 34.08%, the solving time is reduced by 31.89%, and the cost is only 0.22% worse. It is noted that the percentage reduction refers to the average of the percentage reduction of experiments at each configuration.
It is noteworthy that we also conducted experiments on a low-resolution map of 24 $\times$ 12; however, obtaining a solution within a short timeframe proved unfeasible; hence, it is excluded from the table. Analysis of the results reveals that, with an increase in the resolution of low-resolution maps, the cost of the proposed algorithm’s solution also increases. This degradation is attributed to the heightened randomness of RH-ECBS as the resolution increases. However, the significance of regional division diminishes when the low-resolution map’s resolution is small, explaining the observed minor efficiency improvement in the 2 $\times$ 2 low-resolution map. Taking into account the solution cost and solving time, efficiency sees more improvement with only a slight increase in cost when the low-resolution map has a resolution of 6 $\times$ 6 or 4 $\times$ 3.
Furthermore, five sets of tests with different numbers of robots are conducted in an environment with obstacles in a low-resolution map of 4 $\times$ 3. The map with random obstacles is shown in Fig 6a. The comparison experiment is set up as before, and the results of experiments are presented in Table IV. The results show that compared with ECBS, the solution time of the proposed method is reduced by more than 50% in the environment with obstacles.
In addition, tests are conducted in a warehousing environment, as shown in Fig 6b. In this environment, path planning was tested for 72 robots, 90 robots, and 108 robots, with five different start-target points selected for each quantity. The average results are shown in Table III. It is worth noting that the efficiency improvement refers to the average improvement under each test condition. Furthermore, the low-resolution map has a resolution of 4 $\times$ 3. The results show that compared with ECBS in a warehousing environment, the RH-ECBS algorithm obtains paths more quickly.
5. Conclusion
This paper introduces a MRPP framework incorporating regional heuristics. This method improves the efficiency of solving MRPP problems by reducing the number of conflicts during planning. Besides, compared with the popular ECBS method, the solving time is decreased by more than 30%. Further proving the nonzero probability of obtaining a solution using this method and determining the maximum cost of the solution in certain cases. However, there are still areas for further improvement in the proposed method. For example, although this method improves efficiency, the path cost increases, so it needs to be further optimized in the future. At the same time, the region division framework is considered to be added to other pathfinding methods based on graph search, such as EECBS [Reference Li, Ruml and Koenig15], to further verify the improvement effect of this method.
Author contributions
All authors contributed to the study. Zhangchao Pan was involved in the conceptualization, methodology, software, validation, data curation, formal analysis, investigation, visualization, and writing – original draft preparation; Xuebo Zhang contributed to the conceptualization, writing – review and editing, funding acquisition, resources, project administration, and supervision; Runhua Wang assisted in the methodology, validation; Qingchen Bi was involved in writing – review and editing; and Jingjin Yu contributed to writing – review and editing.
Financial support
This work was supported in part by the National Natural Science Foundation of China under Grant 62293513/62293510, 62303247, and in part by the Natural Science Foundation of Tianjin under Grant 22JCZDJC00810, and in part by the China Postdoctoral Science Foundation under Grant 2020M670628, and in part by the Fundamental Research Funds for the China Central Universities of Nankai University.
Competing interests
The authors declare no competing interests exist.
Ethical approval
None.