
Reinforcement learning-based radar-evasive path planning: a comparative analysis

Published online by Cambridge University Press: 26 October 2021

R.U. Hameed, A. Maqsood, A.J. Hashmi, M.T. Saeed and R. Riaz
Affiliation: Research Centre for Modeling & Simulation, National University of Sciences & Technology, Islamabad, Pakistan

Abstract

This paper discusses the use of deep reinforcement learning algorithms to obtain optimal paths for an aircraft to avoid or minimise radar detection and tracking. A modular approach is adopted to formulate the problem, comprising an aircraft kinematics model, an aircraft radar cross-section model and a radar tracking model. A virtual environment is designed for single- and multiple-radar cases, and optimal trajectories are generated through deep reinforcement learning. Specifically, three algorithms, namely deep deterministic policy gradient, trust region policy optimisation and proximal policy optimisation, are used to find optimal paths for five test cases, and their performance is compared on the basis of six indicators. The investigation demonstrates the suitability of these reinforcement learning algorithms for optimal path planning, with proximal policy optimisation generally producing the best paths.
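To make the modular formulation concrete, the sketch below sets up a toy single-radar environment in the spirit described above: planar aircraft kinematics, a placeholder aspect-dependent radar cross-section (RCS) and a range-based detection term combined into a reward that a reinforcement learning algorithm such as proximal policy optimisation could maximise. This is a minimal sketch under stated assumptions: all class and parameter names, the RCS lookup and the detection heuristic are illustrative, not the paper's actual models.

```python
# Minimal sketch of the modular formulation: 2D aircraft kinematics,
# a placeholder RCS model and a radar-range detection term folded into
# a reward. All values and model forms here are illustrative assumptions.
import numpy as np

class RadarEvasionEnv:
    """Toy single-radar environment; the agent commands heading rate."""

    def __init__(self, radar_pos=(5000.0, 5000.0), goal=(10000.0, 10000.0)):
        self.radar_pos = np.asarray(radar_pos)
        self.goal = np.asarray(goal)
        self.v = 200.0        # constant airspeed [m/s] (assumed)
        self.dt = 1.0         # integration step [s]
        self.reset()

    def reset(self):
        self.pos = np.zeros(2)     # aircraft position [m]
        self.psi = 0.0             # heading angle [rad]
        return self._obs()

    def _obs(self):
        return np.concatenate([self.pos, [self.psi]])

    def _rcs(self, aspect_angle):
        # Placeholder aspect-dependent RCS [m^2]; a real study would use
        # measured or simulated RCS tables for the airframe.
        return 1.0 + 9.0 * np.abs(np.sin(aspect_angle))

    def _detection_risk(self):
        # Heuristic detection term: risk ~ sigma / R^4, following the
        # inverse-fourth-power range dependence of the radar equation.
        rel = self.radar_pos - self.pos
        rng = np.linalg.norm(rel) + 1e-6
        aspect = np.arctan2(rel[1], rel[0]) - self.psi
        return self._rcs(aspect) / rng**4

    def step(self, turn_rate):
        # Simple planar kinematics: constant speed, bounded turn rate.
        self.psi += np.clip(turn_rate, -0.1, 0.1) * self.dt
        self.pos += self.v * self.dt * np.array([np.cos(self.psi),
                                                 np.sin(self.psi)])
        dist_to_goal = np.linalg.norm(self.goal - self.pos)
        # Reward trades progress to the goal against detection risk;
        # the weights are arbitrary scalings chosen for this toy model.
        reward = -1e-3 * dist_to_goal - 1e14 * self._detection_risk()
        done = dist_to_goal < 200.0
        return self._obs(), reward, done

# Random-policy rollout, showing the interaction loop that a deep RL
# algorithm (DDPG, TRPO or PPO) would optimise over.
env = RadarEvasionEnv()
obs = env.reset()
for _ in range(50):
    obs, reward, done = env.step(np.random.uniform(-0.1, 0.1))
    if done:
        break
```

In a full implementation, the random rollout at the end would be replaced by a policy-gradient training loop, for instance stable-baselines3's PPO applied to a Gymnasium-wrapped version of this environment.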

Type
Research Article
Copyright
© The Author(s), 2021. Published by Cambridge University Press on behalf of the Royal Aeronautical Society

Supplementary material: Hameed et al. supplementary material (File, 23.8 KB)