
Robustifying a reinforcement learning agent-based bionic reflex controller through an adaptive sliding mode control

Published online by Cambridge University Press: 08 November 2024

Hirakjyoti Basumatary*
Affiliation:
Biomimetic Robotics and Artificial Intelligence Laboratory (BRAIL), Mechanical Engineering Department, Indian Institute of Technology, Guwahati, India
Daksh Adhar
Affiliation:
Biomimetic Robotics and Artificial Intelligence Laboratory (BRAIL), Mechanical Engineering Department, Indian Institute of Technology, Guwahati, India
Shyamanta M. Hazarika
Affiliation:
Biomimetic Robotics and Artificial Intelligence Laboratory (BRAIL), Mechanical Engineering Department, Indian Institute of Technology, Guwahati, India
*Corresponding author: Hirakjyoti Basumatary; Email: 23hirak@gmail.com

Abstract

Maintaining a stable grasp on an object is a central challenge in robotic manipulation and upper-limb prosthetics. External perturbations frequently disrupt grasp stability and cause the object to slip. Moreover, if the grasping force applied while controlling slip is not optimal, the object may deform. This study investigates the robustification of a reinforcement learning (RL) policy for intelligent bionic reflex control, i.e., the prevention of slip and deformation of grasped objects. RL-derived policies are prone to failure in environments whose dynamics vary. To mitigate this vulnerability, we propose incorporating an adaptive sliding mode controller into a pre-trained RL policy. By exploiting the invariance property of the sliding mode algorithm in the presence of uncertainties, our approach strengthens the robustness of RL policies against diverse and dynamic variations. Numerical simulations substantiate the efficacy of our approach in robustifying RL policies trained in simulated environments.
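
The idea of adding a sliding mode term to a fixed policy can be illustrated with a toy sketch. It is not the controller developed in the article: it assumes a scalar plant `x_dot = u + d(t)`, a hand-written feedback law `nominal_policy` as a hypothetical stand-in for a pre-trained RL policy, and a generic integral sliding mode term with an adaptive gain. The names `disturbance`, `gamma`, `phi`, and `k_hat` are all illustrative.

```python
import math

def nominal_policy(x):
    # Hypothetical stand-in for a pre-trained RL policy (a fixed feedback law).
    return -2.0 * x

def disturbance(t):
    # Assumed matched disturbance; the controller never reads this function.
    return 1.0 + 1.5 * math.sin(5.0 * t)

def simulate(use_smc, steps=5000, dt=0.001, gamma=200.0, phi=0.01):
    """Euler simulation; returns the peak |x| over the last 1000 steps."""
    x = 1.0      # true plant state, x_dot = u + d(t)
    z = 1.0      # integral of the nominal dynamics, z_dot = u_rl
    k_hat = 0.0  # adaptive switching gain, starts with no disturbance knowledge
    tail_peak = 0.0
    for i in range(steps):
        u_rl = nominal_policy(x)
        s = x - z  # integral sliding variable: deviation from nominal behavior
        u_smc = 0.0
        if use_smc:
            # Saturated switching term (boundary layer of width phi).
            u_smc = -k_hat * max(-1.0, min(1.0, s / phi))
            # Raise the gain only while s is outside the boundary layer.
            if abs(s) > phi:
                k_hat += dt * gamma * abs(s)
        x += dt * (u_rl + u_smc + disturbance(i * dt))
        z += dt * u_rl
        if i >= steps - 1000:
            tail_peak = max(tail_peak, abs(x))
    return tail_peak

if __name__ == "__main__":
    print("policy only:     ", simulate(use_smc=False))
    print("policy + adaptive SMC:", simulate(use_smc=True))
```

In this sketch the policy alone is pushed away from the origin by the disturbance, while the combined controller keeps the state close to the undisturbed closed-loop behavior once the adaptive gain has grown past the disturbance magnitude. The saturation and the adaptation dead zone are common choices for limiting chattering and gain drift; other variants are possible.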

Information

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press
