
Tightly-coupled visual-inertial odometry with robust feature association in dynamic illumination environments

Published online by Cambridge University Press: 04 June 2025

Jie Zhang
Affiliation:
Institute of Advanced Technology, University of Science and Technology of China, Hefei 230031, China
Cong Zhang
Affiliation:
Department of Automation, University of Science and Technology of China, Hefei 230027, China
Qingchen Liu
Affiliation:
Department of Automation, University of Science and Technology of China, Hefei 230027, China
Qichao Ma
Affiliation:
Department of Automation, University of Science and Technology of China, Hefei 230027, China
Jiahu Qin*
Affiliation:
Department of Automation, University of Science and Technology of China, Hefei 230027, China
*Corresponding author: Jiahu Qin; Email: jhqin@ustc.edu.cn

Abstract

This paper focuses on feature-based visual-inertial odometry (VIO) in dynamic illumination environments. Because dynamic illumination leads to unstable feature association and thereby degrades the performance of most existing feature-based VIO methods, we propose a tightly-coupled VIO algorithm, termed RAFT-VINS, which integrates a Lite-RAFT tracker into a visual-inertial navigation system (VINS). The key module of this odometry algorithm is a lightweight optical flow network designed for accurate feature tracking in real time. It guarantees robust feature association in dynamic illumination environments and thereby ensures the performance of the odometry. In addition, to further improve the accuracy of pose estimation, a moving consistency check strategy is developed in RAFT-VINS to identify and remove outlier feature points. Meanwhile, a tightly-coupled optimization-based framework fuses IMU and visual measurements within a sliding window for efficient and accurate pose estimation. Through comprehensive experiments on public datasets and in real-world scenarios, the proposed RAFT-VINS is validated in its capacity to provide reliable pose estimates in challenging dynamic illumination environments. Our code is open-sourced at https://github.com/USTC-AIS-Lab/RAFT-VINS.
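The abstract only names the moving consistency check; the paper's exact formulation is not reproduced here. As a rough illustration of what such an outlier-rejection step can look like, the following Python sketch flags tracked features whose image motion disagrees with the camera ego-motion between two frames (for example, a relative pose predicted by IMU propagation). The function names, the Sampson-distance criterion, and the pixel threshold are illustrative assumptions, not details taken from RAFT-VINS.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def motion_consistency_mask(pts_prev, pts_curr, R, t, K, thresh_px=1.0):
    """Keep feature tracks consistent with the ego-motion (R, t).

    pts_prev, pts_curr : (N, 2) pixel coordinates of tracked features
    R, t               : relative rotation / translation, frame 1 -> frame 2
    K                  : 3x3 camera intrinsic matrix
    Returns a boolean mask: True = consistent track, False = likely outlier
    (e.g. a point on a moving object or a bad optical-flow match).
    """
    # Essential and fundamental matrices from the predicted relative pose.
    E = skew(t) @ R
    K_inv = np.linalg.inv(K)
    F = K_inv.T @ E @ K_inv

    # Homogeneous pixel coordinates.
    ones = np.ones((pts_prev.shape[0], 1))
    x1 = np.hstack([pts_prev, ones])
    x2 = np.hstack([pts_curr, ones])

    # Sampson distance (first-order approximation of the squared
    # geometric distance to the epipolar constraint, in pixel^2).
    Fx1 = x1 @ F.T          # epipolar lines of x1 in image 2
    Ftx2 = x2 @ F           # epipolar lines of x2 in image 1
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    sampson = num / np.maximum(den, 1e-12)

    return sampson < thresh_px ** 2
```

In a pipeline of this kind, the surviving features would then be the only ones passed on as visual measurements to the sliding-window optimization; the threshold and the source of (R, t) are design choices that would need tuning per system.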

Type
Research Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press


Supplementary material: Zhang et al. supplementary material (File, 7.4 MB)