
REFNet: reparameterized feature enhancement and fusion network for underwater blur target recognition

Published online by Cambridge University Press: 16 May 2025

Benshun Li
Affiliation:
School of Mechanical and Electrical Engineering, Henan University of Technology, Zhengzhou, P.R. China
Lei Cai*
Affiliation:
School of Artificial Intelligence, Henan Institute of Science and Technology, Xinxiang, P.R. China
Corresponding author: Lei Cai; Email: cailei2014@126.com

Abstract

Underwater target detection is hampered by image blurring caused by suspended particles in the water body and light-scattering effects. To tackle this issue, this paper proposes a reparameterized feature enhancement and fusion network (REFNet) for underwater blur object recognition. First, this paper proposes the reparameterized feature enhancement and gathering (REG) module, which is designed to improve the performance of the backbone network. This module integrates the concepts of reparameterization and global response normalization to strengthen the network’s feature extraction capabilities, addressing the challenge that image blurriness poses to feature extraction. Next, this paper proposes the cross-channel information fusion (CIF) module to enhance the neck network. This module combines detailed information from shallow features with semantic information from deeper layers, mitigating the loss of image detail caused by blurring. Additionally, this paper replaces the CIoU loss function with the Shape-IoU loss function to improve target localization accuracy, addressing the difficulty of accurately locating bounding boxes in blurry images. Experimental results indicate that REFNet outperforms state-of-the-art methods, achieving higher mAP scores on the Underwater Robot Professional Competition (URPC) and Detection of Underwater Objects (DUO) datasets. REFNet surpasses YOLOv8 by approximately 1.5% in $mAP_{50:95}$ on the URPC dataset and by about 1.3% on the DUO dataset, without significantly increasing the model’s parameter count or computational load. This approach enhances the precision of target detection in challenging underwater environments.
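The abstract leans on two building blocks that are well documented in the broader literature: structural reparameterization (folding training-time branches into a single inference-time convolution, as in RepVGG-style networks) and global response normalization (GRN, introduced with ConvNeXt V2). As a rough PyTorch sketch of these two generic ideas only, not the authors' REG module itself, the following example folds a BatchNorm layer into its preceding convolution and applies GRN across channels; the names (`GRN`, `fuse_conv_bn`) are illustrative assumptions, and grouped or dilated convolutions are ignored for brevity.

```python
import torch
import torch.nn as nn

class GRN(nn.Module):
    """Global response normalization (ConvNeXt V2 style) for channels-last tensors (N, H, W, C)."""
    def __init__(self, dim: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1, 1, 1, dim))
        self.beta = nn.Parameter(torch.zeros(1, 1, 1, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)   # per-channel spatial L2 energy
        nx = gx / (gx.mean(dim=-1, keepdim=True) + 1e-6)     # divisive normalization across channels
        return self.gamma * (x * nx) + self.beta + x         # learnable scale/shift plus residual

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a trained BatchNorm into the preceding conv: the core step of reparameterization."""
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # per-output-channel rescaling
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding, bias=True)
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias.data if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.data = bn.bias.data + (bias - bn.running_mean) * scale
    return fused
```

In eval mode the fused convolution reproduces the conv-plus-BN output exactly, so multi-branch training blocks can be collapsed into a single convolution at inference with no accuracy cost, which is consistent with the abstract's claim of accuracy gains without a significant increase in parameters or computation.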

Type
Research Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press

