Target detection in sonar images based on variable scale prior frame

doi:10.12305/j.issn.1001-506X.2024.03.01

Abstract

Deep-learning-based target detection in sonar images has become an active research topic in recent years. However, targets in sonar images have a concentrated scale distribution and sonar data are difficult to acquire, so detection performance often falls short of requirements. To address this, a target detection method based on variable-scale prior frames is proposed. First, because the target scale distribution in sonar images has its own particular characteristics, variable-scale prior frames are generated from prior statistics. Second, to mitigate the scarcity of sonar images, data augmentation is used to expand the training set. Finally, model lightweighting is explored: the large-object detection layer is removed, which simplifies the model structure without reducing accuracy. To evaluate the algorithm, comprehensive experiments are conducted on forward-looking sonar images. The mean average precision (mAP)@0.75 and mAP@0.5:0.95 reach 0.585 and 0.559, respectively, which is 5.8 and 3.1 percentage points higher than the original Yolov5 network, while the computational cost falls to 14.9 giga floating-point operations (GFLOPs). The results show that the proposed algorithm achieves higher accuracy with a more lightweight model structure.

Key words: sonar image, target detection, data augmentation, scale clustering, lightweight model
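The abstract says only that the prior frames come from "prior statistics", and the keywords mention scale clustering; the exact procedure is not given here. One common way to derive such frames for YOLO-style detectors is k-means over the labelled box sizes with 1 − IoU as the distance. The sketch below assumes that approach. The names `wh_iou` and `cluster_prior_frames` and the toy input are hypothetical, not taken from the paper.

```python
import random


def wh_iou(box, anchor):
    """IoU of two (width, height) pairs placed at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def cluster_prior_frames(sizes, k, iters=100, seed=0):
    """k-means over (width, height) pairs; highest IoU means nearest centre."""
    rng = random.Random(seed)
    anchors = rng.sample(sizes, k)  # initial centres drawn from the data
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for wh in sizes:
            best = max(range(k), key=lambda j: wh_iou(wh, anchors[j]))
            groups[best].append(wh)
        updated = []
        for j, group in enumerate(groups):
            if group:
                updated.append((sum(w for w, _ in group) / len(group),
                                sum(h for _, h in group) / len(group)))
            else:
                updated.append(anchors[j])  # keep the centre of an empty cluster
        if updated == anchors:  # assignments no longer move the centres
            break
        anchors = updated
    # Ascending area, so the smallest frames can go to the finest feature map.
    return sorted(anchors, key=lambda wh: wh[0] * wh[1])


# Toy input: synthetic (width, height) pairs, NOT data from the paper.
toy_rng = random.Random(1)
toy_sizes = [(toy_rng.uniform(10, 70), toy_rng.uniform(20, 45)) for _ in range(200)]
for w, h in cluster_prior_frames(toy_sizes, k=9):
    print(f"[{w:.0f}, {h:.0f}]")
```

With nine clusters sorted by area, a three-layer head would typically take three frames per layer, smallest first. Whether the authors use this distance, this update rule, or this number of clusters is an assumption here.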

CLC number: TP391.4

Sijia HUANG, Chunfeng SONG, Xuan LI. Target detection in sonar images based on variable scale prior frame[J]. Systems Engineering and Electronics, 2024, 46(3): 771-778.

Figures/Tables (8): Fig. 1, Fig. 2, Fig. 3, Table 1, Fig. 4, Fig. 5, Table 2, Table 3

References (32)

1	SUN Y Y. A review of deep learning and its application in image classification and recognition[J]. Information Technology and Informatization, 2018, (1): 138-140.
2	YU C C, LI H B. A review of face recognition methods based on deep learning[J]. Chinese Journal of Engineering Mathematics, 2021, 38(4): 451-469. doi: 10.3969/j.issn.1005-3085.2021.04.001
3	WANG Y M, LI Y, ZOU H. Masked face recognition system based on attention mechanism[J]. Information, 2023, 14(2): 87. doi: 10.3390/info14020087
4	ZHU Z, HUANG G, DENG J K, et al. WebFace260M: a benchmark for million-scale deep face recognition[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2627-2644. doi: 10.1109/TPAMI.2022.3169734
5	TONG C, HAN Y, FENG W, et al. Research progress in deep learning processing methods for medical images[J]. Beijing Biomedical Engineering, 2021, 40(2): 198-202.
6	JIANG H Y, DIAO Z S, SHI T Y, et al. A review of deep learning-based multiple-lesion recognition from medical images: classification, detection and segmentation[J]. Computers in Biology and Medicine, 2023, 157: 106726. doi: 10.1016/j.compbiomed.2023.106726
7	SMITHSON C J R, EICHBAUM Q G, GAUTHIER I. Object recognition ability predicts category learning with medical images[J]. Cognitive Research: Principles and Implications, 2023, 8(1): 9. doi: 10.1186/s41235-022-00456-9
8	WANG X, ZHANG X C, SU C. Land use classification of remote sensing images based on multi-scale learning and depth convolutional neural network[J]. Journal of Zhejiang University (Science Edition), 2020, 47(6): 715-723.
9	GADIRAJU K K, VATSAVAI R R. Remote sensing based crop type classification via deep transfer learning[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023, 16: 4699-4712. doi: 10.1109/JSTARS.2023.3270141
10	HAO X J, LIU L, YANG R J, et al. A review of data augmentation methods of remote sensing image target recognition[J]. Remote Sensing, 2023, 15(3): 827. doi: 10.3390/rs15030827
11	ZHENG Y L. A side scan sonar image target detection method based on improved YOLOv5 network[J]. Hydrographic Surveying and Charting, 2022, 42(4): 18-21, 26.
12	SHENG Z Q, HUO G Y. Sonar image mine detection based on sample simulation and transfer learning[J]. CAAI Transactions on Intelligent Systems, 2021, 16(2): 385-392.
13	WANG X, GUAN Z Q, WANG J, et al. Color image sonar target detection based on convolutional neural network[J]. Computer Application, 2019, 39(Z1): 187-191.
14	TANG X Y, ZHANG X W, XU X L, et al. Methods for underwater sonar image processing in objection detection[C]//Proc. of the International Conference on Computer Systems, Electronics and Control, 2017: 941-944.
15	JIN L L, LIANG H, YANG C S. Underwater target sonar image recognition method based on convolutional neural network[J]. Journal of Northwestern Polytechnical University, 2021, 39(2): 285-291.
16	SUNG M S, LEE M S, KIM J, et al. Convolutional-neural-network-based underwater object detection using sonar image simulator with randomized degradation[C]//Proc. of the Oceans MTS/IEEE Seattle, 2019.
17	SUNG M S, KIM J, LEE M S, et al. Realistic sonar image simulation using deep learning for underwater object detection[J]. International Journal of Control, Automation and Systems, 2020, 18(3): 523-534.
18	WANG Z, ZHANG S W, HUANG W Z, et al. Sonar image target detection based on adaptive global feature enhancement network[J]. IEEE Sensors Journal, 2021, 22(2): 1509-1530.
19	FAN X N, LU L, SHI P F, et al. A novel sonar target detection and classification algorithm[J]. Multimedia Tools and Applications, 2022, 81: 10091-10106.
20	ZHOU T, SI J K, WANG L Y, et al. Automatic detection of underwater small targets using forward-looking sonar images[J]. IEEE Trans. on Geoscience and Remote Sensing, 2022, 60: 4207912.
21	LI S D, WANG X, ZHANG B Y, et al. Research on side scan sonar image sunken ship detection method based on improved YOLOX[J]. Hydrographic Surveying and Charting, 2022, 42(5): 32-36.
22	GIRSHICK R. Fast R-CNN[C]//Proc. of the IEEE International Conference on Computer Vision, 2015: 1440-1448.
23	REN S, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
24	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
25	REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]//Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 6517-6525.
26	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2023-04-05]. arxiv.org/pdf/1804.02767.pdf.
27	BOCHKOVSKIY A, WANG C Y, LIAO H M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. [2023-04-05]. arxiv.org/abs/2004.10934.
28	XIE K B, YANG J, QIU K. A dataset with multibeam forward-looking sonar for underwater object detection[J]. Scientific Data, 2022, 9(1): 739.
29	ZHANG J B, PAN G F, DING W F. Research on side scan sonar image correction[C]//Proc. of the East West Acoustics Academic Exchange Conference in China, 2015: 44-47.
30	PCL UnderwaterLab. URPC2021_sonar_images_dataset[EB/OL]. [2023-04-13]. https://openi.pcl.ac.cn/OpenOrcinus_orca/URPC2021_sonar_images_dataset.
31	CHEN Z J, CHU J, ZENG L J. Multi category mask wearing detection based on dynamic weighted category balance loss[J]. Journal of Graphics, 2022, 43(4): 590-598.
32	ZHANG H T, TIAN M, SHAO G P, et al. Target detection of forward-looking sonar image based on improved YOLOv5[J]. IEEE Access, 2022, 10: 18023-18034.

Table 1
Feature map size	Prior frame sizes
80×80×256	[12, 25]	[19, 20]	[26, 21]
40×40×512	[20, 30]	[31, 26]	[41, 25]
20×20×1024	[26, 41]	[39, 37]	[69, 33]
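The table above pairs each detection feature map with three prior frames, and the abstract describes removing the large-object detection layer. In Yolov5 the coarsest feature map is the one that handles the largest objects, so a rough sketch of that simplification is to drop the 20×20 entry. The input size of 640 is an assumption (it is not stated in this text), and the helper names are hypothetical.

```python
# Prior frames per detection layer, copied from the table above.
# Keys are the feature map dimensions (two spatial sizes, then channels).
PRIOR_FRAMES = {
    (80, 80, 256): [(12, 25), (19, 20), (26, 21)],
    (40, 40, 512): [(20, 30), (31, 26), (41, 25)],
    (20, 20, 1024): [(26, 41), (39, 37), (69, 33)],
}

# Assumption: a 640-pixel input side, which gives the usual strides 8/16/32.
INPUT_SIZE = 640


def stride(feature_map, input_size=INPUT_SIZE):
    """Downsampling factor between the input image and a feature map."""
    return input_size // feature_map[0]


def drop_large_object_layer(prior_frames):
    """Return the head without its coarsest feature map (the large-object layer)."""
    coarsest = min(prior_frames, key=lambda fm: fm[0])
    return {fm: frames for fm, frames in prior_frames.items() if fm != coarsest}


def areas(prior_frames):
    """Areas of all prior frames, in table order."""
    return [a * b for frames in prior_frames.values() for a, b in frames]


lightweight = drop_large_object_layer(PRIOR_FRAMES)
for fm, frames in lightweight.items():
    print(f"{fm[0]}x{fm[1]} map, stride {stride(fm)}: {frames}")
```

The frames in the table are already in ascending order of area, which `areas(PRIOR_FRAMES)` confirms. Whether the authors keep the six remaining frames unchanged or regenerate frames for the two remaining layers is not stated here, so the dictionary returned above is only one possible reading.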

Table 2
Network model	Simplified network structure	Prior frames set	Data augmentation	mAP@0.5	mAP@0.75	mAP@0.5:0.95	GFLOPs
Yolov5s	-	-	-	0.962	0.527	0.528	16.5
Yolov5s	-	√	-	0.971	0.550	0.542	16.5
Yolov5s	-	√	√	0.970	0.555	0.547	16.5
Yolov5m	-	-	-	0.968	0.559	0.546	49.0
Yolov5l	-	-	-	0.964	0.546	0.542	109.1
Yolov5s	√	-	-	0.968	0.547	0.540	14.9
Yolov5s	√	√	-	0.966	0.579	0.552	14.9
Yolov5s	√	√	√	0.971	0.585	0.559	14.9
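The gains quoted in the abstract can be checked against the first row (unmodified Yolov5s) and the last row (all three changes) of the table above. The short sketch below does that arithmetic; it separates absolute gains in percentage points from relative changes, since the two are easy to confuse. The dictionary and function names are illustrative only.

```python
# Values copied from the first and last rows of the ablation table above.
BASELINE = {"mAP@0.75": 0.527, "mAP@0.5:0.95": 0.528, "GFLOPs": 16.5}
PROPOSED = {"mAP@0.75": 0.585, "mAP@0.5:0.95": 0.559, "GFLOPs": 14.9}


def absolute_gain_points(metric):
    """Difference between proposed and baseline, in percentage points."""
    return round((PROPOSED[metric] - BASELINE[metric]) * 100, 1)


def relative_change_percent(metric):
    """Change relative to the baseline value, in percent."""
    return round((PROPOSED[metric] - BASELINE[metric]) / BASELINE[metric] * 100, 1)


for name in ("mAP@0.75", "mAP@0.5:0.95"):
    print(name, absolute_gain_points(name), "points,",
          relative_change_percent(name), "% relative")
print("GFLOPs", relative_change_percent("GFLOPs"), "% relative")
```

The absolute gains come out at 5.8 and 3.1 points, matching the figures in the abstract, and the drop from 16.5 to 14.9 GFLOPs is a relative reduction of about 9.7%.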

Table 3
Network model	Simplified network structure	Prior frames set	mAP@0.5	mAP@0.75	mAP@0.5:0.95	GFLOPs
Yolov5s	-	-	0.975	0.612	0.564	16.50
Faster R-CNN	-	-	0.535	0.242	0.202	63.29
Ref. [32]	-	-	0.974	0.621	0.569	16.50
Yolov5s	√	-	0.976	0.604	0.561	14.90
Yolov5s	-	√	0.973	0.630	0.569	16.50
Yolov5s	√	√	0.979	0.628	0.568	14.90
