Remote sensing small target detection algorithm based on cross-scale feature fusion

doi:10.12305/j.issn.1001-506X.2025.05.05

Abstract

Small target detection in remote sensing images faces three problems: extracting fine-grained shallow features, representing deep semantics, and extracting multi-scale information. To address them, a cross-scale YOLOv7 (CSYOLOv7) network that combines several techniques is proposed. First, a cross-stage feature extraction module (CFEM) and a receptive field feature enhancement module (RFFEM) are designed. CFEM improves the model's ability to extract fine-grained features and suppresses the loss of features during shallow down-sampling. RFFEM strengthens the network's extraction of deep semantic features and improves the model's ability to capture the context information of targets. Second, a cross-gradient space pyramid pool module (CSPPM) is designed to fuse the multi-scale global and local features of small targets. Finally, the complete intersection over union (CIoU) is replaced with the shape-aware intersection over union (Shape IoU) to improve the accuracy of bounding box localization. Experimental results show that the CSYOLOv7 network achieves detection accuracies of 74% and 89.6% on the DIOR dataset and the NWPU VHR-10 (Northwestern Polytechnical University Very High Resolution-10) dataset, respectively, which improves the detection of small targets in remote sensing images.

Key words: remote sensing image, small target, feature extraction, context information
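
The last step of the abstract replaces CIoU with Shape IoU in the bounding box regression loss. For orientation, the following is a minimal sketch of the baseline CIoU measure only, assuming axis-aligned boxes given as (x1, y1, x2, y2) with positive width and height. The function name `iou_and_ciou` and the `eps` stabilizer are illustrative assumptions, not taken from the paper, and Shape IoU itself is not reproduced here.

```python
import math


def iou_and_ciou(box_a, box_b, eps=1e-9):
    """Return (IoU, CIoU) for two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the authors' code.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Overlap area and union area.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return iou, iou - rho2 / c2 - alpha * v


if __name__ == "__main__":
    print(iou_and_ciou((0, 0, 2, 2), (1, 1, 3, 3)))
```

The regression loss is commonly taken as 1 - CIoU. Because the centre-distance and aspect-ratio penalties do not depend on the scale or shape of the ground-truth box itself, a shape-aware variant such as the one adopted in the paper is a natural replacement for small targets.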

CLC number: TP751.2

Kai SHAO, Haogang LI, Yan LIANG, Jing NING, Wu CHEN. Remote sensing small target detection algorithm based on cross-scale feature fusion[J]. Systems Engineering and Electronics, 2025, 47(5): 1421-1431.

References (39)

1	CHENG Q, LI J, DU J. Ship target detection algorithm in optical remote sensing images based on YOLOv5[J]. Systems Engineering and Electronics, 2023, 45(5): 1270-1276. doi: 10.12305/j.issn.1001-506X.2023.05.02
2	SONG C L, CHAI W Q, ZHANG X S. Road small target detection based on improved YOLO v5 algorithm[J]. Systems Engineering and Electronics, 2024, 46(10): 3271-3278. doi: 10.12305/j.issn.1001-506X.2024.10.04
3	XU J, FU K, SUN X. An invariant generalized Hough transform based method of inshore ships detection[C]//Proc. of the International Symposium on Image and Data Fusion, 2011.
4	TAO C, TAN Y H, CAI H J, et al. Airport detection from large IKONOS images using clustered SIFT keypoints and region information[J]. IEEE Geoscience and Remote Sensing Letters, 2010, 8(1): 128-132.
5	KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386
6	JIAO Y H, XING L. Vehicle target detection research based on enhanced YOLOv8[C]//Proc. of the 4th International Conference on Neural Networks, Information and Communication, 2024: 1427-1432.
7	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 580-587.
8	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.
9	HE K M, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[C]//Proc. of the IEEE International Conference on Computer Vision, 2017: 2961-2969.
10	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
11	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]//Proc. of the 14th European Conference on Computer Vision, 2016: 21-37.
12	LI C Y, CONG R M, GUO C L, et al. A parallel down-up fusion network for salient object detection in optical remote sensing images[J]. Neurocomputing, 2020, 415: 411-420. doi: 10.1016/j.neucom.2020.05.108
13	HAN Z S, FAN X Q, FU Q, et al. Multi-source information fusion target detection from the perspective of drones[J]. Systems Engineering and Electronics, 2025, 47(1): 52-61.
14	ZHAO C A, GUO D D, SHAO C F, et al. SatDetX-YOLO: a more accurate method for vehicle target detection in satellite remote sensing imagery[J]. IEEE Access, 2024, 12: 46024-46041. doi: 10.1109/ACCESS.2024.3382245
15	QU J S, SU C, ZHANG Z W, et al. Dilated convolution and feature fusion SSD network for small object detection in remote sensing images[J]. IEEE Access, 2020, 8: 82832-82843. doi: 10.1109/ACCESS.2020.2991439
16	CHEN H B, JIANG S, HE G H, et al. TEANS: a target enhancement and attenuated nonmaximum suppression object detector for remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2020, 18(4): 632-636.
17	ULTRALYTICS. YOLOv5[EB/OL]. [2024-04-20]. https://github.com/ultralytics/YOLOv5, 2021.
18	WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information[C]//Proc. of the European Conference on Computer Vision, 2024.
19	DONG Z, LIN B J. BMF-CNN: an object detection method based on multi-scale feature fusion in VHR remote sensing images[J]. Remote Sensing Letters, 2020, 11(3): 215-224. doi: 10.1080/2150704X.2019.1706007
20	SHEN L Y, LANG B H, SONG Z X. DS-YOLOv8-based object detection method for remote sensing images[J]. IEEE Access, 2023, 11: 125122-125137. doi: 10.1109/ACCESS.2023.3330844
21	HE M, QIN L, DENG X L, et al. MFI-YOLO: multi-fault insulator detection based on an improved YOLOv8[J]. IEEE Trans. on Power Delivery, 2023, 39(1): 168-179.
22	ZHANG S Z, TUO H Y, HU J, et al. Domain adaptive YOLO for one-stage cross-domain detection[C]//Proc. of the Asian Conference on Machine Learning, 2021: 785-797.
23	SHAO K, WANG M Z, WANG G Y. Multi-scale remote sensing semantic segmentation network based on Transformer[J]. Journal of Intelligent Systems, 2024, 19(4): 920-929.
24	LIANG Y, YI C X, WANG G Y. Building change detection in remote sensing images based on encoding and decoding network UNet3+[J]. Journal of Computer Science, 2023, 46(8): 1720-1733.
25	HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. doi: 10.1109/TPAMI.2015.2389824
26	WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 7464-7475.
27	ZHANG H, ZHANG S J. Shape-IoU: more accurate metric considering bounding box shape and scale[EB/OL]. [2024-04-20]. arXiv preprint arXiv:2312.17663, 2023.
28	LIANG Y, YI C X, WANG G Y, et al. Semantic segmentation of remote sensing images based on multi-scale semantic encoding and decoding network[J]. Journal of Electronics, 2023, 51(11): 3199-3214.
29	ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]//Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 6848-6856.
30	ZHANG X, LIU C, YANG D G, et al. RFAConv: innovating spatial attention and standard convolutional operation[EB/OL]. [2024-04-20]. arXiv preprint arXiv:2304.03198, 2023.
31	WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]//Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020: 390-391.
32	SALMAN H, PARKS C, SWAN M, et al. OrthoNets: orthogonal channel attention networks[C]//Proc. of the IEEE International Conference on Big Data, 2023: 829-837.
33	CHRISTLEIN V, SPRANGER L, SEURET M, et al. Deep generalized max pooling[C]//Proc. of the International Conference on Document Analysis and Recognition, 2019: 1090-1096.
34	CHEN Y M, YUAN X B, WANG J B, et al. YOLO-MS: rethinking multi-scale representation learning for real-time object detection[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2025, 3538473.
35	LI K, WAN G, CHENG G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307. doi: 10.1016/j.isprsjprs.2019.11.023
36	CHENG G, HAN J W. A survey on object detection in optical remote sensing images[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2016, 117: 11-28. doi: 10.1016/j.isprsjprs.2016.03.014
37	GE Z, LIU S T, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2024-04-20]. https://arxiv.org/abs/2107.08430.
38	ZHAO Y A, LV W Y, XU S L, et al. DETRs beat YOLOs on real-time object detection[C]//Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024: 16965-16974.
39	ULTRALYTICS. YOLOv8[EB/OL]. [2024-04-20]. https://github.com/ultralytics/ultralytics, 2023.

Algorithm	C1	C2	C3	C4	C5	C6	C7	C8	C9	C10	C11	C12	C13	C14	C15	C16	C17	C18	C19	C20	mAP/%
Faster-RCNN^[8]	12.3	20.5	15.2	61.6	48.8	70.8	40.2	79.1	24.3	55.4	47.7	42.7	43.1	65.2	53.9	84.4	51.9	48.2	72.7	84.8	51.1
Mask-RCNN^[9]	54.0	53.6	71.9	53.8	72.3	73.0	55.9	81.1	38.7	71.6	62.5	44.2	54.0	58.6	75.8	63.2	80.1	56.5	72.6	81.0	63.5
SSD^[11]	34.0	54.8	82.2	49.7	78.0	64.9	45.4	85.6	24.8	52.2	50.5	53.5	39.5	58.3	63.5	71.9	75.2	47.9	74.9	88.4	59.8
YOLOv5^[17]	54.2	76.5	89.1	73.7	81.6	71.9	51.7	86.9	42.1	62.5	59.6	59.5	49.8	72.9	71.6	78.5	77.7	56.9	76.2	88.1	69.1
YOLOX^[37]	55.8	70.8	86.9	69.1	88.3	70.1	53.7	88.5	41.4	62.5	61.3	59.0	53.4	71.4	72.1	72.0	76.7	56.3	72.6	88.2	68.5
RT-DETR^[38]	54.6	74.8	88.5	74.7	76.7	71.6	52.1	83.9	37.2	61.7	56.8	59.0	54.7	64.4	66.6	72.3	77.0	54.3	72.9	86.1	67.0
YOLOv8^[39]	53.5	77.8	91.0	78.1	82.6	77.3	60.9	88.9	42.3	64.9	59.2	64.3	53.9	72.4	73.1	79.8	78.8	59.0	77.7	88.6	71.2
YOLOv7^[26]	53.8	77.6	90.1	74.6	92.5	70.3	48.9	90.5	41.7	68.4	61.7	61.5	50.2	81.4	76.6	83.7	80.1	57.9	76.6	89.4	71.4
CSYOLOv7	58.0	78.4	90.5	77.2	93.4	78.2	58.8	90.6	44.9	74.4	65.2	63.2	53.1	83.8	77.8	84.4	80.2	59.6	77.7	90.1	74.0

Algorithm	airplane	ship	storage tank	baseball diamond	tennis court	basketball court	ground track field	harbor	bridge	vehicle	GFLOPs	Params (×10⁶)	mAP/%
Faster-RCNN^[8]	92.5	67.2	38.7	97.1	60.5	16.4	92.4	67.9	70.5	33.4	67.5	41.4	63.7
Mask-RCNN^[9]	93.0	75.5	92.9	90.4	90.3	91.2	95.1	75.2	60.5	74.2	114	43.7	83.9
SSD^[11]	94.5	86.8	93.2	97.6	85.6	83.7	91.4	78.7	69.2	90.1	88.8	25.7	87.1
YOLOv5^[17]	99.0	84.4	86.2	98.3	76.2	69.2	90.6	83.3	58.8	86.1	15.9	7.1	83.2
YOLOX^[37]	99.6	79.8	90.6	90.9	74.7	74.2	99.0	69.3	75.7	82.7	26.7	8.9	83.7
RT-DETR^[38]	98.5	86.8	89.3	94.8	86.0	72.8	95.4	80.2	71.4	82.4	58.4	20.1	85.8
YOLOv8^[39]	99.2	87.6	83.2	98.8	76.3	79.6	98.4	89.0	65.2	82.4	28.4	11.2	86.0
YOLOv7^[26]	99.4	86.4	83.4	99.0	77.4	78.2	93.8	90.3	81.3	77.6	13.3	6.1	86.5
CSYOLOv7	99.5	87.8	98.6	98.7	87.6	82.4	96.4	88.1	75.8	80.7	19.1	7.4	89.6
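
The mAP column of the two comparison tables is consistent with a plain average over the per-class columns. A small check in Python, assuming the per-class columns are average precision (AP) values in percent (the tables do not label them explicitly); the helper name `mean_ap` is hypothetical:

```python
# Per-class values copied from the CSYOLOv7 rows of the two comparison tables.
DIOR_AP = [58.0, 78.4, 90.5, 77.2, 93.4, 78.2, 58.8, 90.6, 44.9, 74.4,
           65.2, 63.2, 53.1, 83.8, 77.8, 84.4, 80.2, 59.6, 77.7, 90.1]
NWPU_AP = [99.5, 87.8, 98.6, 98.7, 87.6, 82.4, 96.4, 88.1, 75.8, 80.7]


def mean_ap(per_class_ap):
    """Average the per-class AP values (percent) into a single mAP figure."""
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    print(round(mean_ap(DIOR_AP), 1))  # 74.0, matching the 20-class table
    print(round(mean_ap(NWPU_AP), 1))  # 89.6, matching the 10-class table
```

Both averages reproduce the reported mAP after rounding to one decimal place, which also matches the 74% and 89.6% quoted in the abstract.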

Algorithm	RFFEM	CFEM	CSPPM	Shape IoU	P/%	R/%	Params (×10⁶)	mAP@0.5/%
YOLOv7	-	-	-	-	79.9	67.2	6.1	71.4
Experiment a	√	-	-	-	82.2	67.2	6.2	72.3
Experiment b	-	√	-	-	80.9	68.2	6.2	72.2
Experiment c	-	-	√	-	80.8	67.1	7.2	71.8
Experiment d	√	√	-	-	82.6	68.1	6.2	73.3
Experiment e	√	-	√	-	82.2	67.7	7.3	72.6
Experiment f	√	√	√	-	83.6	67.8	7.4	73.6
Proposed algorithm	√	√	√	√	83.1	68.3	7.4	74.0
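
To read the ablation table as gains over the YOLOv7 baseline, the rows can be differenced directly. The sketch below copies the table values; the dictionary keys and the helper name `delta` are illustrative labels, not names used in the paper.

```python
# name: (P %, R %, Params in millions, mAP@0.5 %), copied from the ablation table.
ABLATION = {
    "YOLOv7": (79.9, 67.2, 6.1, 71.4),
    "a": (82.2, 67.2, 6.2, 72.3),         # RFFEM
    "b": (80.9, 68.2, 6.2, 72.2),         # CFEM
    "c": (80.8, 67.1, 7.2, 71.8),         # CSPPM
    "d": (82.6, 68.1, 6.2, 73.3),         # RFFEM + CFEM
    "e": (82.2, 67.7, 7.3, 72.6),         # RFFEM + CSPPM
    "f": (83.6, 67.8, 7.4, 73.6),         # RFFEM + CFEM + CSPPM
    "proposed": (83.1, 68.3, 7.4, 74.0),  # all modules + Shape IoU
}


def delta(name, base="YOLOv7"):
    """Return (mAP gain in points, extra parameters in millions) of `name` over `base`."""
    _, _, params, m_ap = ABLATION[name]
    _, _, base_params, base_map = ABLATION[base]
    return round(m_ap - base_map, 1), round(params - base_params, 1)


if __name__ == "__main__":
    for name in ABLATION:
        if name != "YOLOv7":
            print(name, delta(name))
```

Under these numbers, the full model gains 2.6 mAP points for 1.3 million extra parameters. CSPPM alone (experiment c) accounts for 1.1 million of them, while Shape IoU (experiment f to the proposed algorithm) adds 0.4 points at no parameter cost.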
