Systems Engineering and Electronics ›› 2023, Vol. 45 ›› Issue (11): 3363-3373. doi: 10.12305/j.issn.1001-506X.2023.11.01

• Electronic Technology •

Image matching algorithm based on a three-branch spatial transformation attention mechanism

Yanyan HUANG1,2, Shaoyan GAI1,2,*, Feipeng DA1,2

  1. School of Automation, Southeast University, Nanjing 210096, China
    2. Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing 210096, China
  • Received: 2022-08-15 Online: 2023-10-25 Published: 2023-10-31
  • Contact: Shaoyan GAI
  • About the authors: Yanyan HUANG (1997—), female, master's student; research interests: deep learning and image processing
    Shaoyan GAI (1979—), male, associate professor, Ph.D.; research interests: computer vision, pattern recognition, and 3D measurement
    Feipeng DA (1968—), male, professor, Ph.D.; research interests: 3D precision measurement, 3D accurate recognition, and 3D optimization control theory and technology
  • Funding: Frontier Leading Technology Basic Research Project of Jiangsu Province (BK20192004C); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)

Abstract:

For image template matching tasks in which the image to be matched has undergone spatial geometric transformations such as rotation, scaling, and translation, existing algorithms are time-consuming and have low accuracy. To address this problem, an image matching algorithm with high accuracy and low computational cost is proposed. Firstly, fast feature detection is performed: feature points are found according to the pixel differences between a center point and its neighboring points. Then, with these feature points taken as centers, image patches of a fixed size are cropped according to the rotation angles computed by the fast feature detection. These patches are fed into a feature descriptor extraction network that incorporates a spatial transformation attention module. Finally, the K-nearest neighbor (KNN) algorithm is used to find the matching features between the descriptors of the two images to be matched. Because the spatial transformation attention module is introduced into the descriptor extraction network, the network focuses on learning spatial information during training, so the proposed algorithm improves the accuracy of image matching tasks with large spatial changes. In terms of matching time, the proposed algorithm is second only to the method that uses the fast feature detection algorithm for both detection and matching. In terms of matching accuracy, the proposed algorithm is far superior to the other algorithms compared in the experiments.
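To make the pipeline concrete, the following is a minimal, illustrative Python sketch of the steps summarized above, not the implementation from the paper: it assumes OpenCV's oriented FAST detector (via ORB_create) for detection and orientation, a 32×32 patch size, a single STN-style branch standing in for the three-branch spatial transformation attention module, a small untrained descriptor CNN, and a 0.8 ratio test for the KNN matching step; the patch size, network widths, descriptor dimension, and threshold are placeholder assumptions.

```python
# Minimal sketch of the matching pipeline summarized in the abstract.
# Assumptions (not from the paper): ORB-based oriented FAST, 32x32 patches,
# a single STN-style attention branch, a toy descriptor CNN, 0.8 ratio test.
import cv2
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

PATCH = 32  # assumed patch size


def oriented_patches(gray, max_kp=500):
    """Detect oriented FAST keypoints and crop rotation-normalized patches."""
    orb = cv2.ORB_create(nfeatures=max_kp)
    kps = orb.detect(gray, None)  # FAST corners + intensity-centroid orientation
    patches, kept = [], []
    h, w = gray.shape
    for kp in kps:
        x, y = kp.pt
        # Rotate the image about the keypoint so the crop is orientation-normalized.
        M = cv2.getRotationMatrix2D((x, y), kp.angle, 1.0)
        rot = cv2.warpAffine(gray, M, (w, h))
        x0, y0 = int(round(x)) - PATCH // 2, int(round(y)) - PATCH // 2
        if 0 <= x0 and 0 <= y0 and x0 + PATCH <= w and y0 + PATCH <= h:
            patches.append(rot[y0:y0 + PATCH, x0:x0 + PATCH])
            kept.append(kp)
    return kept, np.stack(patches).astype(np.float32) / 255.0


class STNAttention(nn.Module):
    """Single spatial-transformer branch standing in for the attention module."""
    def __init__(self, ch):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(ch, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, 6),
        )
        # Start from the identity transform so early training is stable.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)


class Descriptor(nn.Module):
    """Toy descriptor network: STN attention followed by a small CNN head."""
    def __init__(self, dim=128):
        super().__init__()
        self.stn = STNAttention(1)
        self.head = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        return F.normalize(self.head(self.stn(x)), dim=1)


def match_images(img1, img2, model):
    """Full pipeline: detect, describe, then KNN match with a ratio test."""
    kp1, p1 = oriented_patches(img1)
    kp2, p2 = oriented_patches(img2)
    with torch.no_grad():
        d1 = model(torch.from_numpy(p1).unsqueeze(1)).numpy()
        d2 = model(torch.from_numpy(p2).unsqueeze(1)).numpy()
    knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
    good = [p[0] for p in knn if len(p) == 2 and p[0].distance < 0.8 * p[1].distance]
    return good, kp1, kp2
```

With a trained model (Descriptor here is randomly initialized for illustration only), calling match_images(gray1, gray2, Descriptor().eval()) on two grayscale images would return the ratio-test matches together with the keypoints of both images.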

Key words: computer vision, image matching, spatial transformation invariance, convolutional neural network (CNN), attention mechanism

CLC number: