Diverse features orthogonal constraint-based cross-view geo-localization method for UAVs

Systems Engineering and Electronics, 2026, Vol. 48, Issue (6): 2072-2080. doi: 10.12305/j.issn.1001-506X.2026.06.27

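Ruihang LIU¹, Haiying LIU¹,²,*, Yuchen LIU¹, Chen CHEN¹, Tiexiang LI²,³

1. College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
2. Nanjing Center for Applied Mathematics, Nanjing 210018, China
3. School of Mathematics, Southeast University, Nanjing 210096, China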

Received: 2025-03-03  Revised: 2025-04-02  Online: 2026-06-25  Published: 2025-06-10
Corresponding author: Haiying LIU
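About the authors:
Ruihang LIU (born 2001), male, master's student. Research interests: deep learning, visual navigation, remote sensing and geographic information systems.
Yuchen LIU (born 2000), male, master's student. Research interests: navigation, guidance and control.
Chen CHEN (born 2001), male, master's student. Research interests: multi-domain joint operations planning.
Tiexiang LI (born 1979), female, professor, Ph.D. Research interests: large-scale matrix computation.
Funding:
Supported by the National Natural Science Foundation of China (12371377)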

Abstract:

Learning discriminative features with deep networks remains challenging when training data are limited. This paper proposes a Transformer-based adaptive orthogonal constraint neural network. The original embedding space is divided into multiple subspaces to increase the density of the embedding space, which helps the network learn deeper features. Building on the idea of orthogonal constraints, a separation method for overlapping subspaces is proposed to reduce homogeneous information among them. In addition, the weights between subspaces are adjusted dynamically to better optimize samples that are hard to train because of their similarity. Experiments are conducted on the University-1652 dataset, real-world measured data, and simulated data under different weather conditions. Compared with the baseline algorithm, the proposed algorithm improves R@1/AP by 29.58/26.52 percentage points on the localization task and by 21.51/27.98 percentage points on the navigation task. Despite the limited training data, the proposed model learns more robust and discriminative feature representations, which facilitates the recognition of similar classes and helps unmanned aerial vehicles (UAVs) achieve accurate localization and navigation.

Key words: unmanned aerial vehicles (UAVs), cross-view geo-localization, visual Transformer, orthogonal constraints, image retrieval
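
The abstract names three ingredients: overlapping subspaces, an orthogonal constraint between them, and dynamically adjusted subspace weights. This page does not give the corresponding formulas, so the following NumPy sketch only illustrates the general idea under stated assumptions. The function names (`split_overlapping`, `orthogonality_penalty`, `dynamic_weights`), the sliding-window split, the squared-cosine penalty, and the softmax weighting are all hypothetical choices made for illustration, not the authors' formulation.

```python
import numpy as np


def split_overlapping(embedding, num_subspaces, width):
    """Slice a 1-D embedding into equally wide windows that may overlap.

    Assumption: subspaces are contiguous windows of the embedding vector.
    """
    dim = embedding.shape[0]
    if num_subspaces < 2 or width > dim:
        raise ValueError("need at least 2 subspaces and width <= embedding size")
    # Evenly spaced start offsets; windows overlap when width > spacing.
    starts = np.linspace(0, dim - width, num_subspaces).round().astype(int)
    return np.stack([embedding[s:s + width] for s in starts])


def orthogonality_penalty(sub_embeddings, eps=1e-12):
    """Mean squared cosine similarity over all distinct subspace pairs.

    The value is 0 when the sub-embeddings are mutually orthogonal and 1 when
    they all point in the same direction (assumed form of the constraint).
    """
    norms = np.linalg.norm(sub_embeddings, axis=1, keepdims=True)
    unit = sub_embeddings / np.maximum(norms, eps)
    gram = unit @ unit.T
    off_diagonal = gram[~np.eye(gram.shape[0], dtype=bool)]
    return float(np.mean(off_diagonal ** 2))


def dynamic_weights(per_subspace_losses, temperature=1.0):
    """Softmax over per-subspace losses: harder subspaces get larger weights.

    Assumption: "dynamic adjustment" is modelled as a softmax re-weighting.
    """
    scaled = np.asarray(per_subspace_losses, dtype=float) / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()
```

In a training loop, a penalty of this kind would typically be added to the retrieval loss with a scalar coefficient, and the weights would be recomputed at every step from the current per-subspace losses.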

CLC number: TP 391.41

Ruihang LIU, Haiying LIU, Yuchen LIU, Chen CHEN, Tiexiang LI. Diverse features orthogonal constraint-based cross-view geo-localization method for UAVs[J]. Systems Engineering and Electronics, 2026, 48(6): 2072-2080.

Figures and tables (14): Figures 1-9 and Tables 1-5

References (30)

[1] LONG Y, GONG Y P, XIAO Z F, et al. Accurate object localization in remote sensing images based on convolutional neural networks[J]. IEEE Trans. on Geoscience and Remote Sensing, 2017, 55(5): 2486-2498. doi: 10.1109/TGRS.2016.2645610
[2] LING G, DRAGHIC N. Aerial drones for blood delivery[J]. Transfusion, 2019, 59(S2): 1608-1611. doi: 10.1111/trf.15195
[3] KIM D K, WALTER M R. Satellite image-based localization via learned embeddings[C]//Proc. of the IEEE International Conference on Robotics and Automation, 2017: 2073-2080.
[4] SHENG L, SHI M H, QI Y C, et al. Dynamic offense and defense of UAV swarm based on situation evolution game[J]. Systems Engineering and Electronics, 2023, 45(8): 2332-2342.
[5] JI S P, WEI S Q, LU M. Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set[J]. IEEE Trans. on Geoscience and Remote Sensing, 2019, 57(1): 574-586. doi: 10.1109/TGRS.2018.2858817
[6] TIAN X Y, SHAO J, OUYANG D Q, et al. UAV-satellite view synthesis for cross-view geo-localization[J]. IEEE Trans. on Circuits and Systems for Video Technology, 2022, 32(7): 4804-4815. doi: 10.1109/TCSVT.2021.3121987
[7] DAI M, HU J H, ZHUANG J D, et al. A transformer-based feature segmentation and region alignment method for UAV-view geo-localization[J]. IEEE Trans. on Circuits and Systems for Video Technology, 2022, 32(7): 4376-4389. doi: 10.1109/TCSVT.2021.3135013
[8] LIU R K, LU J, GUO H T, et al. A lightweight cross-view image localization method based on multi-scale feature aggregation[J]. Journal of Geo-information Science, 2025, 27(1): 193-206. doi: 10.12082/dqxxkx.2025.240538
[9] SHENG Y N, ZHAO L J, ZHANG Z, et al. Review of cross-view image geolocalization methods[J]. Journal of Image and Graphics, 2024, 29(9): 2716-2736. doi: 10.11834/jig.230585
[10] WORKMAN S, JACOBS N. On the location dependence of convolutional neural network features[C]//Proc. of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015: 70-78.
[11] VO N N, HAYS J. Localizing and orienting street views using overhead imagery[C]//Proc. of the European Conference on Computer Vision, 2016: 494-509.
[12] HU S, FENG M, NGUYEN R M H, et al. CVM-Net: cross-view matching network for image-based ground-to-aerial geo-localization[C]//Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 7258-7267.
[13] ARANDJELOVIC R, GRONAT P, TORII A, et al. NetVLAD: CNN architecture for weakly supervised place recognition[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2018, 40(6): 1437-1451. doi: 10.1109/TPAMI.2017.2711011
[14] GUO X P, MENG L Y, MEI L Y, et al. Multi-focus image fusion with Siamese self-attention network[J]. IET Image Processing, 2020, 14(7): 1339-1346. doi: 10.1049/iet-ipr.2019.0883
[15] GUAN P Y, CAO Z Q, YU J Z, et al. Scene coordinate regression network with global context-guided spatial feature transformation for visual relocalization[J]. IEEE Robotics and Automation Letters, 2021, 6(3): 5737-5744. doi: 10.1109/LRA.2021.3082473
[16] CAI Z W, FAN Q F, FERIS R S, et al. A unified multi-scale deep convolutional neural network for fast object detection[C]//Proc. of the European Conference on Computer Vision, 2016: 354-370.
[17] WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 7794-7803.
[18] YANG H J, LU X F, ZHU Y Y. Cross-view geo-localization with layer-to-layer transformer[C]//Proc. of the Advances in Neural Information Processing Systems, 2021: 29009-29020.
[19] GONG N Q, LI L W, SHA J J, et al. A satellite-drone image cross-view geolocalization method based on multi-scale information and dual-channel attention mechanism[J]. Remote Sensing, 2024, 16(6): 941. doi: 10.3390/rs16060941
[20] LI W, ZOU C, WANG M, et al. DC-former: diverse and compact transformer for person re-identification[C]//Proc. of the AAAI Conference on Artificial Intelligence, 2023: 1415-1423.
[21] CUI T Y, LI J Z, DONG Y H, et al. TAOTF: a two-stage approximately orthogonal training framework in deep neural networks[C]//Proc. of the 26th European Conference on Artificial Intelligence, 2023: 509-516.
[22] SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]//Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 815-823.
[23] SAXENA U. Automold [EB/OL]. [2025-02-08]. https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library.
[24] ZHENG Z D, WEI Y C, YANG Y. University-1652: a multi-view multi-source benchmark for drone-based geo-localization[C]//Proc. of the 28th ACM International Conference on Multimedia, 2020: 1395-1403.
[25] ZHENG Z D, ZHENG L, GARRETT M, et al. Dual-path convolutional image-text embeddings with instance loss[J]. ACM Transactions on Multimedia Computing Communications and Applications, 2020, 16(2): 51.
[26] RADENOVIC F, TOLIAS G, CHUM O. Fine-tuning CNN image retrieval with no human annotation[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2019, 41(7): 1655-1668. doi: 10.1109/TPAMI.2018.2846566
[27] DING L R, ZHOU J, MENG L X, et al. A practical cross-view image matching method between UAV and satellite for UAV-based geo-localization[J]. Remote Sensing, 2021, 13(1): 47.
[28] WANG T Y, ZHENG Z D, YAN C G, et al. Each part matters: local patterns facilitate cross-view geo-localization[J]. IEEE Trans. on Circuits and Systems for Video Technology, 2022, 32(2): 867-879. doi: 10.1109/TCSVT.2021.3061265
[29] ZHUANG J D, CHEN X Y, DAI M, et al. A semantic guidance and transformer-based matching method for UAVs and satellite images for UAV geo-localization[J]. IEEE Access, 2022, 10: 34277-34287. doi: 10.1109/ACCESS.2022.3162693
[30] BUI D V, KUBO M, SATO H. A part-aware attention neural network for cross-view geo-localization between UAV and satellite[J]. Journal of Robotics, Networking and Artificial Life, 2022, 9(3): 275-284.

Method	UAV→Satellite R@1	UAV→Satellite AP	Satellite→UAV R@1	Satellite→UAV AP
University-1652 [24]	58.49	63.13	71.18	58.74
Instance Loss [25]	58.23	62.91	74.47	59.45
Instance + GeM Pooling [26]	65.32	69.61	79.03	65.35
LCM [27]	66.65	70.82	79.89	65.38
LPN [28]	75.93	79.14	86.45	74.79
SGM [29]	82.14	84.72	88.16	81.81
FSRA [7]	84.51	86.71	88.45	83.47
PAAN [30]	84.51	86.78	91.01	82.28
MIFT [19]	87.84	89.62	92.30	87.66
Proposed method	88.07	89.65	92.69	86.72
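
The R@1 and AP columns are retrieval metrics computed per query over a ranked gallery and then averaged across queries. As a rough illustration, the sketch below ranks a gallery by cosine similarity and scores one query. The function names are hypothetical, and the AP shown is the plain non-interpolated definition; the evaluation script used in the paper may apply a different interpolation, so values need not match exactly.

```python
import numpy as np


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted by descending cosine similarity."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    return np.argsort(-(g @ q), kind="stable")


def recall_at_k(ranked_is_match, k=1):
    """1.0 if at least one true match appears in the top-k results, else 0.0."""
    return float(np.any(np.asarray(ranked_is_match, dtype=bool)[:k]))


def average_precision(ranked_is_match):
    """Non-interpolated AP: mean precision at the rank of each true match."""
    hits = np.asarray(ranked_is_match, dtype=bool)
    if not hits.any():
        return 0.0
    ranks = np.flatnonzero(hits) + 1            # 1-based ranks of true matches
    precisions = np.arange(1, len(ranks) + 1) / ranks
    return float(precisions.mean())


# Toy query: true matches sit at ranks 2 and 4 of the ranked gallery.
flags = [False, True, False, True]
print(recall_at_k(flags, k=1), average_precision(flags))
```

In the UAV→Satellite direction each query typically has a single true match, so AP reduces to the reciprocal of that match's rank; in the Satellite→UAV direction a query has many true matches, which is one reason R@1 and AP can move differently between the two directions.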

Algorithm	Satellite→UAV R@1	Satellite→UAV AP	UAV→Satellite R@1	UAV→Satellite AP
Without DFOC	90.27	85.15	85.36	87.25
DFOC	92.69	86.72	88.07	89.65

Algorithm	Satellite→UAV R@1	Satellite→UAV AP	UAV→Satellite R@1	UAV→Satellite AP
Without DW	91.73	84.99	86.27	88.20
DW	92.69	86.72	88.07	89.65

Weight	Satellite→UAV R@1	Satellite→UAV AP	UAV→Satellite R@1	UAV→Satellite AP
0.1	91.04	86.14	85.91	87.73
0.2	91.61	85.88	87.71	89.36
0.3	92.69	86.72	88.07	89.65
0.4	91.45	85.22	86.70	88.58

Task	R@1	AP
Simulated data→Satellite imagery	84.62	87.14
Simulated data→UAV imagery	91.67	89.49
