Scintillation detection scheduling method of netted radar based on deep Q-learning

doi:10.12305/j.issn.1001-506X.2025.05.07

Abstract:

The netted radar scintillation detection scheme can improve the cooperative detection performance and the survival rate of the radars. Selecting which radars to switch on for cooperative detection, while limiting the switch-on exposure time of each individual radar so as to adapt to ever-changing environmental threats, is a problem that urgently needs to be solved. To address it, a netted radar scintillation detection scheduling method that limits the switch-on time of a single radar is proposed, based on the deep Q-learning (DQL) reinforcement learning algorithm. First, a threat degree model of the airborne jammer against the netted radar and a scintillation detection model of the netted radar against the airborne jammer are established. Then, reinforcement learning reward functions based on the threat degree and the netted instantaneous detection probability are proposed. Finally, the proposed DQL algorithm is used to obtain the optimal scintillation switch-on scheduling scheme for the netted radar. Simulation results show that the average benefit rate of the proposed DQL scheduling method is higher than that of the random, artificial bee colony (ABC), and double deep Q network (DDQN) scheduling methods, and that its scheduling response time is relatively short.

Key words: netted radar, scintillation detection, reinforcement learning, deep Q-learning (DQL), double deep Q network (DDQN)
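
The abstract compares DQL with a DDQN baseline but gives no update equations. As a rough illustration only, the sketch below shows the textbook one-step learning targets that distinguish the two algorithms. The function names, the discount factor `gamma=0.9`, and the example Q-values are assumptions made for illustration; they are not taken from the paper, and the paper's switch-on time limit is not modeled here.

```python
def dql_target(reward, q_next_target, gamma=0.9, done=False):
    """Standard deep Q-learning target: r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward
    return reward + gamma * max(q_next_target)


def ddqn_target(reward, q_next_online, q_next_target, gamma=0.9, done=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(q_next_online)), key=q_next_online.__getitem__)
    return reward + gamma * q_next_target[best]


# Hypothetical next-state Q-values for three candidate switch-on actions.
q_online = [0.2, 0.9, 0.4]
q_target = [0.5, 0.3, 0.8]

print(dql_target(1.0, q_target))
print(ddqn_target(1.0, q_online, q_target))
```

With these made-up values the two targets differ because DDQN evaluates the action preferred by the online network rather than taking the maximum of the target network's estimates.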

CLC number:

TN973

Zhikang LIN, Longfei SHI, Jialei LIU, Jiazhi MA. Scintillation detection scheduling method of netted radar based on deep Q-learning[J]. Systems Engineering and Electronics, 2025, 47(5): 1443-1452.



Parameter	Value
Total jamming power/kW	6.8
Gain/dB	30
Polarization mismatch loss	0.5
Jamming frequency band/GHz	2~18
System loss	0.15

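Parameter	Value
Transmit power/kW	1200
Transmit gain/receive gain/dB	30/30
Number of integrated pulses	16
False alarm probability	10^-6
Scan period/s	10
Wavelength/m	0.1
System loss	0.15
Target radar cross section/m²	1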

Method	Average detection rate/%	Average threat degree/%	Average benefit rate/%	Time cost/s
Random scheduling	37.26	30.35	30.03	3
ABC scheduling	56.96	27.86	55.58	13
DDQN scheduling	66.02	34.56	59.06	872
DQL scheduling	72.23	25.56	71.37	12
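
To make the comparison in the table above easier to read, the following sketch computes how far the DQL average benefit rate exceeds each baseline, in percentage points, and the ratio of DDQN to DQL time cost. The numbers are copied from the table; the dictionary layout and the helper name `benefit_gain` are illustrative assumptions.

```python
# Rows copied from the comparison table:
# (average detection rate %, average threat degree %, average benefit rate %, time cost s)
results = {
    "Random": (37.26, 30.35, 30.03, 3),
    "ABC": (56.96, 27.86, 55.58, 13),
    "DDQN": (66.02, 34.56, 59.06, 872),
    "DQL": (72.23, 25.56, 71.37, 12),
}


def benefit_gain(method, baseline, table=results):
    """Benefit-rate gain of `method` over `baseline`, in percentage points."""
    return round(table[method][2] - table[baseline][2], 2)


for name in ("Random", "ABC", "DDQN"):
    print(f"DQL vs {name}: +{benefit_gain('DQL', name):.2f} points")

# Ratio of time cost, DDQN relative to DQL.
time_ratio = results["DDQN"][3] / results["DQL"][3]
print(f"DDQN/DQL time cost ratio: {time_ratio:.1f}")
```

On these figures DQL leads the baselines on average benefit rate, while its time cost is close to that of ABC and higher than that of random scheduling.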
