DQN-based decision-making method of cognitive jamming against multifunctional radar

doi:10.3969/j.issn.1001-506X.2020.04.12

Systems Engineering and Electronics, 2020, Vol. 42, Issue 4: 819-825.


Bokai ZHANG (张柏开)¹,², Weigang ZHU (朱卫纲)¹

1. Department of Electronic and Optical Engineering, Space Engineering University, Beijing 101416, China
2. Department of Graduate Management, Space Engineering University, Beijing 101416, China

Received: 2019-07-10  Online: 2020-03-28  Published: 2020-03-28
About the authors: Bokai ZHANG (b. 1995), male, master's student; main research interests: radar countermeasures and cognitive electronic warfare. E-mail: zbk0626@163.com | Weigang ZHU (b. 1973), female, professor, Ph.D.; main research interests: modern signal processing, space information countermeasures, and cognitive electronic warfare. E-mail: yi_yun_hou@163.com
Supported by: CEMEE State Key Laboratory Project (2018Z0202B)

Abstract:

As the number of tasks that a multifunctional radar (MFR) can perform grows, the efficiency of Q-Learning based cognitive jamming decision-making drops markedly. To address this, a jamming decision-making method against MFR based on the deep Q network (DQN) is proposed. First, the characteristics of MFR signals are analyzed and a jamming library is constructed, which serves as the basis for studying the decision-making method. Second, after a brief explanation of the DQN principle, the jamming decision-making method and its decision process are presented. Finally, the method is evaluated in simulation, and a comparison of the decision-making performance of DQN and Q-Learning verifies the necessity of the proposed method. To improve the real-time performance and accuracy of decision-making, the DQN algorithm is improved, and prior knowledge is then incorporated to raise decision-making efficiency further. The simulation results show that the method can autonomously learn the jamming effects in an actual battlefield reasonably well and complete jamming decisions against an MFR that can perform multiple radar tasks.

Key words: multifunctional radar (MFR), jamming decision-making, deep Q network (DQN), cognitive electronic warfare, prior knowledge
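
The training loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the environment (`step`), the sizes `N_TASKS` and `N_JAM`, the reward values, and all hyperparameters are hypothetical, and a plain Q-table stands in for the neural network so that the example runs on the standard library alone. What it does keep are the two mechanisms DQN adds to Q-Learning: experience replay and a periodically synchronized target estimator.

```python
import random

N_TASKS, N_JAM = 6, 4                  # hypothetical: 6 radar tasks, 4 jamming methods
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.2  # discount, learning rate, exploration rate
BATCH, SYNC_EVERY, BUFFER = 8, 20, 2000


def step(task, jam):
    """Hypothetical radar response: one jamming method per task pushes the
    radar to a lower-threat task; every other method lets it escalate."""
    if jam == task % N_JAM:
        nxt = max(task - 1, 0)
    else:
        nxt = min(task + 1, N_TASKS - 1)
    done = nxt == 0                    # task 0 is the lowest-threat goal state
    return nxt, (10.0 if done else -1.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_JAM for _ in range(N_TASKS)]   # online estimator
    target = [row[:] for row in q]                # frozen copy used for targets
    replay, steps = [], 0
    for _ in range(episodes):
        task = N_TASKS - 1                        # start at the highest-threat task
        for _ in range(50):                       # cap on episode length
            if rng.random() < EPSILON:            # epsilon-greedy action selection
                jam = rng.randrange(N_JAM)
            else:
                jam = max(range(N_JAM), key=lambda a: q[task][a])
            nxt, reward, done = step(task, jam)
            replay.append((task, jam, reward, nxt, done))
            if len(replay) > BUFFER:
                replay.pop(0)
            # Learn from a random minibatch of stored transitions.
            for s, a, r, s2, d in rng.sample(replay, min(BATCH, len(replay))):
                y = r if d else r + GAMMA * max(target[s2])
                q[s][a] += ALPHA * (y - q[s][a])
            steps += 1
            if steps % SYNC_EVERY == 0:           # synchronize the target estimator
                target = [row[:] for row in q]
            task = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best jamming method index for each task under the learned values."""
    return [max(range(N_JAM), key=lambda a: q[t][a]) for t in range(N_TASKS)]
```

Calling `greedy_policy(train())` returns one jamming index per task. In this toy setting the correct answer for every non-goal task is known by construction (`task % N_JAM`), which makes it easy to check whether the learned policy has converged.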

CLC number: TN974

Bokai ZHANG, Weigang ZHU. DQN based decision-making method of cognitive jamming against multifunctional radar[J]. Systems Engineering and Electronics, 2020, 42(4): 819-825.



Table 1 (PRI, CF, PW, MT, Cr and BoxD together form the waveform unit representation)
Function layer	Task layer	Waveform unit	PRI	CF	PW	MT	Cr	BoxD	JM
1	T₁^A	w₁^A				1			1, 2, 3, 4, 5
2	T₂^A	w₂^A				2			6, 7, 8
2	T₂^B	w₂^B				2			3, 9, 10
…	…	…				…			…
10	T₁₀^A	w₁₀^A				5			3, 6, 9
10	T₁₀^B	w₁₀^B				6			3, 6, 10
11	T₁₁^A	w₁₁^A				6			6, 8, 10
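
One way to use the table above in a decision loop is sketched below. It assumes, and this is an interpretation rather than something the page states, that the JM column lists the candidate jamming methods for each task, so that prior knowledge can restrict action selection to those candidates. The names `JAMMING_LIBRARY` and `select_jamming` are hypothetical, and only the rows visible in the table are included.

```python
import random

# Candidate jamming methods (JM indices) per radar task, copied from the
# visible rows of the table; the elided rows are deliberately left out.
JAMMING_LIBRARY = {
    "T1^A": [1, 2, 3, 4, 5],
    "T2^A": [6, 7, 8],
    "T2^B": [3, 9, 10],
    "T10^A": [3, 6, 9],
    "T10^B": [3, 6, 10],
    "T11^A": [6, 8, 10],
}


def select_jamming(task, q_values, epsilon, rng):
    """Epsilon-greedy choice restricted to the task's candidate methods.

    q_values maps a JM index to its current value estimate (assumed to be
    supplied by whatever learner is in use); missing entries count as 0.
    """
    candidates = JAMMING_LIBRARY[task]
    if rng.random() < epsilon:
        return rng.choice(candidates)          # explore within the candidates
    return max(candidates, key=lambda jm: q_values.get(jm, 0.0))
```

For example, `select_jamming("T2^A", {6: 0.2, 7: 0.9, 8: 0.1}, 0.0, random.Random(0))` picks method 7. Masking the action set this way shrinks the search space, which is one plausible reading of how prior knowledge could speed up the decision.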

Table 2 (columns 1 to 12 are JM indices)
Task layer	1	2	3	4	5	6	7	8	9	10	11	12
T₁^A		T₂^B		T₂^A		T₁^A				T₁^A
T₂^A	T₁^A				T₂^A	T₃^A		T₃^B
T₂^B	T₁^A				T₂^B		T₃^C		T₃^A
T₃^A~T₉^C	…	…	…	…	…	…	…	…	…	…	…	…
T₁₀^A		T₁₁^A			T₉^B			T₁₀^A
T₁₀^B				T₉^C		T₁₀^B			T₉^B			T₁₁^A
T₁₁^A
