Strategy reconstruction for resilience against intelligence jamming based on implicit opponent modeling

doi:10.12305/j.issn.1001-506X.2025.04.32

Abstract

As artificial intelligence advances, intelligent jamming poses a serious threat to wireless signal transmission, and traditional anti-jamming algorithms cope with it poorly. To address this problem, a reinforcement-learning-based method with implicit opponent modeling is proposed: the jamming agent's strategy is implicitly encoded in the neural network input, and the network's decision selects the communication frequency. Because intelligent jamming strategies are non-stationary, the reward trend is monitored to detect whether the jamming strategy has switched. A strategy reconstruction technique is then applied: multi-scale windows locate the point at which stored experience became invalid, the invalid experience is discarded, and the learning rate is reset to speed up network convergence. Experimental results show that, in the relatively converged phase, the transmission success rate of the proposed method is more than 25% higher than that of the deep Q-network (DQN) anti-jamming method.

Keywords: anti-jamming, intelligent jamming, deep reinforcement learning, trend detection, implicit opponent modeling
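The strategy reconstruction step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the least-squares slope test, the window lengths, the slope threshold, the median across scales, and all function names (`trend_slope`, `strategy_switched`, `failure_start`, `reconstruct`) are hypothetical choices made here for illustration. Only the overall flow (monitor the reward trend, locate the failure start point with multi-scale windows, discard invalid experience, reset the learning rate) comes from the abstract.

```python
from statistics import mean, median


def trend_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def strategy_switched(rewards, window=20, slope_threshold=-0.02):
    """Flag a suspected jammer strategy switch when recent rewards trend down.

    The window length and threshold are assumed values, not from the paper.
    """
    if len(rewards) < window:
        return False
    return trend_slope(rewards[-window:]) < slope_threshold


def failure_start(rewards, scales=(5, 10, 20)):
    """Estimate the index at which stored experience stopped being valid.

    For each window length w, pick the split point t that maximises
    mean(rewards[t-w:t]) - mean(rewards[t:t+w]), then take the median
    of the per-scale estimates (an assumed way to combine the scales).
    """
    estimates = []
    for w in scales:
        if len(rewards) < 2 * w:
            continue  # not enough history for this scale
        drops = {
            t: mean(rewards[t - w:t]) - mean(rewards[t:t + w])
            for t in range(w, len(rewards) - w + 1)
        }
        estimates.append(max(drops, key=drops.get))
    return int(median(estimates)) if estimates else 0


def reconstruct(replay, rewards, lr_max=6e-3):
    """Discard experience older than the failure point and reset the learning rate."""
    start = failure_start(rewards)
    return list(replay)[start:], rewards[start:], lr_max
```

For a reward history of forty successes followed by ten failures, `strategy_switched` fires on the downward trend, and `reconstruct` keeps only the ten most recent experiences while returning the maximum learning rate for the next training phase.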

CLC number:

TN795

Peng MA, Rui JIANG, Bin WANG, Mengfei XU, Changbo HOU. Strategy reconstruction for resilience against intelligence jamming based on implicit opponent modeling[J]. Systems Engineering and Electronics, 2025, 47(4): 1355-1363.

Figures and tables (10): Figures 1-9 and Table 1


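Parameter	Value
Experience samples per batch	64
Maximum learning rate	6e-3
Minimum learning rate	2e-3
Discount factor γ	0.965
Exploration factor ε	0.07
Target network update interval (rounds)	50
Fully connected layer 1 neurons	32
Fully connected layer 2 neurons	32
Fully connected layer 3 neurons	10
Fully connected layer 4 neurons	1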
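
As a rough illustration of how the hyperparameters in the table above might be wired into a DQN-style agent, the sketch below collects them in a config object and shows three places they are typically used. The values come from the table; everything else is an assumption made for illustration: the linear decay between the maximum and minimum learning rate, the `decay_steps` parameter, the reading of the target-network setting as a sync interval, the standard one-step Q-learning target, and all names (`AgentConfig`, `learning_rate`, `select_channel`, `td_target`, `should_sync_target`).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    batch_size: int = 64               # experience samples per batch
    lr_max: float = 6e-3               # maximum learning rate
    lr_min: float = 2e-3               # minimum learning rate
    gamma: float = 0.965               # discount factor
    epsilon: float = 0.07              # exploration factor
    target_update_interval: int = 50   # rounds between target network syncs
    layer_sizes: tuple = (32, 32, 10, 1)  # neurons in fully connected layers 1-4


def learning_rate(cfg, step, decay_steps=1000):
    """Linear decay from lr_max to lr_min over decay_steps (assumed schedule)."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return cfg.lr_max - frac * (cfg.lr_max - cfg.lr_min)


def select_channel(cfg, q_values, rng=random):
    """Epsilon-greedy choice over per-channel Q-value estimates."""
    if rng.random() < cfg.epsilon:
        return rng.randrange(len(q_values))  # explore: random channel
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


def td_target(cfg, reward, next_q_values):
    """Standard one-step Q-learning target from target-network estimates."""
    return reward + cfg.gamma * max(next_q_values)


def should_sync_target(cfg, step):
    """True on every target_update_interval-th step after step 0."""
    return step > 0 and step % cfg.target_update_interval == 0
```

Resetting the learning rate after a detected strategy switch would, under this assumed schedule, amount to restarting `step` at 0 so that `learning_rate` returns `lr_max` again.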
