Systems Engineering and Electronics, 2023, Vol. 45, Issue (9): 2755-2760. doi: 10.12305/j.issn.1001-506X.2023.09.13

• Sensors and Signal Processing •

Intelligent radar jamming decision-making method based on POMDP model

Luwei FENG, Songtao LIU, Huazhi XU

  1. Department of Information System, Dalian Naval Academy, Dalian, Liaoning 116018, China
  • Received: 2022-01-04 Online: 2023-08-30 Published: 2023-09-05
  • Contact: Songtao LIU
  • About the authors: Luwei FENG (born 1998), male, master's student; main research interests: electronic countermeasure technology and its applications
    Songtao LIU (born 1978), male, associate professor, postdoctoral researcher; main research interests: electronic countermeasures, image processing, and optoelectronic engineering
    Huazhi XU (born 1988), male, master's student; main research interests: electronic countermeasure technology and its applications
  • Supported by:
    China Postdoctoral Science Foundation (2015M572694); China Postdoctoral Science Foundation (2016T90979)

Abstract:

To improve the efficiency and accuracy of jamming a non-cooperative intelligent radar whose working modes are unknown in a complex electromagnetic environment, a jamming decision-making method based on the partially observable Markov decision process (POMDP) is proposed. First, a POMDP model of the intelligent radar countermeasure system is constructed according to the working characteristics of intelligent radar; a nonparametric, sample-based belief distribution represents the agent's knowledge of the environment, and a Bayesian filter updates that belief. Then, with information entropy as the evaluation criterion, the jammer repeatedly selects and tries the jamming type with the largest information entropy. Finally, simulation experiments compare the jamming decision-making performance of the proposed method with that of the traditional Q-learning method and the empirical decision-making method to verify its superiority. The results show that the proposed method dynamically selects the optimal jamming type as the state of the unknown radar changes, and reaches a jamming decision against the intelligent radar faster.

Key words: intelligent radar, reinforcement learning, partially observable Markov decision process (POMDP) model, Bayesian filtering
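
The belief update and entropy criterion summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the observation table `OBS_MODEL`, the three radar modes, the two jamming types, and all function names are hypothetical. It also rests on two assumptions that the abstract does not state: that the radar mode stays fixed during an update, and that the entropy being maximized is that of the predicted observation under the current belief.

```python
import math
import random
from collections import Counter

# Hypothetical toy model (not from the paper): 3 radar modes, 2 jamming types,
# 2 observations (0 = radar unaffected, 1 = radar degraded).
# OBS_MODEL[a][s][o] = P(observation o | radar mode s, jamming type a)
OBS_MODEL = [
    [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]],  # type 0: same response in every mode
    [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]],  # type 1: response depends on the mode
]


def bayes_update(particles, action, obs, rng):
    """One Bayesian filter step on a sample-based belief: weight, then resample.

    Assumes the radar mode does not change during the step (no prediction step).
    """
    weights = [OBS_MODEL[action][s][obs] for s in particles]
    return rng.choices(particles, weights=weights, k=len(particles))


def predictive_entropy(particles, action):
    """Shannon entropy (in nats) of the predicted observation under the belief."""
    n_obs = len(OBS_MODEL[action][0])
    probs = [
        sum(OBS_MODEL[action][s][o] for s in particles) / len(particles)
        for o in range(n_obs)
    ]
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_action(particles):
    """Pick the jamming type whose predicted observation has the largest entropy."""
    return max(range(len(OBS_MODEL)), key=lambda a: predictive_entropy(particles, a))


def belief_histogram(particles, n_states=3):
    """Fraction of samples in each radar mode."""
    counts = Counter(particles)
    return [counts[s] / len(particles) for s in range(n_states)]


if __name__ == "__main__":
    rng = random.Random(0)
    belief = [0, 1, 2] * 100          # uniform prior: 100 samples per mode
    true_mode = 2                     # hidden from the jammer
    for _ in range(10):
        a = select_action(belief)
        o = rng.choices([0, 1], weights=OBS_MODEL[a][true_mode])[0]
        belief = bayes_update(belief, a, o, rng)
    print(belief_histogram(belief))
```

In this toy setting the entropy criterion prefers jamming type 1, whose outcome differs across radar modes, so each observation shifts the sample-based belief toward the modes consistent with it. Predictive entropy is only a proxy for informativeness: an action whose outcome is random regardless of the radar mode would also score high, so a practical criterion would have to account for that.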

CLC number: