Systems Engineering and Electronics ›› 2022, Vol. 44 ›› Issue (8): 2488-2497. doi: 10.12305/j.issn.1001-506X.2022.08.13

• Sensors and Signal Processing •

Research on decision-making modeling of cognitive jamming for multi-functional radar based on Markov

Bakun ZHU (朱霸坤)1,2,*, Weigang ZHU (朱卫纲)2, Wei LI (李伟)3, Ying YANG (杨莹)3, Tianhao GAO (高天昊)3

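    1. Department of Electronic and Optical Engineering, Space Engineering University, Beijing 101416, China
    2. State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, Luoyang, Henan 471032, China
    3. Graduate School, Space Engineering University, Beijing 101416, China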
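  • Received: 2021-06-01 Online: 2022-08-01 Published: 2022-08-24
  • Corresponding author: Bakun ZHU
  • About the authors: Bakun ZHU (b. 1997), male, master's student; research interests: cognitive electronic warfare and radar countermeasures | Weigang ZHU (b. 1973), female, professor, Ph.D.; research interests: modern signal processing, space information countermeasures, and cognitive electronic warfare | Wei LI (b. 1994), male, master's student; research interest: radar emitter identification | Ying YANG (b. 1997), female, master's student; research interest: radar signal processing | Tianhao GAO (b. 1997), male, master's student; research interest: radar emitter identification
  • Supported by:
    CEMEE State Key Laboratory Project (CEMEE2020Z0203B)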

Research on decision-making modeling of cognitive jamming for multi-functional radar based on Markov

Bakun ZHU1,2,*, Weigang ZHU2, Wei LI3, Ying YANG3, Tianhao GAO3   

    1. Department of Electronic and Optical Engineering, Space Engineering University, Beijing 101416, China
    2. State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, Luoyang 471032, China
    3. Graduate School, Space Engineering University, Beijing 101416, China
  • Received: 2021-06-01 Online: 2022-08-01 Published: 2022-08-24
  • Contact: Bakun ZHU

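Abstract:

Multi-functional radar is indispensable equipment on the modern electromagnetic battlefield, and jamming it has long been a difficult problem. Building on a study of multi-functional radar signal characteristics and the radar countermeasure process, this paper proposes a method for the joint representation of the radar state, models the jamming decision problem for multi-functional radar as a Markov decision process with rewards, designs a cognitive jamming decision system, and solves for the optimal jamming strategy under this model with a cognitive jamming decision algorithm based on Q-Learning. Simulation experiments show that the Q-Learning-based cognitive jamming decision algorithm can learn the optimal jamming strategy without prior experience, which gives it a "cognitive" character, and that it also adapts well to unstable environments, which effectively supports the jamming decision model proposed in this paper.

Key words: radar countermeasures, Markov decision process, radar state, reinforcement learning, Q-Learning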

Abstract:

Multi-functional radar is indispensable equipment on the modern electromagnetic battlefield, and jamming it has long been a difficult problem. Based on a study of the signal characteristics of multi-functional radar and of the radar countermeasure process, this paper proposes a joint representation of the radar state and models the jamming decision problem for multi-functional radar as a Markov decision process with rewards. A cognitive jamming decision system is designed, and the optimal jamming strategy under this model is solved by a cognitive jamming decision algorithm based on Q-learning. Simulation experiments show that the Q-learning-based algorithm can learn the optimal jamming strategy in the absence of prior experience, which gives it a "cognitive" character, and that it adapts well to unstable environments, which effectively supports the proposed jamming decision model.

Key words: radar countermeasures, Markov decision process, radar state, reinforcement learning, Q-learning
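
The abstract describes casting the jamming decision as a Markov decision process with rewards and solving it with Q-learning. As a rough illustration of that idea only, the following is a minimal tabular Q-learning sketch on a toy environment. The radar states, jamming types, transition rule, reward values, and hyperparameters are all hypothetical assumptions made for this example; they do not reproduce the paper's joint state representation, its decision system, or its simulation settings.

```python
import random

# Hypothetical radar states, ordered from highest to lowest threat.
# These names are illustrative assumptions, not the paper's state set.
STATES = ["guidance", "tracking", "acquisition", "search"]
ACTIONS = ["noise", "deception", "combined"]  # hypothetical jamming types
GOAL = 3  # index of the lowest-threat state; reaching it ends an episode

# Assumed toy dynamics: in each state exactly one jamming type is effective.
EFFECTIVE = {0: 2, 1: 0, 2: 1}


def step(state, action):
    """Return (next_state, reward, done) for the toy environment."""
    if action == EFFECTIVE[state]:
        next_state = state + 1          # radar drops to a lower-threat state
    else:
        next_state = max(state - 1, 0)  # radar holds or escalates
    done = next_state == GOAL
    reward = 10.0 if done else -1.0     # assumed reward values
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in STATES]
    for _ in range(episodes):
        state = 0  # every episode starts in the highest-threat state
        for _ in range(50):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action
            # unless the episode has terminated.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states.
policy = [max(range(len(ACTIONS)), key=lambda a: q_table[s][a]) for s in range(GOAL)]
for s, a in enumerate(policy):
    print(f"{STATES[s]:>11} -> {ACTIONS[a]}")
```

The agent is never told which jamming type is effective in which state; it recovers that mapping from rewards alone, which is the sense in which such an algorithm learns without prior experience. A study of adaptability to an unstable environment would additionally change `EFFECTIVE` during training, which this sketch does not do.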

CLC number: