Journal of Systems Engineering and Electronics, 2011, Vol. 33, Issue (5): 1063-. doi: 10.3969/j.issn.1001-506X.2011.05.21

• Systems Engineering •

Stealthy engagement strategy for aircraft based on the MDP framework

徐安,于雷,寇英信,徐保伟,李战武   

  1. Engineering College, Air Force Engineering University, Xi'an 710038, Shaanxi, China
  • Online: 2011-05-25 Published: 2010-01-03

Stealthy engagement maneuvering strategy for air combat based on MDP

XU An, YU Lei, KOU Ying-xin, XU Bao-wei, LI Zhan-wu   

  1. Engineering College, Air Force Engineering University, Xi’an 710038, China
  • Online: 2011-05-25 Published: 2010-01-03

Abstract:

The stealthy engagement decision problem for air-combat aircraft is studied based on approximate dynamic programming (ADP). Based on the tactical employment principles of combat aircraft, the advantage region and the exposure region in the stealthy engagement process are proposed; a reinforcement learning formulation of the stealthy engagement strategy based on the Markov decision process (MDP) is constructed; the discontinuous immediate reward function is modified by a situation scoring function, and the ADP-based strategy learning and policy extraction methods are given. Simulations are conducted against the adversary's different maneuvering counter-strategies, both with and without the support of an external information source. The results show that applying the ADP method to learning the stealthy engagement strategy is feasible, and fairly effective engagement strategies can be obtained under different situations.

Abstract:

The stealthy engagement maneuvering strategy for air combat is studied based on approximate dynamic programming (ADP). The advantage region and the exposure region are defined from the tactical employment principles of combat aircraft, the stealthy engagement decision framework based on the Markov decision process (MDP) is established, and an ADP-based value iteration method is proposed. The discontinuous immediate reward function is modified by a situation scoring function, and the strategy learning and policy extraction methods are explained. Finally, the resulting policies are validated against the adversary's different maneuvering counter-strategies, both with and without access to an external information source. The simulation results show that the application of ADP to stealthy engagement in air combat is feasible and that the policies extracted by this method are effective under different initial situations.
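To make the learning-and-extraction pipeline concrete, below is a minimal sketch, assuming a toy planar relative-geometry MDP: fitted (approximate) value iteration with a linear value-function approximation, a discontinuous reward over assumed advantage and exposure regions that is modified by a continuous situation score, and greedy one-step policy extraction. The grid, regions, feature map, and numerical constants are illustrative assumptions, not the paper's model.

# Minimal sketch (not the authors' implementation): fitted value iteration
# for an MDP whose discontinuous immediate reward is modified by a
# continuous situation scoring function, followed by greedy policy
# extraction. State space, regions, and constants are illustrative.
import numpy as np

GAMMA = 0.95                                   # discount factor
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # simple planar maneuvers
GRID = 15                                      # toy relative-geometry grid half-width

def in_advantage(s):
    # Assumed "advantage region": close behind the target.
    return 0 <= s[0] <= 2 and abs(s[1]) <= 1

def in_exposure(s):
    # Assumed "exposure region": inside the opponent's detection cone.
    return s[0] < 0 and abs(s[1]) <= 3

def situation_score(s):
    # Continuous situation score that fills the gaps between the
    # discontinuous region rewards (illustrative shaping term).
    return -0.05 * (abs(s[0]) + abs(s[1]))

def reward(s):
    # Discontinuous base reward modified by the situation score.
    if in_advantage(s):
        return 1.0
    if in_exposure(s):
        return -1.0
    return situation_score(s)

def step(s, a):
    # Deterministic toy transition on the bounded grid.
    return (int(np.clip(s[0] + a[0], -GRID, GRID)),
            int(np.clip(s[1] + a[1], -GRID, GRID)))

def features(s):
    # Linear value-function approximation (the "approximate" part of ADP).
    return np.array([1.0, s[0], s[1], abs(s[0]), abs(s[1])])

def fit_value(theta, samples):
    # One sweep of fitted value iteration: regress Bellman backups.
    X, y = [], []
    for s in samples:
        backup = max(reward(step(s, a)) + GAMMA * features(step(s, a)) @ theta
                     for a in ACTIONS)
        X.append(features(s))
        y.append(backup)
    return np.linalg.lstsq(np.array(X), np.array(y), rcond=None)[0]

def extract_policy(theta, s):
    # Greedy one-step lookahead against the learned value function.
    return max(ACTIONS,
               key=lambda a: reward(step(s, a)) + GAMMA * features(step(s, a)) @ theta)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = [tuple(rng.integers(-GRID, GRID + 1, size=2)) for _ in range(500)]
    theta = np.zeros(5)
    for _ in range(50):                        # approximate value iteration sweeps
        theta = fit_value(theta, samples)
    print("suggested maneuver from (8, -4):", extract_policy(theta, (8, -4)))

In this sketch the situation score supplies gradient information between the sparse region rewards, which is the role the abstract assigns to the situation scoring function; the dynamics, feature map, and sweep count are placeholders standing in for the paper's aircraft model and learning schedule.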