Systems Engineering and Electronics (系统工程与电子技术)

• Systems Engineering

ABMS中基于Q学习算法的空战目标分配方法

谢俊洁1, 罗鹏程1, 穆富岭2, 王骏1, 丁帅1   

  1. 国防科学技术大学信息系统与管理学院, 湖南 长沙 410073;
    2. 复杂航空系统仿真实验室, 北京 100076
  • 出版日期:2017-02-25 发布日期:2010-01-03

Air combat target assignment in ABMS based on Q-learning algorithm

XIE Junjie1, LUO Pengcheng1, MU Fuling2, WANG Jun1, DING Shuai1   

  1. The Institute of Information System and Management, National University of Defense Technology,
     Changsha 410073, China;
  2. Complex Aviation System Simulation Laboratory, Beijing 100076, China
  • Online:2017-02-25 Published:2010-01-03

摘要:

Q学习算法由于不需要先验知识即可学习,对于求解复杂的优化决策问题具有广泛的应用前景。本文针对当前空战目标分配算法的优缺点,提出了ABMS(agent-based modeling and simulation)中基于Q学习算法的空战目标分配方法。首先介绍了空战Agent建模;然后给出了Q学习算法应用于空战目标分配的方法流程,并严格定义了“状态-动作”对的选择规则;最后通过仿真实验证明了该方法的合理性和有效性。本文方法避免了对先验知识的依赖,并且脱离了局部最优陷阱。

Abstract:

The Q-learning algorithm can learn without prior knowledge, which gives it broad application prospects for solving complex optimal decision problems. After analyzing the strengths and weaknesses of current air combat target assignment algorithms, this paper proposes a target assignment method based on Q-learning within agent-based modeling and simulation (ABMS). First, the modeling of air combat agents is introduced in terms of their attributes, structure, and action rules. Then, the procedure for applying the Q-learning algorithm to air combat target assignment is presented, and the selection rules for state-action pairs are rigorously defined. Finally, simulation experiments show that the method is reasonable and effective. The method avoids dependence on prior knowledge and escapes local optima.
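
The abstract does not state the update rule, reward, or state-action selection rule used in the paper, so the following is only a minimal sketch of generic tabular Q-learning with epsilon-greedy selection on a toy assignment problem. The kill-probability matrix `P_KILL`, the state encoding (targets chosen so far), the reward (marginal gain in expected kills), and all hyperparameters are illustrative assumptions, not the authors' model.

```python
import random
from collections import defaultdict

# Hypothetical data: P_KILL[i][j] = probability that fighter i destroys target j.
P_KILL = [
    [0.9, 0.4, 0.3],
    [0.8, 0.7, 0.2],
    [0.3, 0.5, 0.6],
]
N_FIGHTERS, N_TARGETS = len(P_KILL), len(P_KILL[0])


def expected_kills(assignment):
    """Expected number of targets destroyed by a (possibly partial) assignment."""
    survive = [1.0] * N_TARGETS
    for fighter, target in enumerate(assignment):
        survive[target] *= 1.0 - P_KILL[fighter][target]
    return sum(1.0 - s for s in survive)


def train(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning; fighters are assigned one at a time within an episode."""
    rng = random.Random(seed)
    q = defaultdict(float)  # q[(state, action)]; state = tuple of targets chosen so far
    for _ in range(episodes):
        state = ()
        while len(state) < N_FIGHTERS:
            # Epsilon-greedy selection of a target for the next fighter.
            if rng.random() < epsilon:
                action = rng.randrange(N_TARGETS)
            else:
                action = max(range(N_TARGETS), key=lambda a: q[(state, a)])
            next_state = state + (action,)
            # Assumed reward: marginal gain in expected kills from this choice.
            reward = expected_kills(next_state) - expected_kills(state)
            done = len(next_state) == N_FIGHTERS
            best_next = 0.0 if done else max(
                q[(next_state, a)] for a in range(N_TARGETS)
            )
            # Standard Q-learning update.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = next_state
    return q


def greedy_assignment(q):
    """Read out the assignment by following the highest Q-value at each step."""
    state = ()
    while len(state) < N_FIGHTERS:
        state += (max(range(N_TARGETS), key=lambda a: q[(state, a)]),)
    return state


if __name__ == "__main__":
    q_table = train()
    plan = greedy_assignment(q_table)
    print(plan, expected_kills(plan))
```

With these assumed numbers, assigning fighter i to target i maximizes expected kills, so the learned greedy policy can be checked against that known optimum. No prior knowledge of good assignments is supplied to the learner; the random exploration step is what lets it move away from an initially preferred but suboptimal choice.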