Systems Engineering and Electronics ›› 2025, Vol. 47 ›› Issue (7): 2205-2215. doi: 10.12305/j.issn.1001-506X.2025.07.14

• Systems Engineering

Attack-defense confrontation strategy of multi-UAV based on APIQ algorithm

Xiaowei FU, Xinyi WANG, Zhe QIAO   

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, China
  • Received: 2024-03-05 Online: 2025-07-16 Published: 2025-07-22
  • Contact: Xiaowei FU

Abstract:

Multi-UAV confrontation environments involve large numbers of unmanned aerial vehicles (UAVs), so conventional deep reinforcement learning methods can suffer from dimension explosion of the value function and poor convergence of the policy network. To address these problems, a swarm adversarial algorithm, attention policy interaction Q-learning (APIQ), is proposed based on value decomposition and an attention mechanism. Value decomposition is introduced to alleviate the dimension explosion of the value function, and the weight of each value in the decomposition is assigned by the attention mechanism, which promotes convergence of the policy network. To verify the feasibility of the APIQ algorithm for multi-UAV confrontation, a realistic environment model is established and the algorithm is evaluated by simulation. Comparison with other algorithms shows that UAVs controlled by the APIQ algorithm achieve a higher victory rate in the confrontation.

Key words: multi-unmanned aerial vehicle (UAV), reinforcement learning, value-decomposition network (VDN), attention mechanism, maneuver decision-making
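
The abstract describes combining per-UAV values through a value decomposition whose weights come from an attention mechanism. The sketch below illustrates that general idea only; it is not the APIQ architecture, whose details the abstract does not give. The scaled dot-product form of the attention, the function names (attention_weights, mix_q_values, vdn_total), and the query/key vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention of one query vector over per-agent keys.

    Assumption: the query stands in for some global-state embedding and each
    key for one agent's embedding; the abstract does not specify either.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(scores)


def mix_q_values(agent_qs, weights):
    """Attention-weighted sum of per-agent Q-values.

    The weights are non-negative, so the joint value is non-decreasing in
    every individual Q-value.
    """
    return sum(w * q for w, q in zip(weights, agent_qs))


def vdn_total(agent_qs):
    """Plain VDN-style decomposition: the unweighted sum of per-agent values."""
    return sum(agent_qs)


if __name__ == "__main__":
    agent_qs = [1.0, 2.0, 0.5]                    # hypothetical per-UAV Q-values
    query = [1.0, 0.0]                            # hypothetical global-state embedding
    keys = [[2.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # hypothetical per-UAV embeddings
    w = attention_weights(query, keys)
    print("weights:", w)
    print("attention-mixed Q:", mix_q_values(agent_qs, w))
    print("VDN sum:", vdn_total(agent_qs))
```

With identical keys the attention weights are uniform, and the mixed value is the VDN sum scaled by 1/n. The weighting therefore only changes the result when the agents' embeddings differ.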

