Systems Engineering and Electronics ›› 2025, Vol. 47 ›› Issue (12): 4130-4142. doi: 10.12305/j.issn.1001-506X.2025.12.25

• Guidance, Navigation and Control •

Reinforcement learning based disturbance rejection compensation control method for morphing aircraft

Yang LIU, Fanyi MENG, Gang CHEN

  1. School of Aerospace Engineering, Xi’an Jiaotong University, Xi’an 710049, China
  • Received: 2024-12-09 Revised: 2025-04-06 Online: 2025-05-29 Published: 2025-05-29
  • Contact: Gang CHEN
  • About the authors: Yang LIU (born 2000), male, master's student. Main research interests: intelligent control methods for morphing aircraft; flight control software and hardware development for morphing unmanned aerial vehicles.
    Fanyi MENG (born 2000), male, doctoral student. Main research interests: morphing decision-making and attitude control for intelligent aircraft; variable-configuration attitude control.
  • Supported by:
    National Natural Science Foundation of China (92371201, 52192633); Aeronautical Science Foundation of China (ASFC-20220019070002); Shaanxi Provincial Science Fund for Distinguished Young Scholars (2022JC-03)

Abstract:

To address the attitude control of a morphing aircraft subject to internal model uncertainty and external multi-source disturbances during morphing, a reinforcement learning algorithm is combined with an active disturbance rejection controller. Through training, the agent learns an optimal compensation control strategy that adapts to changes in the aircraft's shape and counteracts disturbances inside and outside the system. The result is a disturbance rejection compensation control strategy based on reinforcement learning (RL-DRCC). RL-DRCC is evaluated in simulation and compared against the controller before compensation, which verifies the superiority of the proposed strategy. Among the controllers compared, the disturbance rejection compensation controller based on the twin delayed deep deterministic policy gradient algorithm performs best: chattering of the control output under disturbance is effectively suppressed, attitude angle tracking accuracy improves overall, and the robustness of attitude control in complex environments is enhanced. Finally, the method is tested under random morphing commands and disturbance conditions. The results show that it handles a variety of morphing commands and complex unknown conditions effectively, with good generalization and adaptive capability.

Key words: deep reinforcement learning, morphing aircraft, multi-source disturbances, compensation control
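The abstract describes a learned compensation acting alongside a baseline disturbance rejection controller. A minimal sketch of one way such a structure could be wired is given below. It is an illustration under stated assumptions, not the authors' implementation: the additive form of the compensation, the PD law standing in for the baseline controller, the limits `comp_limit` and `u_limit`, and the function `placeholder_policy` (standing in for a trained actor network) are all hypothetical.

```python
import numpy as np


def baseline_control(angle_error, rate_error, kp=4.0, kd=1.5):
    """Stand-in baseline controller (a plain PD law, NOT an actual
    active disturbance rejection controller). Gains are assumed values."""
    return kp * angle_error + kd * rate_error


def placeholder_policy(observation):
    """Hypothetical actor. In practice a trained policy network would map
    the observation to a compensation command; this is only a placeholder."""
    return 0.1 * np.tanh(observation[0])


def compensated_control(u_base, u_comp, comp_limit=0.2, u_limit=1.0):
    """Add a bounded learned correction to the baseline command (assumed
    additive structure), then saturate the total command."""
    u_comp = np.clip(u_comp, -comp_limit, comp_limit)
    return float(np.clip(u_base + u_comp, -u_limit, u_limit))


if __name__ == "__main__":
    obs = np.array([0.05, -0.01])  # [angle error, rate error], assumed layout
    u_base = baseline_control(obs[0], obs[1])
    u_total = compensated_control(u_base, placeholder_policy(obs))
    print(u_base, u_total)
```

Bounding the learned term separately from the total command is a design choice made here so that the baseline controller keeps most of the control authority while the policy only supplies a small correction; whether the paper uses such a bound is not stated in the abstract.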

CLC Number: