

Systems Engineering and Electronics, 2025, Vol. 47, Issue (12): 4130-4142. doi: 10.12305/j.issn.1001-506X.2025.12.25
• Guidance, Navigation and Control •
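Yang LIU, Fanyi MENG, Gang CHEN
Received: 2024-12-09
Revised: 2025-04-06
Online: 2025-05-29
Published: 2025-05-29
Corresponding author: Gang CHEN
About the author: Yang LIU (born 2000), male, master's student. His research interests include intelligent control methods for morphing aircraft and the development of flight control software and hardware for morphing UAVs.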
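Abstract:
This paper addresses attitude control of a morphing aircraft that is subject to internal model uncertainty and external multi-source disturbances during morphing. A reinforcement learning algorithm is combined with an active disturbance rejection controller, and training yields an optimal compensation control policy that adapts to changes in the aircraft's shape and counteracts disturbances from inside and outside the system. The result is a disturbance rejection compensation control strategy based on reinforcement learning (RL-DRCC). In simulation, RL-DRCC is compared against the control performance obtained before compensation, which verifies the superiority of the proposed strategy. The compensation controller based on the twin delayed deep deterministic policy gradient (TD3) algorithm performs best: chattering of the control output under disturbance is effectively suppressed, attitude-angle tracking accuracy improves overall, and the robustness of attitude control in complex environments is enhanced. Finally, the method is validated under random morphing commands and disturbance conditions. The results show that it copes effectively with a variety of morphing commands and complex unknown conditions, and that it generalizes and adapts well.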
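The abstract describes a learned policy that compensates an existing disturbance rejection controller rather than replacing it. The sketch below illustrates one way such a structure could be wired together. It assumes the compensation is additive and clipped to a small bound; the abstract does not state either choice, and the class name, the observation layout, and the `comp_limit` value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CompensatedController:
    """Base controller plus a bounded learned correction (illustrative only)."""

    # Assumed interface: (reference, measurement) -> base control command.
    base: Callable[[float, float], float]
    # Assumed interface: observation tuple -> raw compensation from the RL policy.
    policy: Callable[[Tuple[float, float]], float]
    # Hypothetical bound that keeps the learned term from dominating the base command.
    comp_limit: float = 0.2

    def command(self, reference: float, measurement: float) -> float:
        u_base = self.base(reference, measurement)
        # Assumed observation: tracking error and the base command.
        observation = (reference - measurement, u_base)
        u_raw = self.policy(observation)
        # Clip the learned correction to [-comp_limit, +comp_limit].
        u_comp = max(-self.comp_limit, min(self.comp_limit, u_raw))
        # Additive compensation is an assumption, not the paper's stated form.
        return u_base + u_comp


# Stand-ins: a proportional law for the base controller, a constant for the policy.
controller = CompensatedController(
    base=lambda ref, meas: 2.0 * (ref - meas),
    policy=lambda obs: 5.0,
)
u = controller.command(1.0, 0.5)
```

With a bounded correction, a poorly trained policy can shift the command by at most `comp_limit`, so the base controller's behaviour is largely preserved. In practice the `policy` callable would be replaced by a trained TD3 or SAC actor.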
Yang LIU, Fanyi MENG, Gang CHEN. Reinforcement learning based disturbance rejection compensation control method for morphing aircraft[J]. Systems Engineering and Electronics, 2025, 47(12): 4130-4142.
Table 3 Reward function weights
| Weight parameter | Value |
| --- | --- |
| | −1, −2, −3 |
| | −10, −10, 5, 2, 10 |
| | 1.5°, 5°, 0.05°, 0.05°, 0.05° |
| | −0.1, −0.1, −0.2 |
| | 0, −0.5, 0 |
Table 4 Base controller parameters
| Parameter | Value |
| --- | --- |
| | 100, 300, 500 |
| | 100, 300, 500 |
| | 100, 300, 500 |
| | 1, 1, 1 |
| | 0.1, 0.1, 0.1 |
Table 5 Optimized base controller parameters
| Parameter | Value |
| --- | --- |
| | 100, 500, |
| | 100, 800, |
| | 100, |
| | 8, 1, 1 |
| | 0.1, 0.2, 0.3 |
Table 6 Disturbance condition 1
| Parameter | TD3-DRCC | SAC-DRCC | Base disturbance rejection controller |
| --- | --- | --- | --- |
Table 7 Disturbance condition 2
| Parameter | TD3-DRCC | SAC-DRCC | Base disturbance rejection controller |
| --- | --- | --- | --- |