Systems Engineering and Electronics ›› 2024, Vol. 46 ›› Issue (3): 1038-1047. doi: 10.12305/j.issn.1001-506X.2024.03.30

• Guidance, Navigation and Control •

Robust observer-based deep reinforcement learning for attitude stabilization of vertical takeoff and landing vehicle

Yanling LI1, Feizhou LUO2, Zhilei GE1,*   

  1. School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China
  2. China Academy of Launch Vehicle Technology, Beijing 100076, China
  • Received: 2023-02-17 Online: 2024-02-29 Published: 2024-03-08
  • Contact: Zhilei GE

Abstract:

A robust observer-based proximal policy optimization (ROB-PPO) control method is proposed for the attitude stabilization of vertical takeoff and landing vehicles subject to elastic vibration and model uncertainty disturbances. The method combines a robust observer with proximal policy optimization, a deep reinforcement learning algorithm. The robust observer reconstructs the vehicle attitude information corrupted by elastic vibration. The training environment is formed from the robust observer together with the vehicle dynamics model, and the reconstructed attitude produced by the observer serves as the state of the deep reinforcement learning algorithm. The deep reinforcement learning agent interacts continuously with this environment and is thereby trained to stabilize the vehicle attitude. Simulation results show that the ROB-PPO algorithm is more robust and converges faster than the commonly used adaptive fuzzy proportional-integral-derivative (PID) algorithm. Finally, the effectiveness of the proposed algorithm is verified on a self-developed vertical takeoff and landing vehicle.
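The environment structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the vehicle model, the observer design, or the reward, so everything below is an assumption made for illustration: a single-axis double-integrator pitch model, a sinusoidal term standing in for elastic vibration, a plain Luenberger observer substituted for the paper's robust observer, a quadratic reward, and a hand-tuned PD rule standing in for the trained PPO policy. Model uncertainty is omitted. The class and function names (`ObserverAttitudeEnv`, `rollout`, `pd_policy`) are hypothetical.

```python
import math


class ObserverAttitudeEnv:
    """Toy single-axis environment: pitch dynamics plus a Luenberger observer.

    All dynamics, gains, and the reward are illustrative assumptions,
    not the model or observer used in the paper.
    """

    def __init__(self, dt=0.01, l1=8.0, l2=16.0, vib_amp=0.02, vib_freq=30.0):
        self.dt = dt
        self.l1, self.l2 = l1, l2              # observer gains (assumed)
        self.vib_amp, self.vib_freq = vib_amp, vib_freq
        self.reset()

    def reset(self, theta0=0.2):
        self.t = 0.0
        self.theta, self.omega = theta0, 0.0        # true angle and rate
        self.theta_hat, self.omega_hat = 0.0, 0.0   # observer estimate
        return (self.theta_hat, self.omega_hat)

    def step(self, u):
        dt = self.dt
        # Measured angle = true angle + assumed elastic-vibration term.
        y = self.theta + self.vib_amp * math.sin(self.vib_freq * self.t)
        # Luenberger observer update (explicit Euler).
        err = y - self.theta_hat
        theta_hat_dot = self.omega_hat + self.l1 * err
        omega_hat_dot = u + self.l2 * err
        self.theta_hat += dt * theta_hat_dot
        self.omega_hat += dt * omega_hat_dot
        # True double-integrator dynamics (explicit Euler).
        self.theta += dt * self.omega
        self.omega += dt * u
        self.t += dt
        # The agent sees only the reconstructed state, never the raw measurement.
        reward = -(self.theta_hat ** 2 + 0.1 * self.omega_hat ** 2)
        return (self.theta_hat, self.omega_hat), reward


def pd_policy(state, kp=4.0, kd=4.0):
    """Placeholder for a trained PPO policy: PD feedback on the estimate."""
    theta_hat, omega_hat = state
    return -kp * theta_hat - kd * omega_hat


def rollout(env, policy, steps=2000):
    """Run one episode and return the final estimated state and total reward."""
    state = env.reset()
    total = 0.0
    for _ in range(steps):
        state, reward = env.step(policy(state))
        total += reward
    return state, total


if __name__ == "__main__":
    env = ObserverAttitudeEnv()
    final_state, total_reward = rollout(env, pd_policy)
    print("true angle:", env.theta, "estimated angle:", final_state[0])
```

In an actual training setup, `pd_policy` would be replaced by the policy network being optimized, and the `(state, reward)` pairs returned by `step` would feed the PPO update.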

Key words: vertical takeoff and landing vehicle, attitude control, robust observer, deep reinforcement learning
