Systems Engineering and Electronics ›› 2025, Vol. 47 ›› Issue (6): 1917-1929. doi: 10.12305/j.issn.1001-506X.2025.06.20

• Systems Engineering

Reinforcement learning task scheduling algorithm for satellite on-orbit processing

Linzhi MENG1,2,3, Xiaojuan SUN1,2,3,*, Yuxin HU1,2,3, Bin GAO1,2, Guoqing SUN1,2, Wenhao MU1,2   

    1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China
    2. Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Beijing 100190, China
    3. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2024-06-26 Online: 2025-06-25 Published: 2025-07-09
  • Contact: Xiaojuan SUN

Abstract:

As satellite Earth observation enters an era of multiple satellites, high resolution, real-time response, and global coverage, on-orbit data processing has become one of the main ways to improve the timeliness of remote sensing data processing. When satellite resources are limited, data transmission links are constrained, and opportunistic observation tasks arrive unpredictably, real-time scheduling of data processing tasks faces significant challenges. First, an optimization model is constructed with the goal of maximizing the system's average data processing throughput rate. Second, an online task scheduling algorithm that incorporates deep reinforcement learning (DRL) is proposed: the DRL algorithm computes task scheduling strategies in real time, and a Lagrangian dual optimization algorithm accurately computes the optimal resource allocation. Finally, simulation experiments are conducted to evaluate the effectiveness and data processing throughput rate of the proposed algorithm. The results show that the proposed algorithm converges and approaches the optimal solution, improves the data processing throughput rate by approximately 8% compared with existing algorithms, and scales as the satellite data arrival rate and the number of satellite computing nodes increase. The proposed algorithm maximizes the system's average data processing throughput rate while ensuring the stability and convergence of the task queue length and average energy consumption in a highly dynamic environment.
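The abstract does not give the paper's resource allocation model, so the following is only an illustrative sketch of how a Lagrangian dual approach can compute an optimal allocation in general. It assumes a hypothetical concave objective (a weighted log utility over per-task compute shares) and a single shared capacity constraint; the function name `allocate`, the utility form, and all parameters are assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def allocate(weights: List[float], capacity: float, iters: int = 100) -> List[float]:
    """Illustrative dual-based allocation (hypothetical model, not the paper's).

    Solves: maximize sum_i w_i * log(1 + x_i)
            subject to sum_i x_i <= capacity, x_i >= 0,
    assuming every weight is strictly positive.

    Stationarity of the Lagrangian gives x_i = max(w_i / lam - 1, 0), where
    lam is the multiplier of the capacity constraint. The total allocation is
    non-increasing in lam, so lam can be found by bisection.
    """
    if not weights or capacity <= 0:
        return [0.0] * len(weights)

    def primal(lam: float) -> List[float]:
        # Closed-form maximizer of the Lagrangian for a fixed multiplier.
        return [max(w / lam - 1.0, 0.0) for w in weights]

    # At lo the allocation uses at least the full capacity;
    # at hi = max(w) every share is zero, so hi is always feasible.
    lo = sum(weights) / (capacity + len(weights))
    hi = max(weights)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(primal(mid)) > capacity:
            lo = mid  # multiplier too small: allocation exceeds capacity
        else:
            hi = mid  # feasible: try a smaller multiplier
    return primal(hi)  # hi stays feasible throughout the bisection
```

In this sketch, tasks with larger weights receive larger shares, and a task whose weight falls below the final multiplier receives nothing. In an online setting, a routine of this kind would be called once per time slot after the scheduling decision has been made.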

Key words: satellite on-orbit processing, task scheduling, resource allocation, deep reinforcement learning (DRL), Lyapunov optimization

