Systems Engineering and Electronics ›› 2023, Vol. 45 ›› Issue (3): 886-901. doi: 10.12305/j.issn.1001-506X.2023.03.31

• Communications and Networks •

Application of deep reinforcement learning in space information network: status quo and prospects

Siqi TANG1, Zhisong PAN1,*, Guyu HU1, Yang WU2, Yunbo LI1

  1. Command & Control Engineering College, Army Engineering University, Nanjing 210007, China
  2. Beijing Information and Communications Technology Research Center, Beijing 100036, China
  • Received: 2021-09-23 Online: 2023-02-25 Published: 2023-03-09
  • Contact: Zhisong PAN
  • About the authors: Siqi TANG (born 1993), female, Ph.D. candidate; research interests: satellite network resource allocation and intelligent satellite networks
    Zhisong PAN (born 1973), male, professor, Ph.D.; research interests: pattern recognition and artificial intelligence
    Guyu HU (born 1963), male, professor, Ph.D.; research interests: computer networks, communication network management, and network intelligence technologies
    Yang WU (born 1992), male, Ph.D. candidate; research interests: satellite network networking technologies
    Yunbo LI (born 1993), male, lecturer, M.S.; research interests: artificial intelligence
  • Funding:
    National Natural Science Foundation of China (62076251)

Abstract:

Space information networks (SIN) will face challenges arising from increasingly complex structures, dynamic environments, and diverse services. Data-driven deep reinforcement learning (DRL) has been introduced into the SIN field as a promising way to address these challenges. This paper first briefly introduces the basic DRL methods and comprehensively reviews research progress on DRL in the SIN field. Then, taking relay selection in a satellite-terrestrial network as an example, it proposes a mean-field-based DRL algorithm to handle large numbers of nodes, together with a fine-tuning-based model transfer mechanism to address the data discrepancy between the simulation environment and the real environment. Simulation results show that the proposed method is effective in optimizing network performance and that its computational complexity and time efficiency are feasible. On this basis, the limitations and challenges of DRL methods in the SIN field are summarized. Finally, drawing on recent advances in reinforcement learning, future research directions in this field are discussed.

Key words: space information network (SIN), deep reinforcement learning (DRL), relay selection, network performance optimization

CLC number: