Systems Engineering and Electronics, 2025, Vol. 47, Issue (6): 1994-2001. doi: 10.12305/j.issn.1001-506X.2025.06.27

• Guidance, Navigation and Control •

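Intelligent ship dynamic autonomous obstacle avoidance decision based on DQN and rule

Kangjie ZHENG1, Xinyu ZHANG1,*, Weisong WANG1, Zhensheng LIU2

  1. Navigation College, Dalian Maritime University, Dalian 116026, China
  2. School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China
  • Received:2024-06-11 Online:2025-06-25 Published:2025-07-09
  • Contact: Xinyu ZHANG
  • About the authors: Kangjie ZHENG (born 1995), female, PhD candidate; research interests: key technologies for autonomous navigation of intelligent ships, reinforcement learning, artificial intelligence
    Xinyu ZHANG (born 1979), male, professor, PhD; research interests: key technologies for autonomous navigation of intelligent ships, waterway and land transportation and systems engineering, coordinated organization and scheduling of ship traffic, maritime big data
    Weisong WANG (born 1997), male, master's student; research interests: key technologies for autonomous navigation of intelligent ships, situational awareness
    Zhensheng LIU (born 1996), male, master's student; research interests: key technologies for autonomous navigation of intelligent ships, situational awareness
  • Supported by:
    National Natural Science Foundation of China (52371359); Dalian Science and Technology Innovation Fund (2022JJ12GX015)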

Abstract:

Collision avoidance decision-making for intelligent ships currently requires repeated retraining and adapts poorly to diverse encounter scenarios. To address these problems, a dynamic autonomous obstacle avoidance decision-making algorithm for intelligent ships that combines a deep Q-network (DQN) with rules is proposed. A partially observable autonomous obstacle avoidance model that incorporates rule evaluation is designed, and the deep network is improved and trained with deep reinforcement learning. By training with randomly selected start and end points, the algorithm enables an intelligent ship to avoid collisions autonomously in environments that combine dynamic and static scenarios, without repeated training. Simulation experiments verify that the algorithm achieves autonomous collision avoidance decision-making without repeated training, which reduces training cost. The algorithm shows a certain degree of generalization capability and robustness and offers a solution for autonomous collision avoidance of intelligent ships in complex navigation environments.

Key words: dynamic autonomous obstacle avoidance, intelligent ship, without repetitive training, deep reinforcement learning (DRL)
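
As a rough illustration of the random start and end point training idea mentioned in the abstract, the Python sketch below samples a fresh start and goal for each episode and computes the standard one-step DQN target. The grid encoding, the function names `sample_start_goal` and `td_target`, and the discount factor are assumptions made for illustration; they are not taken from the paper, and the paper's rule evaluation and network design are not reproduced here.

```python
import random


def sample_start_goal(grid, rng):
    """Pick two distinct free cells as (start, goal) for one training episode.

    Assumed encoding (hypothetical): grid is a list of rows,
    0 = navigable water, 1 = static obstacle.
    """
    free = [
        (r, c)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell == 0
    ]
    if len(free) < 2:
        raise ValueError("need at least two free cells")
    # sample() draws without replacement, so start and goal always differ.
    start, goal = rng.sample(free, 2)
    return start, goal


def td_target(reward, next_q_values, done, gamma=0.99):
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a').

    At a terminal step (goal reached or collision) only the reward remains.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    rng = random.Random(0)
    grid = [
        [0, 1, 0],
        [0, 0, 1],
    ]
    for episode in range(3):
        start, goal = sample_start_goal(grid, rng)
        print(f"episode {episode}: start={start}, goal={goal}")
```

Resampling the start and goal every episode exposes the agent to many route configurations during a single training run, which is one plausible reading of how the trained policy could be reused in new scenarios without retraining.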

CLC Number: