Path planning for unmanned vehicle reconnaissance based on deep Q-network

doi:10.12305/j.issn.1001-506X.2024.09.19

Abstract

In urban battlefield environments, unmanned reconnaissance vehicles help command centers better understand the situation in target areas, improve decision-making accuracy, and reduce the threats faced in military operations. At present, most unmanned reconnaissance vehicles use Ackermann steering geometry, and the paths planned by traditional algorithms do not conform to the kinematic model of such vehicles. Therefore, a combination of the bicycle motion model and a deep Q-network is proposed to generate the motion trajectory of the unmanned reconnaissance vehicle in an end-to-end manner. To address the slow learning speed and poor generalization of the deep Q-network, a deep Q-network based on experience classification, designed according to the training characteristics of neural networks, and a state space with a degree of generalization ability are proposed. Simulation results show that, compared with traditional path planning algorithms, the path planned by the proposed algorithm better matches the movement trajectory of the unmanned reconnaissance vehicle, and that the proposed algorithm improves the learning efficiency and generalization ability of the unmanned reconnaissance vehicle.

Key words: deep reinforcement learning, unmanned reconnaissance vehicle, path planning, deep Q-network

CLC Number: TP242

Yuqi XIA, Yanyan HUANG, Qia CHEN. Path planning for unmanned vehicle reconnaissance based on deep Q-network[J]. Systems Engineering and Electronics, 2024, 46(9): 3070-3081.

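Table 1
Parameter	Value
Linear velocity of the unmanned reconnaissance vehicle v_robot/(m/s)	5
Distance from the front wheels to the vehicle's center of gravity f_len/m	0.5
Distance from the rear wheels to the vehicle's center of gravity r_len/m	0.5
Radius of the unmanned reconnaissance vehicle d_robot/m	0.6
Maximum front-wheel steering angle ω_max/(°)	±15
Number of laser beams n	15
Maximum laser detection range d_max/m	7.8
Angles between the laser beams and the vehicle heading/(°)	-70, -60, -50, -40, -30, -20, -10, 0, 10, 20, 30, 40, 50, 60, 70
Target circle radius R_aim/m	1
Environment refresh interval/ms	100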
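
The vehicle parameters above are enough to sketch one update of a kinematic bicycle model. The following is a minimal sketch, assuming the standard center-of-gravity form of the model with simple Euler integration; this page does not give the paper's exact equations, so `Pose`, `bicycle_step`, and the integration scheme are illustrative assumptions rather than the authors' implementation.

```python
import math
from dataclasses import dataclass

# Values taken from the parameter table above.
V_ROBOT = 5.0                    # linear velocity, m/s
F_LEN = 0.5                      # front wheels to center of gravity, m
R_LEN = 0.5                      # rear wheels to center of gravity, m
MAX_STEER = math.radians(15.0)   # front-wheel steering limit, rad
DT = 0.1                         # 100 ms refresh interval, s


@dataclass
class Pose:
    """Hypothetical container for the planar vehicle state."""
    x: float        # m
    y: float        # m
    heading: float  # rad


def bicycle_step(pose: Pose, steer: float, v: float = V_ROBOT, dt: float = DT) -> Pose:
    """Advance the pose by one time step (assumed Euler integration)."""
    # Clamp the commanded steering angle to the +/-15 degree limit.
    steer = max(-MAX_STEER, min(MAX_STEER, steer))
    # Slip angle at the center of gravity.
    beta = math.atan(R_LEN / (F_LEN + R_LEN) * math.tan(steer))
    x = pose.x + v * math.cos(pose.heading + beta) * dt
    y = pose.y + v * math.sin(pose.heading + beta) * dt
    heading = pose.heading + v / R_LEN * math.sin(beta) * dt
    return Pose(x, y, heading)
```

With zero steering the vehicle advances 0.5 m per step along its heading; any steering command beyond ±15° is clamped, which bounds the turning rate and therefore the curvature of the generated trajectory.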

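Table 2
Parameter	Value
Direction reward parameter λ₁	0.1
Direction reward parameter λ₂	0.5
Collision reward r_collision	-50
Target reward r_aim	100
Reinforcement learning discount factor γ	0.95
Reinforcement learning greedy factor ε	0.01
Neural network learning rate lr	0.001
Number of initial random steps n_step	5 000
Learning interval n_learn	8
Target network update interval n_weight	10
Experience pool size N	150 000
Number of samples drawn per training step	512
Maximum number of training episodes M	2 500
Maximum number of steps per episode T_max	2 000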
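
To show how these hyperparameters typically fit together in a deep Q-network training loop, here is a minimal sketch using only the Python standard library. It is not the paper's CRMDQN: the experience-classification step is omitted because this page does not describe it. Two readings are assumptions: ε is treated as the probability of a random action, and n_weight is counted in learning updates. All function names are hypothetical.

```python
import random
from collections import deque

# Values taken from the hyperparameter table above.
GAMMA = 0.95           # discount factor
EPSILON = 0.01         # assumed: probability of taking a random action
N_STEP = 5000          # initial random steps before learning starts
N_LEARN = 8            # learn once every N_LEARN environment steps
N_WEIGHT = 10          # assumed: sync target network every N_WEIGHT updates
BUFFER_SIZE = 150_000  # experience pool size
BATCH_SIZE = 512       # samples drawn per training step


def make_buffer(capacity=BUFFER_SIZE):
    """Experience pool that discards the oldest transition when full."""
    return deque(maxlen=capacity)


def sample_batch(buffer, batch_size=BATCH_SIZE):
    """Uniformly sample a training batch (plain DQN, no classification)."""
    return random.sample(buffer, min(batch_size, len(buffer)))


def select_action(q_values, step, rng=random):
    """Random during warm-up, epsilon-greedy afterwards."""
    if step <= N_STEP or rng.random() < EPSILON:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """One-step Q-learning target computed from target-network outputs."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def schedule(step):
    """Return (learn_now, sync_target_now) for a 1-based environment step."""
    learn = step > N_STEP and step % N_LEARN == 0
    sync = learn and (step // N_LEARN) % N_WEIGHT == 0
    return learn, sync
```

For example, reaching the target ends the episode, so `td_target(100, ..., done=True)` is just the target reward of 100, while a non-terminal step adds 0.95 times the best next-state value.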

Table 3
Environment	Path length of each algorithm
	A*	RRT	DQN	CRMDQN
Environment 1	66.53	77.17	69.50	64.20
Environment 2	61.25	76.23	81.50	65.90
Environment 3	84.08	82.37	95.00	74.10
Environment 4	87.34	85.59	78.20	69.50
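
The comparison in this table can be summarized with a few lines of arithmetic. The sketch below copies the table values verbatim and computes the mean path length per algorithm and the algorithm with the shortest path in each environment; the helper names are hypothetical, and the units are whatever the table uses, which this page does not state.

```python
# Path lengths copied from the table above (environments 1 to 4, in order).
PATH_LENGTHS = {
    "A*": [66.53, 61.25, 84.08, 87.34],
    "RRT": [77.17, 76.23, 82.37, 85.59],
    "DQN": [69.50, 81.50, 95.00, 78.20],
    "CRMDQN": [64.20, 65.90, 74.10, 69.50],
}


def mean_lengths(table):
    """Mean path length of each algorithm across all environments."""
    return {name: sum(values) / len(values) for name, values in table.items()}


def shortest_per_environment(table):
    """Name of the algorithm with the shortest path in each environment."""
    n_env = len(next(iter(table.values())))
    return [min(table, key=lambda name: table[name][i]) for i in range(n_env)]


if __name__ == "__main__":
    for name, mean in mean_lengths(PATH_LENGTHS).items():
        print(f"{name}: {mean:.2f}")
    print(shortest_per_environment(PATH_LENGTHS))
```

On these numbers CRMDQN has the lowest mean path length and the shortest path in environments 1, 3, and 4, while A* is shortest in environment 2.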

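Table 4
Environment	Episodes required for the average success rate to reach 80%
	DQN	CRMDQN
Environment 1	240	200
Environment 2	470	334
Environment 3	Not reached	631
Environment 4	1 162	817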
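
As a rough way to read this table, the relative reduction in episodes can be computed per environment. The following sketch copies the table values, represents "Not reached" as `None` (an illustrative choice), and uses a hypothetical helper `relative_reduction`; it is a reading aid, not an analysis from the paper.

```python
# Episodes needed to reach an 80% average success rate: (DQN, CRMDQN).
# None marks the entry reported as "Not reached".
EPISODES_TO_80 = {
    "Environment 1": (240, 200),
    "Environment 2": (470, 334),
    "Environment 3": (None, 631),
    "Environment 4": (1162, 817),
}


def relative_reduction(dqn, crmdqn):
    """Fraction of episodes saved by CRMDQN relative to DQN, if defined."""
    if dqn is None or crmdqn is None:
        return None
    return (dqn - crmdqn) / dqn


if __name__ == "__main__":
    for env, (dqn, crmdqn) in EPISODES_TO_80.items():
        saved = relative_reduction(dqn, crmdqn)
        text = "n/a" if saved is None else f"{saved:.1%}"
        print(f"{env}: {text}")
```

Where both algorithms reach the threshold, CRMDQN needs roughly 17% to 30% fewer episodes; in environment 3 no ratio is defined because DQN did not reach 80%.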

Table 5
Environment	Success rate		Reward
	DQN	CRMDQN	DQN	CRMDQN
Environment 1	0.69	0.77	49.99	63.62
Environment 2	0.81	0.86	74.32	80.91
Environment 3	0.89	0.94	88.15	96.73
Environment 4	0.69	0.77	49.99	63.62
