Research on DDPG-based motion control of two-wheel-legged robot

doi:10.12305/j.issn.1001-506X.2023.04.23

Abstract

Wheel-legged robots combine the mobility and flexibility of wheeled and legged robots and have broad application prospects in a variety of scenarios. Existing motion control methods for two-wheel-legged robots on rough ground depend heavily on accurate dynamic models and lack adaptive solving capability. To address these shortcomings, a control method for the two-wheel-legged robot based on the deep deterministic policy gradient (DDPG) algorithm is proposed. First, the two-wheel-legged robot model and its fuzzy dynamics model are analyzed. Then, the motion control policy of the two-wheel-legged robot on rugged ground is generated using the DDPG algorithm. Finally, to verify the performance of the controller, three groups of motion control comparison experiments are carried out. Simulation experiments show that, without prior knowledge of ground conditions, the two-wheel-legged robot achieves fast and stable movement over rugged ground. The average speed achieved by the motion control strategy generated by the DDPG algorithm is about 29.2% higher than that of the two-wheeled robot, and the peak Euler angle offsets in roll, pitch, and yaw are reduced by about 43.9%, 66%, and 50%, respectively, compared with the bipedal robot.

Key words: motion control, reinforcement learning, wheel-legged robots, deep deterministic policy gradient (DDPG) algorithm

CLC Number: TP242.6

Kaifeng CHEN, Borui TIAN, Heqing LI, Chenyang ZHAO, Zuxing LU, Xinde LI, Yong DENG. Research on DDPG-based motion control of two-wheel-legged robot[J]. Systems Engineering and Electronics, 2023, 45(4): 1144-1151.

Figures/Tables 15: Fig.1 to Fig.10, Table 1 to Table 5

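Parameter	Value
Thigh length l_u/mm	10
Shank length l_l/mm	10
Foot wheel radius r_f/mm	3
Foot wheel thickness h_f/mm	2
Maximum joint torque t_max/(N·m)	6
Maximum height of arc-shaped bump h_RB_max/mm	5
Maximum inclination of arc-shaped bump a_RB_max/rad	0.65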

Parameter	Symbol
Robot position	[x, y, z]
Robot attitude angles	[roll, pitch, yaw]
Robot velocity	[v_x, v_y, v_z]
Wheel-ground contact force	[F_{N_l}, F_{N_r}]
Joint torque output	[t_{hip_l}, t_{hip_r}, t_{knee_l}, t_{knee_r}, t_{tire_l}, t_{tire_r}]
Previous joint torque output	[t′_{hip_l}, t′_{hip_r}, t′_{knee_l}, t′_{knee_r}, t′_{tire_l}, t′_{tire_r}]
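
The quantities in the table above can be packed into a single flat vector, which is a common way to feed a robot state to a learned policy. The sketch below assumes that all six rows together form the policy input; the source table does not state this explicitly. The names `RobotState`, `FIELD_SIZES`, and `to_vector` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Number of entries per row of the table, in the order used for the flat vector.
FIELD_SIZES = {
    "position": 3,       # [x, y, z]
    "attitude": 3,       # [roll, pitch, yaw]
    "velocity": 3,       # [v_x, v_y, v_z]
    "contact_force": 2,  # [F_N_l, F_N_r]
    "torque": 6,         # hip_l, hip_r, knee_l, knee_r, tire_l, tire_r
    "prev_torque": 6,    # same order, previous control step
}
OBS_DIM = sum(FIELD_SIZES.values())


@dataclass
class RobotState:
    """One sample of the tabulated quantities (field names are illustrative)."""
    position: Sequence[float]
    attitude: Sequence[float]
    velocity: Sequence[float]
    contact_force: Sequence[float]
    torque: Sequence[float]
    prev_torque: Sequence[float]


def to_vector(state: RobotState) -> List[float]:
    """Concatenate all fields into one flat list, checking each field's length."""
    vec: List[float] = []
    for name, size in FIELD_SIZES.items():
        values = list(getattr(state, name))
        if len(values) != size:
            raise ValueError(f"{name} needs {size} values, got {len(values)}")
        vec.extend(float(v) for v in values)
    return vec


# Example with placeholder numbers (not data from the paper).
state = RobotState(
    position=[0.0, 0.0, 0.2],
    attitude=[0.0, 0.05, 0.0],
    velocity=[0.5, 0.0, 0.0],
    contact_force=[12.0, 11.5],
    torque=[0.1, 0.1, -0.2, -0.2, 0.3, 0.3],
    prev_torque=[0.0] * 6,
)
obs = to_vector(state)
print(OBS_DIM, len(obs))
```

Under this assumption the vector has 3 + 3 + 3 + 2 + 6 + 6 = 23 entries, and the six joint torques would correspond to a six-dimensional action.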

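Parameter	Value
Learning rate	0.005
Discount factor	0.99
Target update frequency	2
Mini-batch size	64
Experience buffer length	10 000
Score averaging window length	250
Maximum number of steps	3 000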
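
To make the role of these settings concrete, the following minimal sketch stores them in a configuration object and shows where three of them enter a DDPG-style training loop: the experience buffer length, the mini-batch size, and the discount factor in the critic's bootstrapped target. It is not the authors' implementation. The names `DDPGConfig`, `ReplayBuffer`, `td_target`, and `should_update_target` are hypothetical, and reading "target update frequency 2" as "update the target networks every 2 learning steps" is an assumption.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DDPGConfig:
    """Values copied from the table; field names are illustrative."""
    learning_rate: float = 0.005
    gamma: float = 0.99            # discount factor
    target_update_every: int = 2   # assumed to mean "every 2 learning steps"
    batch_size: int = 64
    buffer_size: int = 10_000
    score_window: int = 250
    max_steps: int = 3_000


class ReplayBuffer:
    """Fixed-size FIFO store; the oldest transitions are dropped when full."""

    def __init__(self, capacity: int) -> None:
        self._data = deque(maxlen=capacity)

    def add(self, transition: tuple) -> None:
        self._data.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> list:
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self._data), batch_size)

    def __len__(self) -> int:
        return len(self._data)


def td_target(reward: float, next_q: float, done: bool, gamma: float) -> float:
    """Critic target y = r + gamma * Q'(s', mu'(s')), with no bootstrap at episode end."""
    return reward + (0.0 if done else gamma * next_q)


def should_update_target(learn_step: int, every: int) -> bool:
    """True on the learning steps at which the target networks would be refreshed."""
    return learn_step % every == 0


cfg = DDPGConfig()
buffer = ReplayBuffer(cfg.buffer_size)
for step in range(12_000):  # dummy transitions: (state, reward, next_state, done)
    buffer.add((step, 0.0, step + 1, False))
batch = buffer.sample(cfg.batch_size, random.Random(0))
target = td_target(reward=1.0, next_q=10.0, done=False, gamma=cfg.gamma)
print(len(buffer), len(batch), round(target, 3))
```

With a buffer length of 10 000, only the most recent 10 000 transitions are available for sampling, so the first 2 000 dummy transitions in this example have already been discarded by the time the batch is drawn.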

Robot model	Final reward	Final Q0 value	Training time/h
Two-wheeled	58.30	28.14	30.33
Bipedal	65.22	37.86	20.28
Two-wheel-legged	76.81	129.53	15.72
Average	67.44	65.17	22.11

Robot model	Mean v_x/(m/s)	Peak roll offset/rad	Peak pitch offset/rad	Peak yaw offset/rad
Two-wheeled	0.48	0.02	0.14	0.20
Bipedal	0.51	0.21	0.53	0.28
Two-wheel-legged	0.62	0.12	0.18	0.14
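
The percentage comparisons quoted in the abstract can be recomputed from the rounded entries of the table above. The short sketch below does this arithmetic; the dictionary keys and the helper `percent_change` are names introduced here for illustration only.

```python
# Rounded values copied from the table above.
RESULTS = {
    "two_wheeled":  {"v_x": 0.48, "roll": 0.02, "pitch": 0.14, "yaw": 0.20},
    "bipedal":      {"v_x": 0.51, "roll": 0.21, "pitch": 0.53, "yaw": 0.28},
    "wheel_legged": {"v_x": 0.62, "roll": 0.12, "pitch": 0.18, "yaw": 0.14},
}


def percent_change(new: float, ref: float) -> float:
    """Relative change of `new` with respect to `ref`, in percent."""
    return 100.0 * (new - ref) / ref


# Speed gain of the two-wheel-legged robot over the two-wheeled robot.
speed_gain = percent_change(
    RESULTS["wheel_legged"]["v_x"], RESULTS["two_wheeled"]["v_x"]
)

# Reduction of the peak attitude offsets relative to the bipedal robot.
reductions = {
    axis: -percent_change(RESULTS["wheel_legged"][axis], RESULTS["bipedal"][axis])
    for axis in ("roll", "pitch", "yaw")
}

print(f"speed gain: {speed_gain:.1f}%")
for axis, value in reductions.items():
    print(f"{axis} reduction: {value:.1f}%")
```

From the tabulated values, the speed gain comes to about 29.2% and the pitch and yaw reductions to about 66.0% and 50.0%, in line with the abstract. The roll reduction computed this way is about 42.9% rather than the stated 43.9%, which may reflect rounding of the table entries.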
