Systems Engineering and Electronics ›› 2020, Vol. 42 ›› Issue (2): 414-419. doi: 10.3969/j.issn.1001-506X.2020.02.21
Qinhao ZHANG1, Baiqiang AO1, Qinxue ZHANG2
Received: 2019-07-26
Online: 2020-02-01
Published: 2020-01-23
Qinhao ZHANG, Baiqiang AO, Qinxue ZHANG. Reinforcement learning guidance law of Q-learning[J]. Systems Engineering and Electronics, 2020, 42(2): 414-419.
[1] NIE Y F, ZHOU Q J, ZHANG T. Research status and prospect of guidance law[J]. Flight Dynamics, 2001, 19(3): 7-11. doi: 10.3969/j.issn.1002-0853.2001.03.002
[2] GUO P F. Research on precise terminal guidance law based on fuzzy logic[D]. Xi'an: Northwestern Polytechnical University, 2003.
[3] LI H X. Research on fuzzy guidance law for intercepting large maneuvering targets[D]. Shenyang: Northeastern University, 2013.
[4] WEI H. Research on UAV aerial combat algorithm based on reinforcement learning[D]. Harbin: Harbin Institute of Technology, 2015.
[5] CHITHAPURAM C, CHERUKURI A K, JEPPU Y. Aerial vehicle guidance based on passive machine learning technique[J]. International Journal of Intelligent Computing and Cybernetics, 2016, 9(3): 255-273. doi: 10.1108/IJICC-12-2015-0042
[6] CHITHAPURAM C, JEPPU Y, CHERUKURI A K. Artificial intelligence learning based on proportional navigation guidance[C]//Proc. of the International Conference on Advances in Computing, Communications and Informatics, 2013. doi: 10.1109/ICACCI.2013.6637338
[7] CHEN Z L, XU Y P, GU L B. AGV path planning based on fuzzy Q-learning algorithm[J]. Manufacturing Automation, 2012, 34(11): 4-6, 16. doi: 10.3969/j.issn.1009-0134.2012.6(s).02
[8] GE Y, BU P S, LIU Q. Application of fuzzy reinforcement learning in robot navigation[J]. Information Technology, 2009, 33(10): 127-130. doi: 10.3969/j.issn.1009-2552.2009.10.038
[9] NIE C Y, ZHU M, ZHENG Z W, et al. Airship control based on Q-learning algorithm and neural network[J]. Journal of Beijing University of Aeronautics and Astronautics, 2017, 43(12): 2431-2438.
[10] TAN L, GONG Q H, WANG H X. Pursuit-evasion game algorithm based on deep reinforcement learning[J]. Aerospace Control, 2018, 36(6): 3-8, 19.
[11] PRASHANT B, FARUK K, NAVDEEP S. Reinforcement learning based obstacle avoidance for autonomous underwater vehicle[J]. Journal of Marine Science and Application, 2019, 18(2): 228-238. doi: 10.1007/s11804-019-00089-3
[12] PANOV A I, YAKOVLEV K S, SUVOROV R. Grid path planning with deep reinforcement learning: preliminary results[J]. Procedia Computer Science, 2018, 123(1): 347-353.
[13] YANG J, YOU X H, WU G X, et al. Application of reinforcement learning in UAV cluster task scheduling[J]. Future Generation Computer Systems, 2019, 95(11): 140-148.
[14] ZHANG J J, ZHOU D Y, ZHANG K. A UAV target search algorithm based on reinforcement learning[J]. Application Research of Computers, 2011, 28(10): 3659-3661. doi: 10.3969/j.issn.1001-3695.2011.10.014
[15] LAMPTON A, VALASEK J, KUMAR M. Multiresolution state-space discretization for Q-learning with pseudorandomized discretization[J]. Journal of Control Theory and Applications, 2011, 9(3): 431-439. doi: 10.1007/s11768-011-1012-4
[16] TARN T J. Hybrid MDP based integrated hierarchical Q-learning[J]. Science China Information Sciences, 2011, 54(11): 2279-2294. doi: 10.1007/s11432-011-4332-6
[17] ZHANG W Z, LYU T S. Reactive fuzzy controller design by Q-learning for mobile robot navigation[J]. Journal of Harbin Institute of Technology, 2005, 12(3): 319-324.
[18] YANG B H. A novel experience-based exploration method for Q-learning[C]//Proc. of the 4th International Conference of Pioneering Computer Scientists, Engineers and Educators, 2018: 39.
[19] WANG J W. Kicking motion design of humanoid robots using gradual accumulation learning method based on Q-learning[C]//Proc. of the 28th Chinese Control and Decision Conference, 2016: 328-333.
[20] TANG R K. An error-sensitive Q-learning approach for robot navigation[C]//Proc. of the 34th Chinese Control Conference, 2015: 785-790.
[21] PARK K H, KIM Y J, KIM J H. Modular Q-learning based multi-agent cooperation for robot soccer[J]. Robotics and Autonomous Systems, 2001, 35(2): 109-122. doi: 10.1016/S0921-8890(01)00114-2
[22] BONARINI A, LAZARIC A, MONTRONE F, et al. Reinforcement distribution in fuzzy Q-learning[J]. Fuzzy Sets and Systems, 2008, 160(10): 1420-1443.
[23] LIN L X, XIE H B, ZHANG D B, et al. Supervised neural Q-learning based motion control for bionic underwater robots[C]//Proc. of the 3rd International Conference of Bionic Engineering, 2010: 178.
[24] SHI Z G. The improved Q-learning algorithm based on pheromone mechanism for swarm robot system[C]//Proc. of the 32nd Chinese Control Conference, 2013: 1131-1136.
[25] WANG H. The application of proportional navigation in the process of UAV air combat guidance and optimization of proportional parameter[C]//Proc. of the 33rd Chinese Control Conference, 2014: 1232-1236.
[26] WU S J. Illegal radio station localization with UAV-based Q-learning[J]. China Communications, 2018, 15(12): 122-131.
[27] SHI H B, XU M. An intelligent tracking method for rotor UAVs based on reinforcement learning[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(4): 553-559. doi: 10.3969/j.issn.1001-0548.2019.04.012
[28] ZHANG T Z. Hybrid path planning of a quadrotor UAV based on Q-learning algorithm[C]//Proc. of the 37th Chinese Control Conference, 2018: 301-305.
[29] ZHAO Y J. Q-learning algorithm based UAV path learning and obstacle avoidance approach[C]//Proc. of the 36th Chinese Control Conference, 2017: 95-100.
[30] XU X Y, LI A J, ZHANG C C, et al. Design of variant UAV control system based on Q-learning[J]. Journal of Northwestern Polytechnical University, 2012, 30(3): 340-344. doi: 10.3969/j.issn.1000-2758.2012.03.006
Related articles in Systems Engineering and Electronics:
[1] Mengping ZHOU, Xiuyun MENG, Junhui LIU. Design of optimal sliding mode guidance law for head-on interception of maneuvering targets with large angle of fall[J]. Systems Engineering and Electronics, 2022, 44(9): 2886-2893.
[2] Zilin HOU, Ting CHENG, Han PENG. GMPHD based on measurement conversion sequential filtering for maneuvering target tracking[J]. Systems Engineering and Electronics, 2022, 44(8): 2474-2482.
[3] Bakun ZHU, Weigang ZHU, Wei LI, Ying YANG, Tianhao GAO. Research on decision-making modeling of cognitive jamming for multi-functional radar based on Markov[J]. Systems Engineering and Electronics, 2022, 44(8): 2488-2497.
[4] Guan WANG, Haizhong RU, Dali ZHANG, Guangcheng MA, Hongwei XIA. Design of intelligent control system for flexible hypersonic vehicle[J]. Systems Engineering and Electronics, 2022, 44(7): 2276-2285.
[5] Lingyu MENG, Bingli GUO, Wen YANG, Xinwei ZHANG, Zuoqing ZHAO, Shanguo HUANG. Network routing optimization approach based on deep reinforcement learning[J]. Systems Engineering and Electronics, 2022, 44(7): 2311-2318.
[6] Dongzi GUO, Rong HUANG, Hechuan XU, Liwei SUN, Naigang CUI. Research on deep deterministic policy gradient guidance method for reentry vehicle[J]. Systems Engineering and Electronics, 2022, 44(6): 1942-1949.
[7] Guang ZHAI, Yanxin WANG, Yiyong SUN. Cooperative tracking filtering technology of multi-target based on low orbit satellite constellation[J]. Systems Engineering and Electronics, 2022, 44(6): 1957-1967.
[8] Mingren HAN, Yufeng WANG. Optimization method for orbit transfer of all-electric propulsion satellite based on reinforcement learning[J]. Systems Engineering and Electronics, 2022, 44(5): 1652-1661.
[9] Shihan TAN, Fenglin JIN, Congying DUN. Task assignment strategy for space-air-ground integrated vehicular networks oriented to user demand[J]. Systems Engineering and Electronics, 2022, 44(5): 1717-1727.
[10] Li HE, Liang SHEN, Hui LI, Zhuang WANG, Wenquan TANG. Survey on policy reuse in reinforcement learning[J]. Systems Engineering and Electronics, 2022, 44(3): 884-899.
[11] Jinlin ZHANG, Jiong LI, Humin LEI, Wanli LI, Xiao TANG. Capture region of 3D realistic true proportional navigation with finite overload[J]. Systems Engineering and Electronics, 2022, 44(3): 986-997.
[12] Xiao TANG, Jikun YE, Xu LI. Design of 3D nonlinear prescribed performance guidance law[J]. Systems Engineering and Electronics, 2022, 44(2): 619-627.
[13] Bakun ZHU, Weigang ZHU, Wei LI, Ying YANG, Tianhao GAO. Multi-function radar intelligent jamming decision method based on prior knowledge[J]. Systems Engineering and Electronics, 2022, 44(12): 3685-3695.
[14] Qingqing YANG, Yingying GAO, Yu GUO, Boyuan XIA, Kewei YANG. Target search path planning for naval battle field based on deep reinforcement learning[J]. Systems Engineering and Electronics, 2022, 44(11): 3486-3495.
[15] Bin ZENG, Hongqiang ZHANG, Houpu LI. Research on anti-submarine strategy for unmanned undersea vehicles[J]. Systems Engineering and Electronics, 2022, 44(10): 3174-3181.