Systems Engineering and Electronics ›› 2020, Vol. 42 ›› Issue (2): 414-419. doi: 10.3969/j.issn.1001-506X.2020.02.21
Reinforcement learning guidance law of Q-learning
Qinhao ZHANG1, Baiqiang AO1, Qinxue ZHANG2
Received: 2019-07-26
Online: 2020-02-01
Published: 2020-01-23
Qinhao ZHANG, Baiqiang AO, Qinxue ZHANG. Reinforcement learning guidance law of Q-learning[J]. Systems Engineering and Electronics, 2020, 42(2): 414-419.