| 1 | 戴全辉. 巡航导弹武器系统伪装生存与隐身突防研究[J]. 战术导弹技术, 2020, 4, 41-46. |
																													
|  | DAI Q H. Research on camouflage survival and stealth penetration on cruise missile weapon system[J]. Tactical Missile Technology, 2020, 4, 41-46. |
																													
| 2 | WEI M, CHEN G S, CRUZ J B, et al. Game theoretic strategies for intercepting intelligent cruise missiles[C]//Proc. of the AIAA Guidance, Navigation and Control Conference and Exhibit, 2007: 20-23. |
																													
| 3 | YANUSHEVSKY R. Modern missile guidance[M]. London: Taylor & Francis Inc, 2007. |
																													
| 4 | SUTTON R S, BARTO A G. Reinforcement learning: an introduction[J]. Trends in Cognitive Sciences, 1998, 3(9): 360. |
																													
																							| 5 | KONDA V R, TSITSIKLIS J N. Actor-critic algorithms[C]// Proc. of the Advances in Neural Information Processing Systems, 2000: 1008-1014. | 
																													
| 6 | 梁星星, 冯旸赫, 马扬, 等. 多Agent深度强化学习综述[J]. 自动化学报, 2020, 46(12): 2537-2557. |
																													
|  | LIANG X X, FENG Y H, MA Y, et al. Deep multi-agent reinforcement learning: a survey[J]. Acta Automatica Sinica, 2020, 46(12): 2537-2557. |
																													
| 7 | NA H, LEE J I. Optimal arrangement of missile defense systems considering kill probability[J]. IEEE Trans. on Aerospace and Electronic Systems, 2019, 56(2): 972-983. |
																													
| 8 | HUANG C P, DONG K S, HUANG H Q, et al. Autonomous air combat maneuver decision using Bayesian inference and moving horizon optimization[J]. Journal of Systems Engineering and Electronics, 2018, 29(1): 86-97. doi: 10.21629/JSEE.2018.01.09 |
																													
| 9 | 马子杰, 高杰, 武沛羽, 等. 用于巡航导弹突防航迹规划的改进深度强化学习算法[J]. 电子技术应用, 2021, 47(8): 11-14, 19. |
																													
|  | MA Z J, GAO J, WU P Y, et al. An improved deep reinforcement learning algorithm for cruise missile penetration path planning[J]. Application of Electronic Technique, 2021, 47(8): 11-14, 19. |
																													
																							| 10 | LI B H, MA F, WU Y J. Missile attitude control based on deep reinforcement learning[C]//Proc. of the IEEE 16th International Conference on Control & Automation, 2020: 931-936. | 
																													
| 11 | LEE G T, KIM C O. Autonomous control of combat unmanned aerial vehicles to evade surface-to-air missiles using deep reinforcement learning[J]. IEEE Access, 2020, 8, 226724-226736. doi: 10.1109/ACCESS.2020.3046284 |
																													
| 12 | LIANG C, WANG W H, LIU Z H, et al. Learning to guide: guidance law based on deep meta-learning and model predictive path integral control[J]. IEEE Access, 2019, 7, 47353-47365. doi: 10.1109/ACCESS.2019.2909579 |
																													
| 13 | HU D Y, YANG R N, ZUO J L, et al. Application of deep reinforcement learning in maneuver planning of beyond-visual-range air combat[J]. IEEE Access, 2021, 9, 32282-32297. doi: 10.1109/ACCESS.2021.3060426 |
																													
																							| 14 | DU M J, PENG C, MA J J. Deep reinforcement learning based missile guidance law design for maneuvering target interception[C]// Proc. of the 40th Chinese Control Conference, 2021: 3733-3738. | 
																													
| 15 | KOCH W, MANCUSO R, WEST R, et al. Reinforcement learning for UAV attitude control[J]. ACM Trans. on Cyber-Physical Systems, 2018, 3(2): 1-21. |
																													
| 16 | LI Z, XIA Y, SU C Y, et al. Missile guidance law based on robust model predictive control using neural-network optimization[J]. IEEE Trans. on Neural Networks & Learning Systems, 2017, 26(8): 1803-1809. |
																													
																							| 17 | YANG C J, WU J, LIU G Q, et al. Ballistic missile maneuver penetration based on reinforcement learning[C]//Proc. of the IEEE CSAA Guidance, Navigation and Control Conference, 2018. | 
																													
| 18 | CHITHAPURAM C U, CHERUKURI A K, JEPPU Y V. Aerial vehicle guidance based on passive machine learning technique[J]. International Journal of Intelligent Computing and Cybernetics, 2016, 9(3): 255-273. doi: 10.1108/IJICC-12-2015-0042 |
																													
| 19 | PANOV A I, YAKOVLEV K S, SUVOROV R. Grid path planning with deep reinforcement learning: preliminary results[J]. Procedia Computer Science, 2018, 123, 347-353. doi: 10.1016/j.procs.2018.01.054 |
																													
| 20 | YANG Q M, ZHANG J D, SHI G Q, et al. Maneuver decision of UAV in short-range air combat based on deep reinforcement learning[J]. IEEE Access, 2019, 8, 363-378. |
																													
																							| 21 | YANG Q M, ZHU Y, ZHANG J D, et al. UAV air combat autonomous maneuver decision based on DDPG algorithm[C]//Proc. of the IEEE 15th International Conference on Control and Automation, 2019: 37-42. | 
																													
| 22 | ZHANG H P, HUANG C Q. Maneuver decision-making of deep learning for UCAV thorough azimuth angles[J]. IEEE Access, 2020, 8, 12976-12987. doi: 10.1109/ACCESS.2020.2966237 |
																													
																							| 23 | HOORN V, MARTIJ N. Optimizing air-to-air missile guidance using reinforcement learning[D]. Delft: Delft University of Technology, 2019. | 
																													
| 24 | HONG D, KIM M, PARK S. Study on reinforcement learning-based missile guidance law[J]. Applied Sciences, 2020, 10(18): 6567. doi: 10.3390/app10186567 |
																													
| 25 | SHALUMOV V. Cooperative online guide-launch-guide policy in a target-missile-defender engagement using deep reinforcement learning[J]. Aerospace Science and Technology, 2020, 104, 105996. doi: 10.1016/j.ast.2020.105996 |
																													
| 26 | 邱月, 郑柏通, 蔡超. 多约束复杂环境下UAV航迹规划策略自学习方法[J]. 计算机工程, 2021, 47(5): 44-51. |
																													
|  | QIU Y, ZHENG B T, CAI C. Self-learning method of UAV track planning strategy in complex environment with multiple constraints[J]. Computer Engineering, 2021, 47(5): 44-51. |
																													
| 27 | 高昂, 董志明, 叶红兵, 等. 基于深度强化学习的巡飞弹突防控制决策[J]. 兵工学报, 2021, 42(5): 1101-1110. doi: 10.3969/j.issn.1000-1093.2021.05.023 |
																													
|  | GAO A, DONG Z M, YE H B, et al. Loitering munition penetration control decision based on deep reinforcement learning[J]. Acta Armamentarii, 2021, 42(5): 1101-1110. doi: 10.3969/j.issn.1000-1093.2021.05.023 |
																													
| 28 | 赵超, 文传源. 作战系统综合效能评估方法探索[J]. 电光与控制, 2001, 1, 63-65. |
																													
|  | ZHAO C, WEN C Y. Exploration on comprehensive effectiveness of a campaign system[J]. Electronics Optics and Control, 2001, 1, 63-65. |
																													
																							| 29 | LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[EB/OL]. [2021-09-06]. https://arxiv.org/abs/1509.02971. | 
																													
| 30 | TAN R J, ZHOU J, DU H B, et al. A modeling processing method for video games based on deep reinforcement learning[C]//Proc. of the IEEE 8th Joint International Information Technology and Artificial Intelligence Conference, 2019: 939-942. |
																													
| 31 | SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms[C]//Proc. of the 31st International Conference on Machine Learning, 2014: 387-395. |
																													
| 32 | SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489. |