A Preliminary Exploration of Test Data Analysis Technology for Intelligent Systems Based on Large Models

doi:10.12305/j.issn.1001-506X.2026.05.13

Abstract:

Testing is an important means of verifying and evaluating the combat capabilities of military intelligent systems. This paper analyzes the characteristics of test big data for military intelligent systems and the complexity of system performance evaluation, and identifies the challenges facing traditional test big data analysis methods and the test and evaluation of intelligent systems. It then reviews the current state of research on deep learning and large language models (LLMs) for large-scale, multimodal data analysis, and examines the feasibility and difficulties of applying them to intelligent system test data analysis. On this basis, it proposes an overall architecture and key technologies for an LLM-based test data analysis system for intelligent systems, indicating that LLMs and causal inference techniques are a feasible way to resolve the contradiction between the specificity of military intelligent system test data and the novelty of the required analysis.

Key words: large language model (LLM), causal inference, test data, intelligent system

CLC number: TP 274

Guang JIN, Yang BAO. Preliminary exploration of test data analysis technology for intelligent system based on large model[J]. Systems Engineering and Electronics, 2026, 48(5): 1571-1580.

Figures/Tables (4): Table 1, Figure 1, Figure 2, Figure 3


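Development potential	LLM methods*	Traditional methods
Accuracy	(Currently) poor to fair	Fairly good (requires manual tuning)
Interpretability	Fair	Fairly good to good
Degree of automation	Good	Poor
Data types	All types; must be converted to text or chunks	All types; must be structured manually
Data scale	Medium to large	Small to medium
Sparse/missing data	Handled automatically (currently with the help of tools)	Processing schemes and methods must be designed manually
Unstructured data	Handled automatically	Must be converted to structured data manually
Multimodal integration capability	Feature level, decision level	Task-specific feature level, decision level
Multi-task adaptability	Fairly good (raw data reconstruction, prediction, etc.)	Poor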
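
The data-type row of the comparison notes that LLM methods need every data type converted to text or chunks before analysis. The following is a minimal Python sketch of that conversion for one mixed-type test record, not the authors' implementation. The function name `serialize_record`, the field names, and the chunk size are all hypothetical, chosen only for illustration.

```python
import math


def serialize_record(record, chunk_size=3):
    """Turn one test record (a dict) into a list of short text chunks.

    Missing values (None or NaN) are stated explicitly in the text rather
    than dropped, so a downstream model can see that the field is absent.
    """
    parts = []
    for name, value in record.items():
        is_nan = isinstance(value, float) and math.isnan(value)
        if value is None or is_nan:
            parts.append(f"{name} is missing")
        else:
            parts.append(f"{name} is {value}")
    # Group consecutive fields into chunks of at most chunk_size fields.
    return [
        "; ".join(parts[i:i + chunk_size])
        for i in range(0, len(parts), chunk_size)
    ]


# Hypothetical record mixing numeric, boolean, missing, and free-text fields.
record = {
    "trial_id": 7,
    "speed_mps": 12.5,
    "target_detected": True,
    "wind_mps": float("nan"),
    "operator_note": "brief GPS dropout",
}
chunks = serialize_record(record)
for chunk in chunks:
    print(chunk)
```

Stating missing fields explicitly is one possible design choice. A real pipeline would more likely hand imputation or flagging to dedicated tools, consistent with the "currently with the help of tools" remark in the sparse/missing data row.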
