Systems Engineering and Electronics, 2025, Vol. 47, Issue (3): 1019-1027. doi: 10.12305/j.issn.1001-506X.2025.03.34

• Communications and Networks •

Server KPI anomaly detection based on CVAE-LSTM

Xiarun SHEN1,*, Ruonan LI2, Haotian ZHANG3

  1. Beijing Institute of Aerospace Information, Beijing 100854, China
  2. Patent Examination Cooperation (Beijing) Center of the Patent Office, China National Intellectual Property Administration, Beijing 100070, China
  3. Sino-German College of Applied Sciences at Tongji University, Shanghai 201804, China
  • Received: 2023-05-11 Online: 2025-03-28 Published: 2025-04-18
  • Corresponding author: Xiarun SHEN
  • About the authors: Xiarun SHEN (born 1987), male, senior engineer, master's degree; main research interest: computer science
    Ruonan LI (born 1987), female, assistant researcher, master's degree; main research interest: mechatronic engineering
    Haotian ZHANG (born 2002), male, undergraduate; main research interest: mechatronic engineering

Abstract:

Anomaly detection for key performance indicators (KPIs) is a cornerstone of intelligent operation and maintenance for Internet services, and it is important for fault alerting and for keeping servers secure. Deep generative models largely overcome the weak deep-feature representation of machine learning models, but they still handle the time information in KPI data poorly and struggle to capture long-term information. To address this, a KPI anomaly detection model is proposed that combines a conditional variational autoencoder (CVAE) with a long short-term memory (LSTM) network. The model exploits the strong representation ability of the CVAE, adds time information to the deep autoencoder, and uses the long-term memory of the LSTM to improve how it learns and handles long-duration anomalies. The trained CVAE is then used to further train the LSTM. In comparison experiments against other deep learning models on three public datasets, the proposed model achieves a higher F1 value than a standalone LSTM and several well-performing deep learning models.

Key words: key performance indicator (KPI) anomaly detection, conditional variational autoencoder (CVAE), long short-term memory (LSTM) network, KPI, deep learning
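
The abstract compares models by F1 value but, in this excerpt, does not state how predicted anomalies are matched against labels. As a minimal sketch, assuming plain point-wise binary labels (1 = anomalous point, 0 = normal point), F1 could be computed as follows. The function name `f1_score` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's evaluation protocol may differ.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Point-wise F1 for binary anomaly labels (1 = anomaly, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction sequences must have equal length")
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No correctly detected anomaly: define F1 as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up labels (not data from the paper).
labels = [0, 0, 1, 1, 0, 0, 1, 0]
predictions = [0, 1, 1, 0, 0, 0, 1, 0]
score = f1_score(labels, predictions)
```

Because F1 is the harmonic mean of precision and recall, a detector cannot score well by flagging everything or by flagging almost nothing, which is presumably why it is used as the comparison metric for rare-anomaly KPI data.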

CLC number: