Systems Engineering and Electronics ›› 2019, Vol. 41 ›› Issue (5): 964-971. doi: 10.3969/j.issn.1001-506X.2019.05.05

• Electronic Technology •

A nonparametric method for constant false alarm rate detection of anomalous data

张一迪, 王培志, 陆起涌, 张建秋   

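  1. Research Center of Smart Networks and Systems, Department of Electronics Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433
  • Publication date: 2019-04-30 Release date: 2019-04-26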

Nonparametric detection of anomalous data with given constant false alarm rate

ZHANG Yidi, WANG Peizhi, LU Qiyong, ZHANG Jianqiu   

  1. Research Center of Smart Networks and Systems, Department of Electronics Engineering, School of
    Information Science and Technology, Fudan University, Shanghai 200433, China
  • Online: 2019-04-30 Published: 2019-04-26

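Abstract (Chinese version):

For the detection of anomalous data and/or data sequences, a nonparametric method that detects anomalies at a constant false alarm rate is developed from the anomaly detection algorithm based on the maximum mean discrepancy in a reproducing kernel Hilbert space. The maximum mean discrepancy obtained from normal data is described as a statistical distribution. Analysis shows that a Neyman-Pearson hypothesis test can use this distribution to test for anomalies, and that bootstrap resampling or the expectation-maximization algorithm can estimate the statistical distribution of the normal data or data sequences. Although the decision threshold required by the anomaly hypothesis test at a given false alarm rate can be computed from the estimated distribution, Monte Carlo integration can be used to simplify this computation. Numerical simulation results verify the effectiveness of the proposed method and show that it outperforms the methods reported in the literature.

Key words (Chinese version): maximum mean discrepancy, constant false alarm rate, anomaly detection, bootstrap resampling, expectation-maximization algorithm, Monte Carlo method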

Abstract:

To detect outliers and/or anomalous data streams, a nonparametric method with a given constant false alarm rate is proposed, based on the maximum mean discrepancy and the mean embedding of distributions in a reproducing kernel Hilbert space. The maximum mean discrepancy computed from normal data streams is first described by a probability distribution. Analysis then shows that a Neyman-Pearson hypothesis test based on this distribution can be used to test for anomalies at the given constant false alarm rate. It is also shown that the bootstrap resampling technique and/or the expectation-maximization algorithm can estimate the distribution of the normal data streams. The decision threshold that the test requires at the given false alarm rate could be computed directly from the estimated distribution, but the Monte Carlo method yields it more conveniently and avoids that complex calculation. Numerical simulation results verify the effectiveness of the proposed method and its superiority over other reported methods.

Key words: maximum mean discrepancy (MMD), constant false alarm rate (CFAR), anomaly detection, bootstrap resampling, expectation-maximization (EM) algorithm, Monte-Carlo method
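
The thresholding idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, its bandwidth `sigma`, the biased MMD estimator, the window size, and the function names `mmd2`, `bootstrap_threshold`, and `is_anomalous` are all assumptions made for illustration. The sketch covers only the bootstrap route, taking the decision threshold as an empirical quantile of resampled MMD values; the expectation-maximization and Monte Carlo integration steps mentioned in the abstract are not reproduced.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between two sample arrays (assumed kernel choice)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(x.shape[0], -1)  # treat 1-D input as n samples of dimension 1
    y = y.reshape(y.shape[0], -1)
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def mmd2(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())


def bootstrap_threshold(normal, window, alpha=0.05, n_boot=500, sigma=1.0, rng=None):
    """Estimate the (1 - alpha) quantile of MMD^2 under the 'normal' hypothesis.

    Windows of length `window` are resampled with replacement from the normal
    data and compared against the full normal set; the empirical quantile of
    the resulting statistics serves as the threshold for false alarm rate alpha.
    """
    rng = np.random.default_rng(rng)
    normal = np.asarray(normal, dtype=float)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, normal.shape[0], size=window)
        stats[b] = mmd2(normal[idx], normal, sigma)
    return np.quantile(stats, 1.0 - alpha)


def is_anomalous(test_window, normal, threshold, sigma=1.0):
    """Declare an anomaly when the window's MMD^2 to the normal data exceeds the threshold."""
    return mmd2(test_window, normal, sigma) > threshold
```

Under these assumptions, a caller would estimate the threshold once from data known to be normal, then apply `is_anomalous` to each incoming window of the same length. Because the bootstrap windows are drawn from the reference set itself, the achieved false alarm rate only approximates `alpha`.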