Systems Engineering and Electronics (系统工程与电子技术)

• Software, Algorithms and Simulation •

Fast searching clustering centers algorithm based on linear regression analysis

WANG Xing, GUO Pengcheng, WANG Yubing, CHENG Yue

  1. Aeronautics and Astronautics Engineering College, Air Force Engineering University, Xi’an 710038, China
  • Online: 2017-10-25 Published: 2010-01-03

Abstract:

To address the deficiencies of a clustering algorithm based on fast search and find of density peaks, an improved algorithm is proposed that uses linear regression and residual analysis to determine clustering centers automatically and quickly while optimizing the density values of sample points. The algorithm re-measures the density of each point from the information of its nearest neighbors, which improves the positional stability of the clustering centers, and it applies simple (one-variable) linear regression with residual analysis to select the clustering centers quickly and automatically, removing the subjectivity of manual selection. Theoretical analysis and comparative experiments on synthetic and real datasets show that the proposed fast searching clustering centers algorithm based on linear regression analysis overcomes the deficiencies of the original algorithm and outperforms the original algorithm, density based spatial clustering of applications with noise (DBSCAN), and the K-means algorithm in both clustering quality and computation time.
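The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes three things the abstract does not state: that density is taken as the inverse of the mean distance to the k nearest neighbors, that the regression is of each point's distance to its nearest denser point (`delta`) on its density (`rho`), and that centers are the points whose positive residual exceeds `z` residual standard deviations. All function names and the parameters `k` and `z` are hypothetical.

```python
import math
import random

def knn_density(points, k):
    """Assumed density: inverse of the mean distance to the k nearest neighbors."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = []
    for i in range(n):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        rho.append(1.0 / (sum(nearest) / len(nearest) + 1e-12))
    return dist, rho

def nearest_denser(dist, rho):
    """Distance to, and index of, the nearest point with higher density."""
    n = len(rho)
    order = sorted(range(n), key=lambda i: -rho[i])  # descending density
    delta, parent = [0.0] * n, [-1] * n
    delta[order[0]] = max(dist[order[0]])  # densest point: use its largest distance
    for rank in range(1, n):
        i = order[rank]
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i], parent[i] = dist[i][j], j
    return delta, parent, order

def select_centers(rho, delta, z=3.0):
    """Fit delta = slope * rho + intercept by least squares and flag
    points whose positive residual exceeds z residual standard deviations
    (assumed selection rule)."""
    n = len(rho)
    mx, my = sum(rho) / n, sum(delta) / n
    sxx = sum((x - mx) ** 2 for x in rho)
    sxy = sum((x - mx) * (y - my) for x, y in zip(rho, delta))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [y - (slope * x + intercept) for x, y in zip(rho, delta)]
    sigma = math.sqrt(sum(r * r for r in resid) / (n - 2))
    return [i for i, r in enumerate(resid) if r > z * sigma]

def cluster(points, k=5, z=3.0):
    dist, rho = knn_density(points, k)
    delta, parent, order = nearest_denser(dist, rho)
    centers = select_centers(rho, delta, z)
    labels = [-1] * len(points)
    for c, i in enumerate(centers):
        labels[i] = c
    # Visit points in descending density so each parent is labeled first.
    for i in order:
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]
    return labels, centers

# Usage on two well-separated synthetic Gaussian blobs (illustrative data only).
rnd = random.Random(0)
blob = lambda c: [(rnd.gauss(c, 0.3), rnd.gauss(c, 0.3)) for _ in range(50)]
points = blob(0.0) + blob(10.0)
labels, centers = cluster(points)
print(len(centers), len(set(labels)))
```

In this sketch the regression replaces the manual reading of a decision graph: ordinary points lie close to the fitted line, while candidate centers stand out as large positive residuals. The threshold `z` is a tunable assumption rather than a value taken from the paper.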