Systems Engineering and Electronics

• Software, Algorithms and Simulation •

Data-weighted fuzzy C-means clustering algorithm

ZHOU Shi-bo (周世波)1,2, XU Wei-xiang (徐维祥)1, CHAI Tian (柴田)2

  1. School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China;
  2. Navigation College, Jimei University, Xiamen, Fujian 361021, China
  • Online: 2014-11-03 Published: 2010-01-03

Abstract:

The fuzzy C-means (FCM) clustering algorithm does not account for the influence of noisy sample points or of the distribution of the sample data on the clustering result. To address this shortcoming, FCM is improved with a data weighting strategy. The improved algorithm computes a density value for each sample point, restricts the initial cluster centers to regions of high-density sample points, and uses each point's density value as its weight when adjusting the cluster centers, so that high-density points carry more influence in the center update and the clustering result improves. Experiments on synthetic data sets and on real data sets from the University of California, Irvine (UCI) show that, compared with FCM, the data-weighted FCM algorithm achieves higher clustering accuracy without increasing the time complexity.
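The abstract describes three steps: compute a density per sample, seed the cluster centers in high-density regions, and weight each sample by its density in the center update. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not give the density formula or the seeding rule, so the Gaussian-kernel density, the greedy "density times distance" seeding, and all names (`weighted_fcm`, `densities`, `init_centers`, `sigma`) are assumptions made for illustration. Only the membership update is the standard FCM formula.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def densities(points, sigma):
    # ASSUMPTION: density is a Gaussian-kernel sum over all samples
    # (including the point itself); the paper may define it differently.
    return [sum(math.exp(-sq_dist(p, q) / (2 * sigma ** 2)) for q in points)
            for p in points]


def init_centers(points, rho, c):
    # ASSUMPTION: greedy seeding. Start at the densest sample, then pick
    # samples that are both dense and far from the centers chosen so far.
    idx = range(len(points))
    centers = [points[max(idx, key=rho.__getitem__)]]
    while len(centers) < c:
        def score(i):
            return rho[i] * min(sq_dist(points[i], v) for v in centers)
        centers.append(points[max(idx, key=score)])
    return centers


def memberships(points, centers, m):
    # Standard FCM membership: u_ki proportional to d_ki^(-2/(m-1)).
    u = []
    for p in points:
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1)) for v in centers]
        total = sum(inv)
        u.append([x / total for x in inv])
    return u


def weighted_fcm(points, c, m=2.0, sigma=1.0, iters=100, tol=1e-6):
    """Return (centers, memberships) for a density-weighted FCM sketch."""
    rho = densities(points, sigma)
    centers = init_centers(points, rho, c)
    dims = range(len(points[0]))
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            # Each sample contributes with weight rho_k * u_ki^m, so
            # low-density (likely noisy) samples move the center less.
            coef = [rho[k] * u[k][i] ** m for k in range(len(points))]
            total = sum(coef)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(coef, points)) / total
                for d in dims))
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)
```

Calling `weighted_fcm(points, c=2)` on a list of coordinate tuples returns the final centers and the membership matrix. Setting every weight in `rho` to 1 in the center update recovers the plain FCM update, which is why the per-iteration cost is the same as for FCM. The one-off density computation in this sketch is quadratic in the number of samples.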