Journal of Systems Engineering and Electronics ›› 2013, Vol. 35 ›› Issue (8): 1769-1776. doi: 10.3969/j.issn.1001-506X.2013.08.31

• Software, Algorithms and Simulation •

Cluster’s feature weighting fuzzy clustering algorithm integrating rough sets and shadowed sets

WANG Li-na1,2,3, WANG Jian-dong3, LI Tao1,2, YE Feng3,4

  1. Jiangsu Key Laboratory of Meteorological Observation and Information Processing, Nanjing University of Information Science and Technology, Nanjing 210044, China;
  2. College of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China;
  3. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China;
  4. Computer and Information College, Hohai University, Nanjing 211100, China

  • Online: 2013-08-20 Published: 2010-01-03

Abstract:

Assigning feature weights to each cluster is a common approach in clustering algorithms, and determining the weight values is crucial to generating a valid partition. Within a granular computing framework that incorporates fuzzy sets, rough sets, and shadowed sets, this paper proposes a novel clustering method that computes a separate set of feature weights for each cluster and updates them automatically at every iteration. Using different feature weights for each cluster realizes the clustering objective more effectively. The Davies-Bouldin (DB), Dunn, and Xie-Beni (XB) clustering validity indices are applied to analyze the validity of the partition-based clustering. Comparative experiments on real data sets show that the proposed algorithms always converge, handle overlap among clusters more effectively, and are more robust in the presence of noisy data and outliers.
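To make the idea of per-cluster feature weights concrete, the following is a minimal sketch in Python with NumPy. It is not the algorithm proposed in the paper, which additionally uses rough-set and shadowed-set approximations that the abstract does not specify. It is a plain fuzzy c-means variant in which each cluster keeps its own feature-weight vector that is recomputed at every iteration, followed by the Xie-Beni index as one of the validity measures named above. The function names `weighted_fcm` and `xie_beni`, the weight exponent `q`, the initial-centers argument `V0`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def weighted_fcm(X, V0, m=2.0, q=2.0, n_iter=50):
    """Fuzzy c-means with one feature-weight vector per cluster (illustrative).

    X: (n, d) data; V0: (c, d) initial centers; m > 1: fuzzifier;
    q > 1: exponent on the feature weights (hypothetical parameter).
    Returns memberships U (c, n), centers V (c, d), weights W (c, d).
    U is the membership matrix computed at the start of the last iteration.
    """
    X = np.asarray(X, dtype=float)
    V = np.array(V0, dtype=float)
    c, d = V.shape
    W = np.full((c, d), 1.0 / d)   # start from uniform feature weights
    eps = 1e-12                    # guards against division by zero
    for _ in range(n_iter):
        # Squared per-feature deviations from each center, shape (c, n, d).
        diff2 = (X[None, :, :] - V[:, None, :]) ** 2
        # Feature-weighted squared distances, shape (c, n).
        dist2 = np.einsum("cd,cnd->cn", W ** q, diff2) + eps
        # Standard fuzzy c-means membership update; each column sums to 1.
        inv = dist2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        Um = U ** m
        # Centers are membership-weighted means of the data.
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Fuzzy dispersion of each feature within each cluster, shape (c, d).
        diff2 = (X[None, :, :] - V[:, None, :]) ** 2
        D = np.einsum("cn,cnd->cd", Um, diff2) + eps
        # Features with low dispersion get larger weights; each row sums to 1.
        invD = D ** (-1.0 / (q - 1.0))
        W = invD / invD.sum(axis=1, keepdims=True)
    return U, V, W


def xie_beni(X, U, V, m=2.0):
    """Xie-Beni index (lower is better), using unweighted Euclidean distances."""
    X = np.asarray(X, dtype=float)
    dist2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)        # (c, n)
    compactness = ((U ** m) * dist2).sum()
    center_d2 = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)    # (c, c)
    np.fill_diagonal(center_d2, np.inf)   # ignore each center's distance to itself
    return compactness / (X.shape[0] * center_d2.min())


# Synthetic data: two clusters separated along feature 0; feature 1 is noise.
rng = np.random.default_rng(0)
A = np.column_stack([rng.normal(0.0, 0.3, 100), rng.normal(0.0, 1.0, 100)])
B = np.column_stack([rng.normal(6.0, 0.3, 100), rng.normal(0.0, 1.0, 100)])
X = np.vstack([A, B])

# One initial center taken from each group so the run is deterministic.
U, V, W = weighted_fcm(X, V0=X[[0, -1]])
print("feature weights per cluster:\n", W.round(3))
print("Xie-Beni index:", xie_beni(X, U, V))
```

In this sketch the weight update follows from minimizing the weighted objective under the constraint that each cluster's weights sum to one, so a feature along which a cluster is compact receives a larger weight for that cluster. On the synthetic data, feature 0 should therefore dominate the weights of both clusters.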