Systems Engineering and Electronics

Novel partitional clustering algorithm for large data processing

LU Zhi-mao1,2, FENG Jin-mei1,3, FAN Dong-mei2, YANG Peng1, TIAN Ye1,4

  1. Pattern Recognition and Natural Computation Laboratory, Harbin Engineering University, Harbin 150001, China;
  2. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China;
  3. College of Electronics and Information Engineering, Heilongjiang University of Science and Technology, Harbin 150022, China;
  4. School of Physics & Electronic Engineering, Harbin Normal University, Harbin 150025, China
  • Online: 2014-05-22 Published: 2010-01-03

Abstract:

Processing large data sets is an unavoidable problem in Internet of Things research and applications. To overcome the shortcomings of common clustering methods when applied to large data, a novel partitional clustering method is designed. The method first determines the initial positions of the natural cluster centroids by clustering sufficiently large samples, which are drawn repeatedly from the data using a large-data sampling method. It then updates these initial positions using the remaining data, correcting centroids that deviate from their ideal positions. The proposed algorithm has linear space and time complexity. Experimental results show that it not only produces better clustering results than common clustering algorithms but also runs fast, making it suitable for clustering large data.
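
The two-stage idea in the abstract (cluster repeated samples to obtain initial centroids, then correct them with the remaining data) can be illustrated with a minimal Python sketch. The abstract does not give the algorithm's details, so everything specific here is an assumption made for illustration: plain k-means is used to cluster each sample, the sample with the lowest within-sample squared error supplies the initial centroids, and the remaining points correct those centroids in a single running-mean pass. The names `kmeans`, `sample_then_refine`, `sample_size` and `n_samples` are hypothetical and do not come from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on a small in-memory sample (assumed clusterer)."""
    centroids = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k
    for _ in range(iters):
        sums = [[0.0] * len(points[0]) for _ in range(k)]
        counts = [0] * k
        for p in points:
            j = nearest(p, centroids)
            counts[j] += 1
            for d, x in enumerate(p):
                sums[j][d] += x
        for j in range(k):
            if counts[j]:  # keep the old centroid if a cluster is empty
                centroids[j] = [s / counts[j] for s in sums[j]]
    sse = sum(sq_dist(p, centroids[nearest(p, centroids)]) for p in points)
    return centroids, counts, sse


def sample_then_refine(data, k, sample_size, n_samples=5, seed=0):
    """Stage 1: cluster repeated random samples; Stage 2: refine with the rest."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_samples):
        idx = rng.sample(range(len(data)), sample_size)
        result = kmeans([data[i] for i in idx], k, rng)
        # Assumption: keep the sample whose clustering has the lowest error.
        if best is None or result[2] < best[2]:
            best = result + (set(idx),)
    centroids, counts, _, used = best
    # Single pass over the remaining data: assign each point to its nearest
    # centroid and move that centroid with an incremental (running) mean.
    for i, p in enumerate(data):
        if i in used:
            continue
        j = nearest(p, centroids)
        counts[j] += 1
        for d, x in enumerate(p):
            centroids[j][d] += (x - centroids[j][d]) / counts[j]
    return centroids
```

For a fixed sample size, number of samples and number of clusters, this sketch visits each remaining point once, so its running time grows linearly with the number of points, in line with the linear complexity stated in the abstract. Beyond the data itself it stores only the centroids, their counts and the indices of the chosen sample.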
