Journal of Systems Engineering and Electronics ›› 2009, Vol. 31 ›› Issue (5): 1235-1240.

• 软件、算法与仿真 • 上一篇    下一篇

在线挖掘数据流滑动窗口中频繁闭项集

敖富江1, 杜静2, 颜跃进2, 黄柯棣1   

  1. 1. 国防科学技术大学机电工程与自动化学院, 湖南, 长沙, 410073;
    2. 国防科学技术大学计算机学院, 湖南, 长沙, 410073
  • 收稿日期:2008-01-17 修回日期:2008-06-24 出版日期:2009-05-20 发布日期:2010-01-03
  • 作者简介:敖富江(1975- ),男,博士研究生,主要研究方向为数据仓库,数据挖掘和复杂大系统仿真.E-mail:fjao@nudt.edu.cn
  • 基金资助:
    国家自然科学基金资助课题(60704038)

Online mining closed frequent itemsets over a stream sliding window

AO Fu-jiang1, DU Jing2, YAN Yue-jin2, HUANG Ke-di1   

  1. 1. Coll. of Mechanical Engineering and Automation, National Univ. of Defense Technology, Changsha 410073, China;
    2. School of Computer Science, National Univ. of Defense Technology, Changsha 410073, China
  • Received:2008-01-17 Revised:2008-06-24 Online:2009-05-20 Published:2010-01-03

摘要: 在线挖掘滑动窗口中的频繁闭项集是一类重要的数据流挖掘问题.提出了一种新的频繁闭项集挖掘算法FPCFI-DS.该算法能够在有限的存储空间中高速挖掘数据流滑动窗口中的频繁闭项集,并且能够在任意时刻维护当前窗口中精确的频繁闭项集.对于第一个窗口中的数据,FPCFI-DS算法采用单遍过程FPCFI进行挖掘,挖掘结果被保存于一棵全局闭项集树GCT中.当窗口向前滑动时,FPCFI-DS算法采用更新挖掘方式快速挖掘出当前窗口中的频繁闭项集.实验结果表明,FPCFI-DS算法的空间效率和时间效率都显著优于同类经典算法Moment.

Abstract: Online mining closed frequent itemsets in sliding window is one of the most important issues for mining data streams.A novel algorithm,FPCFI-DS,is proposed,which can efficiently mine closed frequent itemsets over a stream sliding window with limited memory space,and maintain exact closed frequent itemsets in current window at any time.For data in the first window,the algorithm FPCFI-DS mines closed frequent itemsets using single-pass procedure,denoted as FPCFI.The resulting closed frequent itemsets are stored in a global closed frequent itemsets tree(GCT).When the window slides forward,the FPCFI-DS quickly updates closed frequent itemsets in current window using the updating-mining method.The experimental results show that FPCFI-DS is superior to that of state-of-the-art algorithm Moment in terms of time and space efficiency.

中图分类号: