Systems Engineering and Electronics, 2024, Vol. 46, Issue (5): 1703-1711. doi: 10.12305/j.issn.1001-506X.2024.05.23

• Systems Engineering •

Full process parallel genetic algorithm for Bayesian network structure learning

Yiming CAI, Li MA, Hengyang LU, Wei FANG

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
  • Received: 2023-03-21  Online: 2024-04-30  Published: 2024-04-30
  • Contact: Wei FANG

Abstract:

To address the performance degradation of Bayesian network (BN) structure learning on massive data, a full process parallel genetic algorithm (GA) for BN structure learning, SparkGA-BN, is proposed based on the Spark framework. SparkGA-BN consists of three parts: parallel computation of mutual information, parallelization of the genetic operators, and parallelization of fitness evaluation. Mutual information is computed in parallel to reduce the search space. Before evolution, population information and selection information are broadcast so that the selection operation can be performed on the entire population. The selection and crossover operators share the selection information, which makes evolution more efficient and reduces disk write time. Intermediate data generated during the constraint and scoring stages are kept in memory to improve data reuse and overall execution efficiency. Experimental results show that the proposed algorithm outperforms the comparison algorithms in both execution efficiency and learning accuracy.
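
As an illustration of how mutual information can reduce the search space, the following minimal sketch scores every pair of discrete variables and keeps only the pairs whose mutual information exceeds a threshold. It is not the SparkGA-BN implementation: the function names, the threshold, and the single-machine `map` over variable pairs are assumptions made for this example. In a Spark setting, the same per-pair computation could plausibly be distributed across workers.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two equal-length discrete samples."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # I(X;Y) = sum over (x, y) of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum(
        (c / n) * log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def candidate_edges(columns, threshold):
    """Return the variable pairs whose mutual information exceeds `threshold`.

    `columns` maps a variable name to its list of observed values.
    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    pairs = combinations(sorted(columns), 2)
    # Each pair is scored independently, so this map is the step that
    # a parallel framework could spread across workers.
    scored = map(
        lambda p: (p, mutual_information(columns[p[0]], columns[p[1]])),
        pairs,
    )
    return {pair for pair, mi in scored if mi > threshold}


data = {
    "A": [0, 0, 1, 1],
    "B": [0, 0, 1, 1],  # identical to A, so strongly dependent
    "C": [0, 1, 0, 1],  # independent of A and B in this sample
}
edges = candidate_edges(data, threshold=0.1)
```

Here only the pair `("A", "B")` survives the filter, so a structure search would consider an edge between A and B but skip the pairs involving C.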

Key words: Bayesian network (BN), structure learning, genetic algorithm (GA), parallel structure learning, Spark
