Journal of Systems Engineering and Electronics, 2013, Vol. 35, Issue (1): 207-211. doi: 10.3969/j.issn.1001-506X.2013.01.35

• Software, Algorithm and Simulation •

Approximate algorithm of interactive dynamic influence diagrams based on KL distance

TIAN Le, LUO Jian, CAO Lang-cai, CHEN Zhi-ping

  1. School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005, China
  • Online: 2013-01-23 Published: 2010-01-03

Abstract:

The model space of interactive dynamic influence diagrams (I-DIDs) is very large, and the number of candidate models grows exponentially with the number of time steps. To reduce this computational cost, an approximate method for solving I-DIDs is proposed that combines the principle of approximate behavioral equivalence with the discriminative model update (DMU) algorithm. First, behavioral equivalence and approximate behavioral equivalence of models are defined in terms of the Kullback-Leibler (KL) distance. Then the candidate models are clustered according to the KL distance and their actions, and the policy trees are merged top-down into policy graphs. Finally, the I-DID is solved with the DMU algorithm. Simulation results show that, compared with the traditional DMU algorithm, the proposed approximate algorithm significantly reduces the number of candidate models and improves the efficiency of solving I-DIDs. The work may serve as a reference for theoretical and applied research on I-DIDs.
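
The clustering step can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption: each candidate model is summarized by a hypothetical triple of name, first action and a discrete distribution, the names `kl_distance` and `cluster_models` are illustrative, and a simple greedy threshold rule stands in for whatever clustering procedure the authors use. Two models fall into the same cluster only if they prescribe the same action and their KL distance is within a threshold.

```python
import math


def kl_distance(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    to avoid division by zero (an assumption of this sketch).
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def cluster_models(models, threshold):
    """Greedy clustering of candidate models.

    models: list of (name, action, distribution) triples (hypothetical format).
    A model joins the first cluster whose representative (its first member)
    has the same action and lies within `threshold` in KL distance;
    otherwise it starts a new cluster.
    """
    clusters = []
    for model in models:
        _, action, dist = model
        for cluster in clusters:
            _, rep_action, rep_dist = cluster[0]
            if action == rep_action and kl_distance(dist, rep_dist) <= threshold:
                cluster.append(model)
                break
        else:
            # No compatible cluster found: this model becomes a representative.
            clusters.append([model])
    return clusters


# Illustrative data only; not taken from the paper.
models = [
    ("m1", "listen", [0.50, 0.50]),
    ("m2", "listen", [0.52, 0.48]),
    ("m3", "open_left", [0.90, 0.10]),
    ("m4", "listen", [0.95, 0.05]),
]
clusters = cluster_models(models, threshold=0.01)
representatives = [cluster[0][0] for cluster in clusters]
print(representatives)
```

In this toy input, `m2` is close enough to `m1` to be absorbed, while `m3` differs in action and `m4` differs in distribution, so only one representative per cluster would need to be carried forward. Note that the KL distance is not symmetric, so the order of the arguments is a design choice of the sketch.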