Reinforcement learning algorithm based on information entropy

doi:10.3969/j.issn.1001-506X.2010.05.035

Journal of Systems Engineering and Electronics, 2010, Vol. 32, Issue (5): 1043-1046. doi: 10.3969/j.issn.1001-506X.2010.05.035

ZHAO Yun (赵昀), CHEN Qing-wei (陈庆伟), HU Wei-li (胡维礼)

(School of Automation, Nanjing Univ. of Science and Technology, Nanjing 210094, China)

Online: 2010-05-24 Published: 2010-01-03

Abstract:

To address the problem of balancing exploration and exploitation in reinforcement learning, a reinforcement learning algorithm based on information entropy is proposed. Using the concept of information entropy, the algorithm defines a new state importance measure that quantifies the degree of association between a state and the goal. Based on this measure, an exploration mechanism is designed to adaptively adjust the balance between exploration and exploitation during learning. In addition, by setting a variable threshold on the measure, the state space is pruned autonomously, eventually yielding a suitable, smaller state space, which greatly saves computing resources and speeds up learning. Simulation results show that the proposed algorithm achieves good learning performance.
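The abstract does not give the definition of the state importance measure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a tabular Q-function, takes importance to be one minus the normalized Shannon entropy of a Boltzmann distribution over a state's Q-values, explores with probability one minus importance, and prunes states whose importance falls below a threshold. The names `action_entropy`, `state_importance`, `choose_action`, `prune_states`, and the `temperature` parameter are all hypothetical.

```python
import math
import random


def action_entropy(q_values, temperature=1.0):
    """Shannon entropy (in nats) of a Boltzmann distribution over Q-values."""
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def state_importance(q_values, temperature=1.0):
    """Assumed importance in [0, 1]: 1 minus normalized action entropy.

    This definition is an illustrative assumption, not the paper's measure.
    """
    n = len(q_values)
    if n < 2:
        return 1.0
    normalized = action_entropy(q_values, temperature) / math.log(n)
    return min(1.0, max(0.0, 1.0 - normalized))


def choose_action(q_values, rng, temperature=1.0):
    """Explore with probability 1 - importance, otherwise act greedily."""
    explore_prob = 1.0 - state_importance(q_values, temperature)
    if rng.random() < explore_prob:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def prune_states(q_table, threshold, temperature=1.0):
    """Keep only states whose assumed importance reaches the threshold."""
    return {
        state: q_values
        for state, q_values in q_table.items()
        if state_importance(q_values, temperature) >= threshold
    }
```

In this sketch a state whose Q-values are all equal has importance 0 and is always explored, while a state with one clearly dominant action has importance close to 1 and is mostly exploited. Raising `threshold` over time would shrink the retained state set, loosely mirroring the variable-threshold reduction the abstract describes. Because unvisited states also have equal Q-values, pruning in practice would presumably have to wait until a state has been visited often enough.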

ZHAO Yun, CHEN Qing-wei, HU Wei-li. Reinforcement learning algorithm based on information entropy[J]. Journal of Systems Engineering and Electronics, 2010, 32(5): 1043-1046.
