Journal of Systems Engineering and Electronics ›› 2011, Vol. 33 ›› Issue (6): 1370-1376. doi: 10.3969/j.issn.1001-506X.2011.06.34

• Guidance, Navigation and Control •

Skinner operant conditioning learning model based on genetic algorithm

CAI Jian-xian1,2, RUAN Xiao-gang1

  1. School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124, China;
  2. Institute of Disaster Prevention, Sanhe, Hebei 065201, China
  • Online: 2011-06-20 Published: 2010-01-03

Abstract:

Using probabilistic automata (PA) as a platform and drawing on the evolutionary ideas of the genetic algorithm (GA), this paper constructs a bionic learning model that reflects the essence of Skinner operant conditioning (OC). The model is named the genetic algorithm-operant conditioning probabilistic automaton (GA-OCPA) bionic autonomous learning system. After each learning trial, the learning system first takes the information entropy value obtained by the OC learning algorithm as the fitness of each individual. The genetic algorithm is then performed on these fitness values to search for the optimal individual. Finally, the OC learning algorithm is performed again to learn the optimal operant action within the optimal individual, which yields a new information entropy value. The convergence of the GA-OCPA learning algorithm is analyzed theoretically, and simulations of motion balance control for a two-wheeled robot show that learning in the GA-OCPA system is a process of autonomously acquiring and refining knowledge, with a high degree of adaptive ability.
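The three-step trial described in the abstract (OC learning, entropy as fitness, GA search) can be illustrated with a rough sketch. The abstract does not give the update rule, the encoding of individuals, or the GA operators, so everything below is an assumption made for illustration: each individual is a list of candidate actions, the OC step is a linear reward-inaction update, lower entropy is treated as fitter, and the GA step is reduced to elitism plus Gaussian mutation. All function names are hypothetical and not taken from the paper.

```python
import math
import random

def entropy(probs):
    """Shannon entropy (in nats) of an action-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def oc_update(probs, chosen, rewarded, rate=0.2):
    """Assumed OC step (linear reward-inaction): a rewarded action gains
    probability; an unrewarded one leaves the vector unchanged."""
    if not rewarded:
        return list(probs)
    new = [(1.0 - rate) * p for p in probs]
    new[chosen] += rate  # probabilities still sum to 1
    return new

def oc_learn(actions, probs, reward_fn, rng, steps=30):
    """Run a few OC steps on one individual and return its probabilities."""
    for _ in range(steps):
        chosen = rng.choices(range(len(actions)), weights=probs)[0]
        probs = oc_update(probs, chosen, reward_fn(actions[chosen]))
    return probs

def ga_ocpa_trial(population, reward_fn, rng):
    """One trial: OC learning, entropy as fitness, then a toy GA step."""
    scored = []
    for actions in population:
        uniform = [1.0 / len(actions)] * len(actions)
        probs = oc_learn(actions, uniform, reward_fn, rng)
        scored.append((entropy(probs), actions))
    # Assumption: lower entropy (a more decided individual) is fitter.
    scored.sort(key=lambda pair: pair[0])
    best_entropy, best = scored[0]
    # Toy GA step: keep the best individual, refill with mutated copies.
    children = [[a + rng.gauss(0.0, 0.1) for a in best] for _ in population[1:]]
    return best_entropy, [best] + children
```

A caller would repeat `ga_ocpa_trial` over successive populations and watch the returned entropy; in this toy setting `reward_fn` is any predicate on an action, for example `lambda a: abs(a) < 0.3`.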