Systems Engineering and Electronics, 2023, Vol. 45, Issue (6): 1755-1761. doi: 10.12305/j.issn.1001-506X.2023.06.19

• Systems Engineering •

Reinforcement learning technology based on asymmetric unobservable state

Xinzhi LI1,*, Shengbo DONG1, Xiangyang CUI2   

  1. Beijing Institute of Remote Sensing Equipment, Beijing 100854, China
  2. State Key Laboratory of Communication Content Cognition, Beijing 100733, China
  • Received: 2021-09-03  Online: 2023-05-25  Published: 2023-06-01
  • Contact: Xinzhi LI

Abstract:

In real dynamic game scenarios, adversaries operate under unequal information, diverse working mechanisms, and different rules. Existing reinforcement learning algorithms, however, fit approximate models by assuming that the state is fully or partially observable. When the opponent's state information cannot be obtained accurately, or cannot be obtained at all, these observability assumptions no longer hold, so existing reinforcement learning models cannot be applied directly. To solve this problem, a new reinforcement learning framework based on asymmetric unobservable states is proposed. Under this framework, agents achieve online learning based only on value feedback. To verify the feasibility and versatility of the proposed framework, three typical reinforcement learning algorithms are transplanted into it, and a game confrontation model is built for comparative verification. The results show that all three algorithms can be successfully applied to dynamic game environments with unobservable states and that their convergence speed is greatly improved, which demonstrates the feasibility and versatility of the proposed framework.
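For illustration, the sketch below shows one way an agent could learn online from scalar value feedback alone, without ever observing the adversary's state, as the framework requires. Tabular Q-learning conditioned only on the agent's own observable state is assumed here for concreteness; the abstract does not name the three transplanted algorithms, and the class name, update rule, and all parameters are illustrative assumptions, not the paper's implementation.

    import random
    from collections import defaultdict

    class ValueFeedbackAgent:
        """Hypothetical sketch: an agent that learns online from scalar
        value feedback only, never observing the opponent's state."""

        def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
            self.actions = actions      # the agent's own action set
            self.alpha = alpha          # learning rate (assumed value)
            self.gamma = gamma          # discount factor (assumed value)
            self.epsilon = epsilon      # exploration rate (assumed value)
            # Q-values are keyed only on the agent's OWN observable state,
            # since the opponent's state is assumed unobservable.
            self.q = defaultdict(float)

        def act(self, own_state):
            # Epsilon-greedy choice over the agent's own state only.
            if random.random() < self.epsilon:
                return random.choice(self.actions)
            return max(self.actions, key=lambda a: self.q[(own_state, a)])

        def update(self, own_state, action, value_feedback, next_own_state):
            # Online update driven purely by the scalar value feedback
            # returned by the environment after the joint interaction.
            best_next = max(self.q[(next_own_state, a)] for a in self.actions)
            target = value_feedback + self.gamma * best_next
            self.q[(own_state, action)] += self.alpha * (target - self.q[(own_state, action)])

In this sketch the opponent's state never enters the value table; only the feedback signal carries information about the adversary's influence, which is what allows the same wrapper to host different underlying learning algorithms.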

Key words: reinforcement learning, dynamic game, asymmetric unobservable state

CLC Number: 
