Systems Engineering and Electronics ›› 2022, Vol. 44 ›› Issue (2): 410-419.doi: 10.12305/j.issn.1001-506X.2022.02.07

• Electronic Technology •

Target tracking network based on dual-modal interactive fusion under attention mechanism

Yunxiang YAO, Ying CHEN*   

  1. College of Computer Internet of Things, Jiangnan University, Wuxi 214122, China
  • Received:2021-01-28 Online:2022-02-18 Published:2022-02-24
  • Contact: Ying CHEN

Abstract:

Aiming at the challenges that low illumination, motion blur, and fast motion pose to current object tracking, a dual-modal interactive fusion tracking network of infrared and visible images under spatial-channel attention is proposed. First, hierarchical features of the infrared and RGB images are extracted through the first three convolutional layers of the backbone and normalized to the same resolution via dimension reduction. The three layers of features are cascaded to form each modality's feature. The features are then sent to the designed spatial-channel self-attention module and the cross-modal interactive attention module, which make the network focus on global spatial features and high-response channels, thereby improving the complementarity of the dual-modal information. The interacted features of the two modalities are cascaded for fusion and finally sent to three fully connected layers to complete target tracking. Experimental results on the largest RGB-Thermal (RGB-T) tracking dataset, RGBT234, show that the proposed network can effectively extract dual-modal interactive features and improve target tracking accuracy. Its precision and success rate are improved by 5.3% and 4.2%, respectively, compared with the baseline network.
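The fusion pipeline described above (per-modality spatial-channel self-attention, cross-modal interaction, then channel-wise cascading) can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the authors' implementation; all function names, the sigmoid gating choice, and the tensor shapes are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W); global average pooling yields per-channel gates
    w = sigmoid(feat.mean(axis=(1, 2)))          # (C,)
    return feat * w[:, None, None]

def spatial_attention(feat):
    # average over channels yields a per-location gate map
    m = sigmoid(feat.mean(axis=0))               # (H, W)
    return feat * m[None, :, :]

def cross_modal_fuse(rgb_feat, ir_feat):
    # Step 1: each modality attends to itself (spatial-channel self-attention)
    rgb_a = spatial_attention(channel_attention(rgb_feat))
    ir_a = spatial_attention(channel_attention(ir_feat))
    # Step 2: cross-modal interaction - gate each modality with the
    # other's spatial response so complementary regions are emphasized
    rgb_x = rgb_a * sigmoid(ir_a.mean(axis=0))[None, :, :]
    ir_x = ir_a * sigmoid(rgb_a.mean(axis=0))[None, :, :]
    # Step 3: cascade (concatenate along channels) for fusion
    return np.concatenate([rgb_x, ir_x], axis=0)

# Hypothetical layered features, already reduced to the same resolution
rgb = np.random.rand(8, 16, 16)
ir = np.random.rand(8, 16, 16)
fused = cross_modal_fuse(rgb, ir)
print(fused.shape)  # (16, 16, 16)
```

In a real tracker these gates would be learned (e.g. small convolutional or fully connected layers rather than plain pooling), and the fused tensor would feed the three fully connected layers that produce the tracking output.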

Key words: RGB-Thermal (RGB-T), object tracking, deep learning, attention fusion

CLC Number: 
