Systems Engineering and Electronics ›› 2021, Vol. 43 ›› Issue (5): 1210-1217.doi: 10.12305/j.issn.1001-506X.2021.05.07

• Sensors and Signal Processing •

Depth estimation method based on monocular infrared image in VDAS

Xu LI1, Meng DING1, Donghui WEI2,*, Xiaozhou WU1, Yunfeng CAO3

  1. College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
    2. Science and Technology on Complex System Control and Intelligent Agent Cooperation Laboratory, Beijing Electro-Mechanical Engineering Institute, Beijing 100074, China
    3. College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Received:2020-08-03 Online:2021-05-01 Published:2021-04-27
  • Contact: Donghui WEI E-mail:lixunuaa@nuaa.edu.cn;nuaa_dm@nuaa.edu.cn;weidonghui2652@sina.com;wuxiaozhou@nuaa.edu.cn;cyfac@nuaa.edu.cn

Abstract:

To meet the demand of vision-based driving assistance systems (VDAS) for forward-looking depth perception in low-visibility scenes, a deep learning-based depth estimation method for monocular infrared images is proposed. The method adopts an end-to-end multi-task self-supervised learning framework and constructs its loss function from the stereo geometric constraints between monocular infrared video frames, so the ground-truth depth of the scene is not needed. Taking the per-pixel minimum of the reprojection errors with respect to the two adjacent frames alleviates the occlusion problem and weakens the influence of infrared image noise. The network decoder upsamples the multi-scale depth maps to higher resolutions before computing the reprojection error, which avoids holes in the depth map. Qualitative experiments on the FLIR infrared dataset show that the proposed method can obtain pixel-level dense depth from monocular infrared images. Experiments on real roads show that the proposed method can effectively perceive the depth of targets at night, with an absolute error of 13.2% within 15 m, which meets the collision-avoidance requirements of most emergency situations.
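The per-pixel minimum reprojection error described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, a plain L1 photometric error stands in for the full loss (self-supervised methods of this kind typically combine SSIM and L1), and small random arrays stand in for warped infrared frames. The key idea shown is that an occluded pixel usually reprojects badly from only one of the two adjacent frames, so the per-pixel minimum suppresses its contribution.

```python
import numpy as np

def photometric_error(target, warped):
    # Simple per-pixel L1 error, averaged over channels.
    # (A full loss would typically mix SSIM and L1.)
    return np.abs(target - warped).mean(axis=-1)

def min_reprojection_error(target, warped_prev, warped_next):
    # Per-pixel minimum over the two source frames: pixels that are
    # occluded in one frame keep the (lower) error from the other,
    # instead of averaging in a large spurious error.
    e_prev = photometric_error(target, warped_prev)
    e_next = photometric_error(target, warped_next)
    return np.minimum(e_prev, e_next)

# Toy example: 4x4 single-channel "frames".
rng = np.random.default_rng(0)
target = rng.random((4, 4, 1))
warped_prev = target + 0.1            # well-warped previous frame
warped_next = rng.random((4, 4, 1))   # badly warped frame (e.g. occlusion)
loss_map = min_reprojection_error(target, warped_prev, warped_next)
print(loss_map.shape)  # (4, 4)
```

Because `warped_prev` differs from `target` by exactly 0.1 everywhere, no pixel of the resulting loss map exceeds 0.1, regardless of how badly `warped_next` reprojects.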

Key words: infrared image, depth estimation, convolutional neural network, self-supervised learning
