Systems Engineering and Electronics ›› 2024, Vol. 46 ›› Issue (3): 771-778. doi: 10.12305/j.issn.1001-506X.2024.03.01

• Electronic Technology •

Target detection in sonar images based on variable scale prior frame

Sijia HUANG1,2, Chunfeng SONG3, Xuan LI1,2,*

  1. Key Laboratory of Underwater Vehicle Information Technology, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
  2. College of Electronics, Electrical and Telecommunications Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
  3. Intelligent Sensing and Computing Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2023-04-20 Online: 2024-02-29 Published: 2024-03-08
  • Contact: Xuan LI
  • About the authors: Sijia HUANG (黄思佳, b. 1999), female, master's student; main research interest: image target detection
    Chunfeng SONG (宋纯锋, b. 1989), male, assistant researcher, Ph.D.; main research interests: pattern recognition and computer vision
    Xuan LI (李璇, b. 1983), female, researcher, Ph.D.; main research interests: AUV intelligent signal processing and high-resolution parameter estimation

Abstract:

Target detection in sonar images using deep learning has become an active research topic in recent years. However, sonar images suffer from a concentrated target scale distribution and from the difficulty of data acquisition, so detection performance often falls short of requirements. To address these problems, a target detection method based on variable scale prior frames is proposed. First, because the target scale distribution in sonar images has its own particular characteristics, variable scale prior frames are generated from prior statistics. Second, to mitigate the scarcity of sonar images, data augmentation is used to expand the training set. Finally, model lightweighting is explored: the large-object detection layer is removed, which simplifies the model structure without reducing accuracy. To evaluate the effectiveness of the algorithm, comprehensive experiments are conducted using forward-looking sonar images as an example. The mean average precision (mAP)@0.75 and mAP@0.5:0.95 reach 0.585 and 0.559, respectively, which are 5.8% and 3.1% higher than those of the original Yolov5 network, while the giga floating-point operations (GFLOPs) decrease to 14.9. The results show that the proposed algorithm achieves higher accuracy with a more lightweight model structure.

Key words: sonar image, target detection, data augmentation, scale clustering, lightweight model
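The abstract states that prior frames are generated from prior statistics of target scale, and the key words mention scale clustering, but the exact procedure is not given here. As a rough illustration only, the sketch below uses the k-means clustering of box widths and heights with a 1 - IoU distance that is common in the YOLO family; it is an assumption, not the authors' confirmed method. The function names `wh_iou` and `cluster_prior_frames` and the sample box sizes are hypothetical.

```python
import numpy as np


def wh_iou(boxes, anchors):
    """IoU between (N, 2) box sizes and (K, 2) anchor sizes.

    Only width and height are compared: every box is treated as if it
    shared the same top-left corner with every anchor.
    """
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    area_boxes = boxes[:, 0] * boxes[:, 1]
    area_anchors = anchors[:, 0] * anchors[:, 1]
    return inter / (area_boxes[:, None] + area_anchors[None, :] - inter)


def cluster_prior_frames(boxes, k, iters=100, seed=0):
    """Cluster labelled (width, height) pairs into k prior frames.

    Assumed procedure (not taken from the paper): k-means where each box
    is assigned to the anchor with the highest IoU, and each anchor is
    updated to the median size of its assigned boxes.
    """
    rng = np.random.default_rng(seed)
    boxes = np.asarray(boxes, dtype=float)
    # Initialise anchors from k distinct training boxes.
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = wh_iou(boxes, anchors).argmax(axis=1)
        updated = np.array([
            np.median(boxes[assign == j], axis=0) if np.any(assign == j)
            else anchors[j]  # keep an anchor that received no boxes
            for j in range(k)
        ])
        if np.allclose(updated, anchors):
            break
        anchors = updated
    # Return anchors ordered from smallest to largest area.
    return anchors[np.argsort(anchors.prod(axis=1))]


# Hypothetical box sizes in pixels, concentrated around two scales.
sample_boxes = [[10, 12], [11, 10], [9, 11], [10, 10],
                [50, 80], [52, 78], [48, 82], [50, 79]]
print(cluster_prior_frames(sample_boxes, k=2))
```

Because the anchors follow the measured size distribution, a data set whose targets are concentrated in a narrow scale range would yield prior frames concentrated in that same range, which is the intuition behind tailoring prior frames to sonar imagery.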

CLC Number: