Systems Engineering and Electronics ›› 2024, Vol. 46 ›› Issue (6): 1867-1877. doi: 10.12305/j.issn.1001-506X.2024.06.05

• Electronic Technology •

Object grasp pose detection based on the region of interest

Xiantao SUN1, Wangyang JIANG1, Wenjie CHEN1,*, Weihai CHEN2, Yali ZHI1   

    1. School of Electrical Engineering and Automation, Anhui University, Hefei 230601, China
    2. School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
  • Received: 2023-03-02 Online: 2024-05-25 Published: 2024-06-04
  • Contact: Wenjie CHEN

Abstract:

In industrial production, the objects to be grasped are often of various types, cluttered, and irregularly shaped, which makes it difficult to obtain an accurate grasp pose. To address these problems, this paper proposes a two-stage grasp pose estimation method based on deep learning. In the first stage, a lightweight rotated object detection algorithm based on an improved You Only Look Once version 4 (YOLOv4) is proposed to increase both the speed and the accuracy of target detection. First, the lightweight network GhostNet and depthwise separable convolution are used to rebuild the original network and reduce the number of parameters in the whole model. Then, an adaptive spatial feature fusion structure and a parameter-free attention module are added to the neck network to improve the localization accuracy of the region of interest. Finally, an approximate skew intersection over union (SkewIoU) loss is used to handle the periodicity of the angle. In the second stage, a mask of the same size as the original image is created to extract the region of interest, and an improved DeepLabV3+ algorithm is proposed to detect the grasp pose of the object within that region. Experimental results show that the improved YOLOv4 network reaches a detection accuracy of 92.5%, and that the improved DeepLabV3+ algorithm achieves accuracies of 94.6% and 92.4% on the image-wise and object-wise splits of the Cornell grasp dataset, respectively, so the method can accurately detect the grasp pose of objects.
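
The hand-off between the two stages (a mask of the same size as the original image that keeps only the detected region) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `rotated_box_mask` and `extract_roi`, the box parameterization (center, width, height, angle in degrees), and the angle convention are all assumptions made for illustration only.

```python
import numpy as np

def rotated_box_mask(height, width, cx, cy, w, h, angle_deg):
    """Boolean mask of shape (height, width), True inside a rotated box.

    Assumed (hypothetical) box format: center (cx, cy), size (w, h),
    rotation angle_deg about the center, in image coordinates.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    theta = np.deg2rad(angle_deg)
    dx = xs - cx
    dy = ys - cy
    # Project each pixel offset onto the box's own (rotated) axes.
    u = dx * np.cos(theta) + dy * np.sin(theta)
    v = -dx * np.sin(theta) + dy * np.cos(theta)
    return (np.abs(u) <= w / 2) & (np.abs(v) <= h / 2)

def extract_roi(image, mask):
    """Keep pixels inside the mask and zero the rest; shape is unchanged."""
    out = np.zeros_like(image)
    out[mask] = image[mask]
    return out

# Toy example: a white 100x120 RGB image and one detected rotated box.
image = np.full((100, 120, 3), 255, dtype=np.uint8)
mask = rotated_box_mask(100, 120, cx=60, cy=50, w=40, h=20, angle_deg=30)
roi = extract_roi(image, mask)
```

Because `roi` has the same shape as `image`, pixel coordinates predicted by a later segmentation step would map directly back to the original picture without any offset correction, which is one plausible reason for keeping the mask at full image size.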

Key words: deep learning, mask, region of interest, lightweight network, pose detection

