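Abstract:
To address easily confused object boundaries, inaccurate localization, and unsmooth boundaries in image semantic segmentation, the proposed inverse attention layer and pixel similarity learning layer are introduced into the Deeplab v2 Resnet-101 network to construct a new semantic segmentation network architecture, and loss functions are designed for the inverse attention layer and the pixel similarity learning layer. First, the Deeplab v2 Resnet-101 network extracts semantic features from the image. Then, the proposed inverse attention layer corrects the segmentation results of the prediction network, while the proposed pixel similarity learning layer addresses insufficiently smooth boundaries. Finally, the two segmentation results are fused to obtain the final semantic segmentation. On PASCAL-Context, the method achieves a pixel accuracy of 76.2%, a mean pixel accuracy of 59.7%, and a mean IoU (Intersection over Union) of 49.9%; on the PASCAL Person-Part, NYUDv2, and MIT ADE20K datasets, it achieves mean IoU values of 69.6%, 42.1%, and 44.38%, respectively. Compared with existing mainstream methods, the proposed algorithm improves the accuracy of semantic segmentation, which verifies its effectiveness.
Keywords: image semantic segmentation; inverse attention mechanism; similarity learning; convolutional neural network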
Image Segmentation Based on Inverse Attention Mechanism and Pixel Similarity Learning |
XIANG Tao, QIAO Wensheng, DENG Yongxing, WANG Yanbin
(Southwest China Institute of Electronic Technology, Chengdu 610036, China; Unit 78125 of PLA, Chengdu 610036, China)
Abstract:
To address inaccurate and unsmooth object boundaries in image semantic segmentation, an inverse attention mechanism and pixel similarity learning are incorporated into a new network architecture based on Deeplab v2 Resnet-101, and a loss function is designed for each of the two components. First, Deeplab v2 Resnet-101 is used to extract semantic features. Then, the proposed inverse attention layer is used to correct the segmentation results of the prediction network, and the proposed pixel similarity learning layer is used to smooth the segmentation along boundaries. Finally, the two segmentation results are merged to obtain the final semantic segmentation. The proposed method is evaluated on four datasets. On PASCAL-Context, it achieves a pixel accuracy of 76.2%, a mean pixel accuracy of 59.7%, and a mean intersection over union (IoU) of 49.9%. On PASCAL Person-Part, NYUDv2, and MIT ADE20K, it achieves mean IoU values of 69.6%, 42.1%, and 44.38%, respectively. Compared with existing methods, the proposed method improves the accuracy of semantic segmentation, and the experimental results verify its effectiveness.
Key words: image semantic segmentation; inverse attention mechanism; similarity learning; convolutional neural network
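The three measures reported in the abstract (pixel accuracy, mean pixel accuracy, and mean IoU) are commonly derived from a class confusion matrix. The following is a minimal sketch of that computation, not the authors' evaluation code. The function names `confusion_matrix` and `segmentation_metrics`, the ignore-label handling, and the choice to average only over classes present in the ground truth are assumptions made for illustration; published benchmarks may follow slightly different conventions.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels per (ground-truth class, predicted class) pair.

    Rows index ground-truth classes, columns index predicted classes.
    Ground-truth labels outside [0, num_classes) are skipped, which is an
    assumed way of handling "ignore" labels such as 255.
    Predictions are assumed to lie in [0, num_classes).
    """
    y_true = np.asarray(y_true).ravel().astype(np.int64)
    y_pred = np.asarray(y_pred).ravel().astype(np.int64)
    valid = (y_true >= 0) & (y_true < num_classes)
    flat_index = num_classes * y_true[valid] + y_pred[valid]
    counts = np.bincount(flat_index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(cm):
    """Return (pixel accuracy, mean pixel accuracy, mean IoU)."""
    cm = cm.astype(np.float64)
    true_positive = np.diag(cm)
    gt_pixels = cm.sum(axis=1)      # pixels per ground-truth class
    pred_pixels = cm.sum(axis=0)    # pixels per predicted class
    present = gt_pixels > 0         # assumed: skip classes absent from ground truth

    pixel_acc = true_positive.sum() / cm.sum()
    mean_acc = np.mean(true_positive[present] / gt_pixels[present])
    union = gt_pixels + pred_pixels - true_positive
    mean_iou = np.mean(true_positive[present] / union[present])
    return pixel_acc, mean_acc, mean_iou


if __name__ == "__main__":
    # Toy 2x3 label maps with three classes (illustrative data only).
    truth = [[0, 0, 1], [1, 2, 2]]
    prediction = [[0, 1, 1], [1, 2, 0]]
    cm = confusion_matrix(truth, prediction, num_classes=3)
    pa, ma, miou = segmentation_metrics(cm)
    print(f"pixel acc={pa:.4f}, mean acc={ma:.4f}, mean IoU={miou:.4f}")
```

On the toy label maps, four of six pixels are correct, so pixel accuracy is 4/6; the per-class IoU values are 1/3, 2/3, and 1/2, giving a mean IoU of 0.5. In practice the confusion matrix would be accumulated over all images in a dataset before the measures are computed.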