Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast
Y. Du, Z. Fu, Q. Liu, and Y. Wang (2021). arXiv:2110.07110. Comment: 10 pages, 5 figures. Accepted by CVPR'22.
Abstract
Though image-level weakly supervised semantic segmentation (WSSS) has
achieved great progress with Class Activation Maps (CAMs) as the cornerstone,
the large supervision gap between classification and segmentation still hampers
the model's ability to generate complete and precise pseudo masks for segmentation.
In this study, we propose weakly supervised pixel-to-prototype contrast, which
provides pixel-level supervisory signals to narrow this gap. Guided by two
intuitive priors, our method operates both across different views and within
each single view of an image, imposing cross-view feature semantic
consistency regularization and encouraging intra-class compactness and
inter-class dispersion in the feature space. Our method can be seamlessly
incorporated into existing WSSS models without any changes to the base networks
and incurs no extra inference burden. Extensive experiments show
that our method consistently improves two strong baselines by large margins,
demonstrating its effectiveness. Specifically, built on top of SEAM, we improve
the initial seed mIoU on PASCAL VOC 2012 from 55.4% to 61.5%. Moreover, armed
with our method, we increase the segmentation mIoU of EPS from 70.8% to 73.6%,
achieving a new state of the art.
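The pixel-to-prototype contrast the abstract describes can be understood as an InfoNCE-style loss between pixel embeddings and class prototypes, where a prototype is the mean feature of pixels sharing a (pseudo-)label. Below is a minimal pure-Python sketch of that idea; the function names, the toy feature dimensions, and the temperature value are illustrative assumptions, not the authors' implementation, which also involves cross-view terms and CAM-based pixel selection.

```python
import math

def l2_normalize(v):
    # Normalize a feature vector so similarities are cosine similarities.
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def class_prototypes(features, labels, num_classes):
    # Prototype of class c = L2-normalized mean of features pseudo-labeled c.
    protos = []
    dim = len(features[0])
    for c in range(num_classes):
        members = [f for f, y in zip(features, labels) if y == c]
        if not members:  # guard: class absent from this batch
            protos.append([0.0] * dim)
            continue
        mean = [sum(f[i] for f in members) / len(members) for i in range(dim)]
        protos.append(l2_normalize(mean))
    return protos

def pixel_to_prototype_loss(features, labels, protos, tau=0.1):
    # InfoNCE over prototypes: pull each pixel toward its own class
    # prototype, push it away from the prototypes of other classes.
    total = 0.0
    for f, y in zip(features, labels):
        f = l2_normalize(f)
        sims = [sum(a * b for a, b in zip(f, p)) / tau for p in protos]
        m = max(sims)  # stabilize log-sum-exp
        logsumexp = m + math.log(sum(math.exp(s - m) for s in sims))
        total += logsumexp - sims[y]  # -log softmax at the true class
    return total / len(features)
```

With well-separated toy features, the loss is low when pixels are matched to their own class prototype and high when the labels are swapped, which is the intra-class compactness / inter-class dispersion behavior the abstract refers to.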
@misc{du2021weakly,
abstract = {Though image-level weakly supervised semantic segmentation (WSSS) has
achieved great progress with Class Activation Maps (CAMs) as the cornerstone,
the large supervision gap between classification and segmentation still hampers
the model's ability to generate complete and precise pseudo masks for segmentation.
In this study, we propose weakly supervised pixel-to-prototype contrast, which
provides pixel-level supervisory signals to narrow this gap. Guided by two
intuitive priors, our method operates both across different views and within
each single view of an image, imposing cross-view feature semantic
consistency regularization and encouraging intra-class compactness and
inter-class dispersion in the feature space. Our method can be seamlessly
incorporated into existing WSSS models without any changes to the base networks
and incurs no extra inference burden. Extensive experiments show
that our method consistently improves two strong baselines by large margins,
demonstrating its effectiveness. Specifically, built on top of SEAM, we improve
the initial seed mIoU on PASCAL VOC 2012 from 55.4% to 61.5%. Moreover, armed
with our method, we increase the segmentation mIoU of EPS from 70.8% to 73.6%,
achieving a new state of the art.},
added-at = {2022-07-17T16:31:56.000+0200},
author = {Du, Ye and Fu, Zehua and Liu, Qingjie and Wang, Yunhong},
biburl = {https://www.bibsonomy.org/bibtex/2c3dd7b083aaf53fda73533a052901bb8/redtedtezza},
description = {Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast},
interhash = {f936d1eb4c69f6a4a46a2a1c962c7e8f},
intrahash = {c3dd7b083aaf53fda73533a052901bb8},
keywords = {segmentation},
note = {arXiv:2110.07110. Comment: 10 pages, 5 figures. Accepted by CVPR'22},
timestamp = {2022-07-17T16:31:56.000+0200},
title = {Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast},
url = {http://arxiv.org/abs/2110.07110},
year = 2021
}