T. Chen, S. Kornblith, M. Norouzi, and G. Hinton. (2020). arXiv:2002.05709. Comment: ICML 2020. Code and pretrained models at https://github.com/google-research/simclr.
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, and 2 other authors. (2020). arXiv:2010.11929. Comment: Fine-tuning code and pre-trained models are available at https://github.com/google-research/vision_transformer. ICLR camera-ready version with 2 small modifications: 1) added a discussion of CLS vs. GAP classifier in the appendix; 2) fixed an error in exaFLOPs computation in Figure 5 and Table 6 (relative performance of models is basically not affected).
P. Dhariwal and A. Nichol. (2021). arXiv:2105.05233. Comment: Added compute requirements, ImageNet 256$\times$256 upsampling FID and samples, DDIM guided sampler, fixed typos.
R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. (2016). arXiv:1610.02391. Comment: This version was published in the International Journal of Computer Vision (IJCV) in 2019; a previous version of the paper was published at the International Conference on Computer Vision (ICCV'17).
S. Xie and Z. Tu. (2015). arXiv:1504.06375. Comment: v2: Add Appendix A for updated results (ODS=0.790) on BSDS-500 in a new experiment setting. Fix typos and reorganize formulations. Add Table 2 to discuss the role of deep supervision. Add links to publicly available repository for code, models and data.
K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick. (2019). arXiv:1911.05722. Comment: CVPR 2020 camera-ready. Code: https://github.com/facebookresearch/moco.
X. Wang, R. Girshick, A. Gupta, and K. He. (2017). arXiv:1711.07971. Comment: CVPR 2018. Code is available at https://github.com/facebookresearch/video-nonlocal-net.