Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
A. Baevski, A. Babu, W.-N. Hsu, and M. Auli. https://ai.facebook.com/blog/ai-self-supervised-learning-data2vec/, 2022. arXiv:2212.07525.
Abstract
Current self-supervised learning algorithms are often modality-specific and
require large amounts of computational resources. To address these issues, we
increase the training efficiency of data2vec, a learning objective that
generalizes across several modalities. We do not encode masked tokens, use a
fast convolutional decoder and amortize the effort to build teacher
representations. data2vec 2.0 benefits from the rich contextualized target
representations introduced in data2vec which enable a fast self-supervised
learner. Experiments on ImageNet-1K image classification show that data2vec 2.0
matches the accuracy of Masked Autoencoders in 16.4x lower pre-training time,
on Librispeech speech recognition it performs as well as wav2vec 2.0 in 10.6x
less time, and on GLUE natural language understanding it matches a retrained
RoBERTa model in half the time. Trading some speed for accuracy results in
ImageNet-1K top-1 accuracy of 86.8% with a ViT-L model trained for 150 epochs.
@misc{baevski2022efficient,
abstract = {Current self-supervised learning algorithms are often modality-specific and
require large amounts of computational resources. To address these issues, we
increase the training efficiency of data2vec, a learning objective that
generalizes across several modalities. We do not encode masked tokens, use a
fast convolutional decoder and amortize the effort to build teacher
representations. data2vec 2.0 benefits from the rich contextualized target
representations introduced in data2vec which enable a fast self-supervised
learner. Experiments on ImageNet-1K image classification show that data2vec 2.0
matches the accuracy of Masked Autoencoders in 16.4x lower pre-training time,
on Librispeech speech recognition it performs as well as wav2vec 2.0 in 10.6x
less time, and on GLUE natural language understanding it matches a retrained
RoBERTa model in half the time. Trading some speed for accuracy results in
ImageNet-1K top-1 accuracy of 86.8\% with a ViT-L model trained for 150 epochs.},
author = {Baevski, Alexei and Babu, Arun and Hsu, Wei-Ning and Auli, Michael},
biburl = {https://www.bibsonomy.org/bibtex/2103d4d889843ad70057e3e919966c8c8/martinr},
description = {Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language},
howpublished = {https://ai.facebook.com/blog/ai-self-supervised-learning-data2vec/},
keywords = {data2vec dmir-readinggroup},
note = {arXiv:2212.07525},
title = {Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language},
url = {http://arxiv.org/abs/2212.07525},
year = 2022
}