arXiv is a free distribution service and an open-access archive for 2,310,555 scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.
@rchiveSIC was launched in May 2002. The project was put forward in proposal sic_00000025 at the colloquium "Place et enjeux des revues pour la recherche en infoCom (SFSIC)" on 25 March 2002, and was adopted at the close of the event as a promising way to foster the scientific development of our community. It is part of the worldwide open archives movement, pioneered by Paul Ginsparg in the early 1990s at the "Los Alamos National Laboratory" (http://arxiv.org/). Paul Ginsparg's initiative profoundly shook up the publishing world by opening the possibility of a more differentiated landscape for scientific and technical publishing.
Today, speech technology is only available for a small fraction of the thousands of languages spoken around the world because traditional systems need to be trained on large amounts of annotated speech audio with transcriptions. Obtaining that kind of data for every human language and dialect is almost impossible.
Wav2vec works around this limitation by requiring little to no transcribed data. The model uses self-supervision to learn speech representations from unlabeled audio. This enables speech recognition systems for many more languages and dialects, such as Kyrgyz and Swahili, for which little transcribed speech audio exists. Self-supervision is the key to leveraging unannotated data and building better systems.
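The text above does not spell out the training objective. As a rough illustration of how a model can learn from audio without transcriptions, here is a minimal sketch of a contrastive loss of the general kind used in the wav2vec family: the model is rewarded when its context vector is closer to the true audio representation than to distractors drawn from elsewhere. The function names (`cosine_similarity`, `contrastive_loss`), the `temperature` value, and the toy vectors are all assumptions made for illustration. This is not the wav2vec implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(context, target, distractors, temperature=0.1):
    """Negative log-probability of picking the true target among all candidates.

    Hypothetical helper: `context` stands in for the model's prediction at one
    time step, `target` for the true representation of that step, and
    `distractors` for representations sampled from other time steps.
    """
    candidates = [target] + list(distractors)
    logits = [cosine_similarity(context, c) / temperature for c in candidates]
    # Numerically stable log-sum-exp over all candidate scores.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    # The true target sits at index 0.
    return log_sum_exp - logits[0]


# Toy example: no transcription is involved, only audio-derived vectors.
context = [1.0, 0.0]
target = [0.9, 0.1]
distractors = [[0.0, 1.0], [-1.0, 0.0]]

loss_when_correct = contrastive_loss(context, target, distractors)
loss_when_wrong = contrastive_loss(context, distractors[0], [target, distractors[1]])
print(loss_when_correct < loss_when_wrong)
```

The loss is small when the context vector points toward the true target and large when it points toward a distractor, so minimizing it needs only the audio itself as a training signal.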