Author of the publication

Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations.

CVPR Workshops, pages 4578-4587. IEEE, (2022)

Other publications of authors with the same name

AXES at TRECVID 2012: KIS, INS, and MED. TRECVID, National Institute of Standards and Technology (NIST), (2012)

The SpeeD-ZevoTech submission at DISPLACE 2023. INTERSPEECH, pages 3572-3576. ISCA, (2023)

YFACC: A Yorùbá Speech-Image Dataset for Cross-Lingual Keyword Localisation Through Visual Grounding. SLT, pages 731-738. IEEE, (2022)

The INRIA-LIM-VocR and AXES submissions to TrecVid 2014 Multimedia Event Detection. TRECVID, National Institute of Standards and Technology (NIST), (2014)

Data-Filtering Methods for Self-Training of Automatic Speech Recognition Systems. SLT, pages 141-147. IEEE, (2021)

Revisiting SincNet: An Evaluation of Feature and Network Hyperparameters for Speaker Recognition. EUSIPCO, pages 1-5. IEEE, (2020)

Speaker disentanglement in video-to-speech conversion. EUSIPCO, pages 46-50. IEEE, (2021)

Robust and efficient models for action recognition and localization. (Modèles robustes et efficaces pour la reconnaissance d'action et leur localisation). Grenoble Alpes University, France, (2015)

Visually Grounded Speech Models have a Mutual Exclusivity Bias. CoRR, (2024)