Author of the publication


Unsupervised Audio-Caption Aligning Learns Correspondences Between Individual Sound Events and Textual Phrases.

H. Xie, O. Räsänen, K. Drossos, and T. Virtanen. ICASSP, page 8867-8871. IEEE, (2022)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some publications already assigned to that person.

Kai Virtanen

Sannakaisa Virtanen

Tuomas Aittomäki

Tuomas Kortelainen

Tuomas Melkas

Other publications of authors with the same name

Multi-label vs. combined single-label sound event detection with deep neural networks. E. Cakir, T. Heittola, H. Huttunen, and T. Virtanen. EUSIPCO, page 2551-2555. IEEE, (2015)

Stacked Convolutional and Recurrent Neural Networks for Bird Audio Detection. S. Adavanne, K. Drossos, E. Çakir, and T. Virtanen. CoRR, (2017)

Stacked Convolutional and Recurrent Neural Networks for Music Emotion Recognition. M. Malik, S. Adavanne, K. Drossos, T. Virtanen, D. Ticha, and R. Jarina. CoRR, (2017)

Bayesian extensions to non-negative matrix factorisation for audio signal modelling. T. Virtanen, A. Cemgil, and S. Godsill. ICASSP, page 1825-1828. IEEE, (2008)

Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion. Z. Wu, T. Virtanen, E. Chng, and H. Li. IEEE ACM Trans. Audio Speech Lang. Process., 22 (10): 1506-1521 (2014)

A multi-device dataset for urban acoustic scene classification. A. Mesaros, T. Heittola, and T. Virtanen. DCASE, page 9-13. (2018)

Differentiable Tracking-Based Training of Deep Learning Sound Source Localizers. S. Adavanne, A. Politis, and T. Virtanen. WASPAA, page 211-215. IEEE, (2021)

A Curated Dataset of Urban Scenes for Audio-Visual Scene Analysis. S. Wang, A. Mesaros, T. Heittola, and T. Virtanen. ICASSP, page 626-630. IEEE, (2021)

Non-negative tensor factorization models for Bayesian audio processing. U. Şimşekli, T. Virtanen, and A. Cemgil. Digital Signal Processing, (March 2015)

Query by Example of Audio Signals using Euclidean Distance Between Gaussian Mixture Models. M. Helén, and T. Virtanen. ICASSP (1), page 225-228. IEEE, (2007)

BibSonomy

Disambiguation of "Virtanen, Tuomas"
