Author of the publication


CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens.

Z. Du, Q. Chen, S. Zhang, K. Hu, H. Lu, Y. Yang, H. Hu, S. Zheng, Y. Gu, Z. Ma, Z. Gao, and Z. Yan. CoRR, (2024)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some of the publications already assigned to that person.

Zhihao Pan

Zhihao Chen

Zhihao Lao

Zhihao Wu

Zhihao Xu

Other publications of authors with the same name

Investigation of Monaural Front-End Processing for Robust Speech Recognition Without Retraining or Joint-Training. Z. Du, X. Zhang, and J. Han. APSIPA, page 249-254. IEEE, (2019)

Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events. H. Song, J. Han, S. Deng, and Z. Du. INTERSPEECH, page 3860-3864. ISCA, (2019)

Double Adversarial Network Based Monaural Speech Enhancement for Robust Speech Recognition. Z. Du, J. Han, and X. Zhang. INTERSPEECH, page 309-313. ISCA, (2020)

TOLD: a Novel Two-Stage Overlap-Aware Framework for Speaker Diarization. J. Wang, Z. Du, and S. Zhang. ICASSP, page 1-5. IEEE, (2023)

IntrinsicVoice: Empowering LLMs with Intrinsic Real-time Voice Interaction Abilities. X. Zhang, X. Lyu, Z. Du, Q. Chen, D. Zhang, H. Hu, C. Tan, T. Zhao, Y. Wang, B. Zhang and 3 other author(s). CoRR, (2024)

Personality-memory Gated Adaptation: An Efficient Speaker Adaptation for Personalized End-to-end Automatic Speech Recognition. Y. Gu, Z. Du, S. Zhang, J. Han, and Y. He. INTERSPEECH, ISCA, (2024)

CASA-ASR: Context-Aware Speaker-Attributed ASR. M. Shi, Z. Du, Q. Chen, F. Yu, Y. Li, S. Zhang, J. Zhang, and L. Dai. INTERSPEECH, page 411-415. ISCA, (2023)

M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. F. Yu, S. Zhang, Y. Fu, L. Xie, S. Zheng, Z. Du, W. Huang, P. Guo, Z. Yan, B. Ma and 2 other author(s). CoRR, (2021)

Investigation of Monaural Front-End Processing for Robust ASR without Retraining or Joint-Training. Z. Du, X. Zhang, and J. Han. CoRR, (2018)

SyncSpeech: Low-Latency and Efficient Dual-Stream Text-to-Speech based on Temporal Masked Transformer. Z. Sheng, Z. Du, S. Zhang, Z. Yan, Y. Yang, and Z. Ling. CoRR, (February 2025)

BibSonomy

Disambiguation of "Du, Zhihao"
