This paper describes a speaker-adaptive HMM-based speech synthesis system. The new system, called “HTS-2007,” employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available. In addition, a comparison study with several speech synthesis techniques shows the new system is very robust: It is able to build voices from less-than-ideal speech data and synthesize good-quality speech even for out-of-domain sentences.
%0 Journal Article
%1 Yamagishi2009a
%A Yamagishi, Junichi
%A Nose, Takashi
%A Zen, Heiga
%A Ling, Zhen-Hua
%A Toda, Tomoki
%A Tokuda, Keiichi
%A King, Simon
%A Renals, Steve
%D 2009
%J IEEE Transactions on Audio, Speech, and Language Processing
%K (artificial HTS;HMM-based Markov Speech Synthesis System, adaptation;speech analysis;Speech analysis;full-covariance approach;text-to-speech conversion hidden intelligence);speech model;speaker-dependent modeling;mixed-gender modeling;robust models;High models;learning reactor;Councils;Hidden science;Continuous-stirred science;Nose;Robustness;Speech speaker-adaptive speech superconductors;Information synthesis;Average synthesis;Computer synthesis;speaker synthesis;text synthesis;voice tank temperature voice;HMM
%N 6
%P 1208-1230
%R 10.1109/TASL.2009.2016394
%T Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis
%V 17
%X This paper describes a speaker-adaptive HMM-based speech synthesis system. The new system, called “HTS-2007,” employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available. In addition, a comparison study with several speech synthesis techniques shows the new system is very robust: It is able to build voices from less-than-ideal speech data and synthesize good-quality speech even for out-of-domain sentences.
@article{Yamagishi2009a,
  abstract  = {This paper describes a speaker-adaptive HMM-based speech synthesis system. The new system, called ``HTS-2007,'' employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available. In addition, a comparison study with several speech synthesis techniques shows the new system is very robust: It is able to build voices from less-than-ideal speech data and synthesize good-quality speech even for out-of-domain sentences.},
  added-at  = {2021-02-01T10:51:23.000+0100},
  author    = {Yamagishi, Junichi and Nose, Takashi and Zen, Heiga and Ling, Zhen-Hua and Toda, Tomoki and Tokuda, Keiichi and King, Simon and Renals, Steve},
  biburl    = {https://www.bibsonomy.org/bibtex/29c1acaffc4254bf751efdb56e13e3867/m-toman},
  doi       = {10.1109/TASL.2009.2016394},
  file      = {:pdfs/yamagishi_transasp_2009.pdf:PDF},
  interhash = {7987f8368ebd9e2c2db975036f36fd33},
  intrahash = {9c1acaffc4254bf751efdb56e13e3867},
  issn      = {1558-7916},
  journal   = {IEEE Transactions on Audio, Speech, and Language Processing},
  keywords  = {(artificial HTS;HMM-based Markov Speech Synthesis System, adaptation;speech analysis;Speech analysis;full-covariance approach;text-to-speech conversion hidden intelligence);speech model;speaker-dependent modeling;mixed-gender modeling;robust models;High models;learning reactor;Councils;Hidden science;Continuous-stirred science;Nose;Robustness;Speech speaker-adaptive speech superconductors;Information synthesis;Average synthesis;Computer synthesis;speaker synthesis;text synthesis;voice tank temperature voice;HMM},
  month     = aug,
  number    = {6},
  owner     = {schabus},
  pages     = {1208--1230},
  timestamp = {2021-02-01T10:51:23.000+0100},
  title     = {Robust Speaker-Adaptive {HMM}-Based Text-to-Speech Synthesis},
  volume    = {17},
  year      = {2009},
}