Synthesizing visual speech trajectory with minimum generation error
L. Wang, Y. Wu, X. Zhuang, and F. Soong. Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4580-4583. Prague, Czech Republic, (May 2011)
DOI: 10.1109/ICASSP.2011.5947374
Abstract
In this paper, we propose a minimum generation error (MGE) training method to refine the audio-visual HMM to improve visual speech trajectory synthesis. Compared with the traditional maximum likelihood (ML) estimation, the proposed MGE training explicitly optimizes the quality of generated visual speech trajectory, where the audio-visual HMM modeling is jointly refined by using a heuristic method to find the optimal state alignment and a probabilistic descent algorithm to optimize the model parameters under the MGE criterion. In objective evaluation, compared with the ML-based method, the proposed MGE-based method achieves consistent improvement in the mean square error reduction, correlation increase, and recovery of global variance. It also improves the naturalness and audio-visual consistency perceptually in the subjective test.
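The abstract describes MGE training as gradient-style refinement of HMM means so that the trajectory generated from the model moves closer to the natural one, using a stochastic ("probabilistic descent") update. The toy sketch below is not the paper's implementation; it is a minimal illustration under strong simplifying assumptions (1-D features, a fixed heuristic state alignment, unit variances, no dynamic-feature windows), with all names hypothetical:

```python
import numpy as np

# Toy sketch of MGE-style mean refinement (NOT the paper's implementation).
# Assumptions: 1-D features, fixed heuristic state alignment, unit variances,
# no delta windows -- so the ML-generated trajectory is just the aligned means.

rng = np.random.default_rng(0)

n_states = 4
means = rng.normal(size=n_states)                 # ML-initialised state means

train = []                                        # (alignment, target trajectory)
for _ in range(32):
    align = np.repeat(np.arange(n_states), 5)     # each state held for 5 frames
    target = np.repeat(np.linspace(-1.0, 1.0, n_states), 5)
    target = target + 0.05 * rng.normal(size=target.size)
    train.append((align, target))

def generate(align, means):
    # Under the simplifications above, parameter generation reduces to
    # reading out the mean of the aligned state at each frame.
    return means[align]

def mge_loss(means):
    # MGE criterion: mean squared generation error over the training data.
    return np.mean([np.mean((generate(a, means) - t) ** 2) for a, t in train])

before = mge_loss(means)
eta = 0.1
for epoch in range(20):                           # probabilistic (stochastic) descent
    for align, target in train:
        err = generate(align, means) - target     # per-frame generation error
        for s in range(n_states):
            mask = align == s
            # Gradient of the squared generation error w.r.t. the state mean.
            means[s] -= eta * err[mask].mean()
after = mge_loss(means)
assert after < before                             # generation error shrinks
```

The point of the sketch is the objective, not the model: unlike ML estimation, the loss is computed on the *generated* trajectory, so the update directly reduces the mismatch a listener would see at synthesis time.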
%0 Conference Paper
%1 Wang2011
%A Wang, Lijuan
%A Wu, Yi-Jian
%A Zhuang, Xiaodan
%A Soong, F.K.
%B Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
%C Prague, Czech Republic
%D 2011
%K Acoustics; HMM; Hidden Markov models; Speech; Speech synthesis; Training; Trajectory; Visualization; audiovisual speech synthesis; heuristic algorithm; hidden Markov models; maximum likelihood estimation; mean square error methods; minimum generation error training; optimal state alignment; photo-real talking head; probabilistic descent; trajectory-guided; visual speech trajectory synthesis
%P 4580-4583
%R 10.1109/ICASSP.2011.5947374
%T Synthesizing visual speech trajectory with minimum generation error
%X In this paper, we propose a minimum generation error (MGE) training method to refine the audio-visual HMM to improve visual speech trajectory synthesis. Compared with the traditional maximum likelihood (ML) estimation, the proposed MGE training explicitly optimizes the quality of generated visual speech trajectory, where the audio-visual HMM modeling is jointly refined by using a heuristic method to find the optimal state alignment and a probabilistic descent algorithm to optimize the model parameters under the MGE criterion. In objective evaluation, compared with the ML-based method, the proposed MGE-based method achieves consistent improvement in the mean square error reduction, correlation increase, and recovery of global variance. It also improves the naturalness and audio-visual consistency perceptually in the subjective test.
@inproceedings{Wang2011,
abstract = {In this paper, we propose a minimum generation error (MGE) training method to refine the audio-visual HMM to improve visual speech trajectory synthesis. Compared with the traditional maximum likelihood (ML) estimation, the proposed MGE training explicitly optimizes the quality of generated visual speech trajectory, where the audio-visual HMM modeling is jointly refined by using a heuristic method to find the optimal state alignment and a probabilistic descent algorithm to optimize the model parameters under the MGE criterion. In objective evaluation, compared with the ML-based method, the proposed MGE-based method achieves consistent improvement in the mean square error reduction, correlation increase, and recovery of global variance. It also improves the naturalness and audio-visual consistency perceptually in the subjective test.},
address = {Prague, Czech Republic},
author = {Wang, Lijuan and Wu, Yi-Jian and Zhuang, Xiaodan and Soong, F.K.},
booktitle = {Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
doi = {10.1109/ICASSP.2011.5947374},
keywords = {Acoustics; HMM; Hidden Markov models; Speech; Speech synthesis; Training; Trajectory; Visualization; audiovisual speech synthesis; heuristic algorithm; hidden Markov models; maximum likelihood estimation; mean square error methods; minimum generation error training; optimal state alignment; photo-real talking head; probabilistic descent; trajectory-guided; visual speech trajectory synthesis},
month = may,
pages = {4580-4583},
title = {Synthesizing visual speech trajectory with minimum generation error},
year = 2011
}