Abstract
This paper presents a hidden Markov model (HMM) based unit selection method that uses hierarchical units under a statistical criterion. In our previous work, we used frame-sized speech segments and a maximum likelihood criterion to improve on traditional concatenative synthesis systems, which use phone-sized units and a cost function criterion. In this paper, hierarchical units consisting of phone-level and frame-level units are adopted to achieve a better balance between the coverage rate of candidate units and the number of concatenation points during synthesis. In addition, the Kullback-Leibler divergence (KLD) between the candidate and target phoneme HMMs is introduced as part of the final unit selection criterion. Listening tests show that both approaches effectively improve the synthesized speech.
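The abstract does not say how the KLD between two phoneme HMMs is computed, so the following is only a minimal sketch under stated assumptions: each HMM state is modelled by a single diagonal-covariance Gaussian, the candidate and target HMMs have the same number of states, and the states are compared one to one with a symmetric KLD. The function names (`kld_diag_gauss`, `symmetric_kld`, `phone_kld`) are hypothetical and do not come from the paper.

```python
import math


def kld_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kld(mu_p, var_p, mu_q, var_q):
    """Symmetric KLD: KL(p || q) + KL(q || p)."""
    return (kld_diag_gauss(mu_p, var_p, mu_q, var_q)
            + kld_diag_gauss(mu_q, var_q, mu_p, var_p))


def phone_kld(candidate_states, target_states):
    """Sum of state-wise symmetric KLDs between two phoneme models.

    Assumption (not from the paper): each model is a list of
    (mean, variance) pairs, one per state, with equal state counts.
    """
    if len(candidate_states) != len(target_states):
        raise ValueError("models must have the same number of states")
    return sum(
        symmetric_kld(c_mu, c_var, t_mu, t_var)
        for (c_mu, c_var), (t_mu, t_var) in zip(candidate_states, target_states)
    )


# Toy example with two one-dimensional, three-state models.
target = [([0.0], [1.0]), ([1.0], [1.0]), ([2.0], [1.0])]
candidate = [([0.0], [1.0]), ([2.0], [1.0]), ([2.0], [1.0])]
distance = phone_kld(candidate, target)
```

A smaller `phone_kld` value would mean the candidate unit's phoneme model is closer to the target model. How such a term is weighted against the likelihood-based criterion is not specified in the abstract.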