Article

Automatic segmentation and labeling of speech based on Hidden Markov Models

Speech Communication, 12 (4): 357-370 (August 1993)
DOI: 10.1016/0167-6393(93)90083-W

Abstract

Accurate documentation of a database at the phonetic level is very important for speech research; however, manual segmentation and labeling is a time-consuming and error-prone task. This article describes an automatic procedure for the segmentation of speech: given either the linguistic or the phonetic content of a speech utterance, the system provides phone boundaries. The technique is based on an acoustic-phonetic unit Hidden Markov Model (HMM) recognizer; both the recognizer and the segmentation system were designed using the DARPA-TIMIT acoustic-phonetic continuous speech database of American English. Segmentation and labeling experiments were conducted under different conditions to check the reliability of the resulting system. Satisfactory results were obtained, especially when the system was trained with some manually presegmented material. The size of this material is a crucial factor, so system performance was evaluated with respect to this parameter. The system locates 88.3% of boundaries correctly, given a tolerance of 20 ms, when only 256 phonetically balanced sentences are used for training.
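
The "correct boundary location within a tolerance" figure quoted in the abstract can be illustrated with a minimal sketch. It assumes that, because the phone sequence is given, automatic and manual boundaries pair up one-to-one, and that a boundary counts as correct when it lies within the tolerance of its manual counterpart. The function name `boundary_accuracy` and the sample boundary times are hypothetical; the abstract does not give the exact scoring protocol used in the paper.

```python
def boundary_accuracy(reference_ms, predicted_ms, tolerance_ms=20):
    """Return the fraction of predicted boundaries that lie within
    tolerance_ms of the corresponding reference boundary.

    Assumption (not taken from the paper): boundaries are paired by
    position, which is possible when the phone sequence is known.
    """
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("at least one boundary is required")
    hits = sum(
        1
        for ref, pred in zip(reference_ms, predicted_ms)
        if abs(ref - pred) <= tolerance_ms
    )
    return hits / len(reference_ms)


# Hypothetical boundary times in milliseconds (illustrative only).
manual = [120, 345, 610, 880]
automatic = [125, 380, 600, 895]

score = boundary_accuracy(manual, automatic)
print(f"{100 * score:.1f}% of boundaries within 20 ms")
```

Integer milliseconds are used here so that the comparison at exactly the tolerance limit does not depend on floating-point rounding.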
