
Moving Average Vector Quantization in Speech Coding

Faculty of Electrical and Communications Engineering, Helsinki University of Technology, Helsinki (1999)


This Master's thesis studies the quantization of spectral parameters in speech coding. An all-pole filter is employed to model the short-term spectral information within each speech frame. The line spectral frequency (LSF) representation is used for quantization and interpolation of the filter parameters. The properties of the speech spectrum and of LSF parameters are discussed thoroughly. It is shown that the LSF representation has several properties that make it suitable for quantization. Since speech is quasi-stationary, predictive coding methods can exploit the correlation between the LSF parameters of adjacent frames. This thesis therefore focuses on quantizer structures whose performance is improved by moving average predictors. Three predictive structures are introduced, and a training algorithm for quantizers using these predictors is presented. The thesis shows that a quantizer using an inter-split predictor obtains a maximum performance gain of one bit per frame compared to the conventional structures. In addition, this quantizer achieves a spectral distortion of 1 dB and an outlier percentage of 2% at 23 bits per frame. This performance limit is known in the literature as transparent quality. The perceptual quality of the quantizers is evaluated with subjective listening tests, for which the quantizers are installed in an IS-641 speech codec. The tests imply that perceptually transparent quality can be achieved even with a 20-bit quantizer in a noiseless environment. Furthermore, the tests show that the spectral parameters can be loosely quantized in unvoiced segments of speech, whereas voiced segments have to be quantized accurately.
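As an illustration of the moving average prediction idea mentioned in the abstract, the following minimal sketch quantizes a sequence of parameter vectors with a generic first-order moving average predictor. It is not the thesis's method: the three predictive structures (including the inter-split predictor) and the training algorithm are not specified here. The function names, the predictor coefficient `alpha`, the mean vector, and the toy codebook are all hypothetical, and the two-dimensional vectors merely stand in for LSF vectors.

```python
def nearest(codebook, target):
    """Return the codevector closest to target in squared Euclidean distance."""
    return min(
        codebook,
        key=lambda c: sum((ci - ti) ** 2 for ci, ti in zip(c, target)),
    )


def ma_predictive_quantize(frames, codebook, mean, alpha):
    """Quantize frames with a first-order moving average (MA) predictor.

    The prediction for each frame is the long-term mean plus alpha times the
    quantized residual of the previous frame. Only the prediction residual is
    matched against the codebook. (Assumed generic scheme, for illustration.)
    """
    prev_q = [0.0] * len(mean)  # previous quantized residual; zero at start
    reconstructed = []
    for x in frames:
        prediction = [m + alpha * p for m, p in zip(mean, prev_q)]
        residual = [xi - pi for xi, pi in zip(x, prediction)]
        q = nearest(codebook, residual)
        reconstructed.append([pi + qi for pi, qi in zip(prediction, q)])
        prev_q = list(q)  # the MA memory holds codevectors, not past outputs
    return reconstructed


# Toy data (hypothetical values, not real LSF parameters).
toy_codebook = [[0.0, 0.0], [0.5, 0.5], [-0.5, -0.5]]
toy_frames = [[1.5, 2.5], [1.25, 2.25]]
toy_output = ma_predictive_quantize(toy_frames, toy_codebook, [1.0, 2.0], 0.5)
```

Because the predictor memory contains only past codevectors, a transmission error affects a limited number of subsequent frames, which is a commonly cited reason for preferring moving average over autoregressive prediction in speech codecs.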



