Abstract
This Master's Thesis studies quantization of spectral parameters in speech coding. An all-
pole filter is employed to model short-term spectral information within each speech frame.
Line spectral frequency (LSF) representation is used for quantization and interpolation of
the filter parameters. The properties of speech spectrum and LSF parameters are discussed
thoroughly. It is shown that the LSF representation has several properties that make it
suitable for quantization.
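As an illustration of one such property, the LSFs of a stable all-pole filter lie strictly between 0 and pi and come out in ascending order. The sketch below uses the textbook construction from the roots of the sum and difference polynomials; it is not taken from the thesis, and the function name lpc_to_lsf and the example pole positions are assumptions made here.

```python
import numpy as np

def lpc_to_lsf(a):
    """Convert LPC coefficients [1, a1, ..., ap] to LSFs in radians (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    # Sum and difference polynomials P(z) and Q(z), both of order p + 1.
    a_ext = np.concatenate([a, [0.0]])
    a_rev = np.concatenate([[0.0], a[::-1]])
    p_poly = a_ext + a_rev
    q_poly = a_ext - a_rev
    # For a stable filter all roots lie on the unit circle; the LSFs are their angles.
    angles = np.angle(np.concatenate([np.roots(p_poly), np.roots(q_poly)]))
    # Keep one angle per conjugate pair and drop the trivial roots at z = 1 and z = -1.
    eps = 1e-6
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])

# Example: a stable 4th-order filter built from two assumed pole pairs.
poles = [0.9 * np.exp(0.5j), 0.9 * np.exp(-0.5j),
         0.8 * np.exp(1.5j), 0.8 * np.exp(-1.5j)]
a = np.real(np.poly(poles))
lsf = lpc_to_lsf(a)
```

Because the ordering holds for every stable filter, a decoder can check or restore stability simply by verifying that the quantized LSFs are still ascending.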
Since speech is quasi-stationary, predictive coding methods can exploit the correlation
between LSF parameters of adjacent frames. Thus this thesis focuses on quantizer
structures whose performance is improved using moving average predictors. Three
predictive structures are introduced. A training algorithm for quantizers using these
predictors is also presented. The thesis shows that a quantizer using an inter-split predictor
obtains a maximal performance gain of one bit per frame compared to the conventional
structures. In addition, this quantizer achieves a spectral distortion of 1 dB and an outlier
percentage of 2 % at 23 bits per frame. This performance limit is known as transparent
quality in the literature.
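For reference, spectral distortion is commonly defined as the root-mean-square difference, in dB, between the log power spectra of the original and the quantized all-pole filter, and outliers are frames whose distortion exceeds a threshold. The sketch below follows that common definition and is not the evaluation code of the thesis; the function names, the FFT size, and the 2 dB outlier threshold are assumptions made here.

```python
import numpy as np

def spectral_distortion_db(a_ref, a_quant, n_fft=512):
    """RMS log-spectral difference in dB between two all-pole filters 1/A(z)."""
    # Sample A(e^{jw}) on a uniform grid from 0 to pi.
    A_ref = np.fft.rfft(np.asarray(a_ref, dtype=float), n_fft)
    A_quant = np.fft.rfft(np.asarray(a_quant, dtype=float), n_fft)
    # Log power spectrum of 1/A is -20*log10|A|.
    log_ref = -20.0 * np.log10(np.abs(A_ref))
    log_quant = -20.0 * np.log10(np.abs(A_quant))
    return float(np.sqrt(np.mean((log_ref - log_quant) ** 2)))

def outlier_percentage(sd_values, threshold_db=2.0):
    """Percentage of frames whose distortion exceeds the (assumed) threshold."""
    sd_values = np.asarray(sd_values, dtype=float)
    return float(100.0 * np.mean(sd_values > threshold_db))
```

In an evaluation of this kind, the first function would be applied to every frame and the mean of the per-frame values reported together with the outlier percentage.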
The perceptual quality of the quantizers is evaluated with subjective listening tests. The
quantizers are installed in an IS-641 speech codec. The tests imply that perceptually
transparent quality can be achieved even with a 20-bit quantizer in a noiseless
environment. Furthermore, the tests show that in unvoiced segments of speech the spectral
parameters can be loosely quantized, whereas voiced segments have to be quantized
accurately.