A 2.4 kbps Waveform Interpolation Speech Codec Incorporating Wavelet-Based Techniques

F.C.A. Brooks, L. Hanzo

Department of Electronics and Computer Science,
University of Southampton, SO17 1BJ, UK.
Tel: +44 703 593125, Fax: +44 703 593045
Email: lh@ecs.soton.ac.uk

© 1998 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Submitted to IEEE Trans. on Audio and Speech Processing, 1998.

Abstract:

Following a brief overview of the activities in 2.4 kbps speech coding, a wavelet-based pitch detector is introduced, which substantially reduces the complexity of conventional autocorrelation-based pitch detectors while ensuring smooth pitch trajectory evolution. This scheme is incorporated in a waveform-interpolated codec, which uses voiced-unvoiced classification and, instead of simple Dirac pulses, employs an unconventional Zinc basis function excitation for modelling the voiced excitation. The required Zinc-function parameters are determined in an analysis-by-synthesis loop. For the sake of smooth waveform evolution and reduced complexity, a focused search strategy and a few further sub-optimum restrictions are imposed without seriously affecting the speech quality. This base-line codec operates at a rate of 1.9 kbps, but it suffers from slight buzziness during periods of excessive voicing. This impediment is then mitigated by invoking a mixed voiced-unvoiced multi-band excitation, which slightly increases the bit rate to 2.4 kbps due to the transmission of a 2-bit voicing strength code in each of the 5 excitation bands.
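
The rate figures quoted above can be checked with simple arithmetic. The following Python sketch is illustrative only: it assumes that the entire increase from 1.9 kbps to 2.4 kbps is spent on the voicing strength codes and that these are sent once per frame. The frame length is not stated in the abstract, so the value computed here is an inference under that assumption, and the function names are hypothetical.

```python
# Bit-rate bookkeeping for the figures quoted in the abstract.
BASE_RATE_BPS = 1900    # base-line codec rate (from the abstract)
FINAL_RATE_BPS = 2400   # rate with mixed multi-band excitation (from the abstract)
BITS_PER_BAND = 2       # voicing strength code per band (from the abstract)
NUM_BANDS = 5           # number of excitation bands (from the abstract)


def voicing_bits_per_frame(bits_per_band=BITS_PER_BAND, num_bands=NUM_BANDS):
    """Total voicing strength bits transmitted for one frame."""
    return bits_per_band * num_bands


def implied_frame_ms(extra_bits, base_bps=BASE_RATE_BPS, final_bps=FINAL_RATE_BPS):
    """Frame length in ms implied by the rate increase.

    ASSUMPTION: the whole rate increase is spent on `extra_bits`
    sent once per frame; the abstract does not state the frame length.
    """
    extra_bps = final_bps - base_bps
    return 1000.0 * extra_bits / extra_bps


if __name__ == "__main__":
    bits = voicing_bits_per_frame()
    print("voicing bits per frame:", bits)
    print("implied frame length (ms):", implied_frame_ms(bits))
```

Under these assumptions, 10 voicing bits per frame and a 500 bps increase correspond to one update every 20 ms.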


Last changed: $Date: 1998/03/19 11:18:39 $