The 4.0 kbit/s speech codec described in this paper is based on a
Frequency Domain Interpolative (FDI) coding technique, which
belongs to the class of prototype waveform Interpolation (PWI)
coding techniques. The codec also has an integrated voice
activity detector (VAD) and a noise reduction capability. The
input signal is subjected to LPC analysis and the prediction
residual is separated into a slowly evolving waveform (SEW) and
a rapidly evolving waveform (REW) components. The SEW
magnitude component is quantized using a hierarchical
predictive vector quantization approach. The REW magnitude is
quantized using a gain and a sub-band based SHape. SEW and
REW phases are derived at the decoder using a phase model,
based on a transmitted measure of voice periodicity. The spectral
(LSP) parameters are quantized using a combination of scalar
and vector quantizers. The 4.0 kbits/s coder has an algorithmic
delay of 60 ms and an estimated floating point complexity of
21.5 MIPS. The performance of this coder has been evaluated
using in-house MOS tests under various conditions such as
background noise. channel errors, self-tandem. and DTX mode
of operation, and has been shown to be statistically equivalent to
ITU-T (3.729 8 kbps codec across all conditions tested.
The roots of this book were planted about a decade ago. At that time, I became
increasingly convinced that wide-area and metropolitan-area networks, where much
of my group’s research has been centered at that time, were in good SHape. Although
research in these fields was (and still is) needed, that’s not where the networking
bottleneck seemed to be. Rather, the bottleneck was (and still is in many places) in
the access networks, which choked users’ access to information and services. It was
clear to me that the long-term solution to that problem has to involve optical fiber
access networks.