All Cisco-Network Study Notes: Packetization

Packetization
Before voice can be transmitted over a network, sound must be captured
from a microphone and digitized. The digital recording is then chopped into
sections (typically 20 ms each), which are sent sequentially and replayed
in order through a speaker.
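The chopping step can be sketched in a few lines of Python. This is only an
illustration: the 8,000 samples per second and one byte per sample are the
telephone figures given later in these notes, and the function name
frame_samples is made up for the example.

```python
SAMPLE_RATE_HZ = 8000   # telephone-quality sampling rate (assumed here)
FRAME_MS = 20           # typical packetization interval


def frame_samples(samples, sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Split a sequence of samples into consecutive fixed-duration frames."""
    per_frame = sample_rate_hz * frame_ms // 1000  # 160 samples at 8 kHz, 20 ms
    return [samples[i:i + per_frame] for i in range(0, len(samples), per_frame)]


# One second of placeholder audio (all zero bytes), one byte per sample.
one_second = bytes(SAMPLE_RATE_HZ)
frames = frame_samples(one_second)
print(len(frames), len(frames[0]))  # 50 160
```

Each frame would become the payload of one packet. The last frame may be
shorter than the others if the recording does not divide evenly.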
Sound is captured at a microphone by sampling (periodically taking an
amplitude reading). The Nyquist theorem says that to reproduce a signal,
sampling must occur at a rate of at least twice its maximum frequency. The
phone system is designed to capture frequencies below 4 kHz, so it samples
8,000 times per second.
Sampling in the PSTN produces Pulse Amplitude Modulation (PAM) samples. Each
sample is quantized to an 8-bit number, and at 8,000 samples per second this
yields a 64-kbps DS0.
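The arithmetic behind the 64-kbps figure can be checked directly. The
constant names below are chosen for this sketch only.

```python
MAX_VOICE_FREQ_HZ = 4000   # upper bound of the telephone voice band
BITS_PER_SAMPLE = 8        # each sample is quantized to 8 bits

# Nyquist: sample at twice the maximum frequency.
sample_rate_hz = 2 * MAX_VOICE_FREQ_HZ
# One 8-bit number per sample gives the DS0 bit rate.
bit_rate_bps = sample_rate_hz * BITS_PER_SAMPLE

print(sample_rate_hz, bit_rate_bps)  # 8000 64000
```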
Two forms of quantization are used: μ-law in the U.S. (along with the rest
of North America and Japan) and A-law in most other countries. Both use an
approximately logarithmic scale rather than a linear one. A logarithmic
scale gives quiet sounds finer resolution than loud ones, so it does a
better job of reproducing speech with only 8 bits per sample. The U.S.
system (μ-law) was developed earlier; A-law was developed later for the
European networks. The two curves differ slightly, so a sample encoded with
one cannot be decoded with the other.
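To show what quantizing on a logarithmic scale means, here is a minimal
sketch of the textbook continuous μ-law companding curve with μ = 255. It is
an illustration only: the real G.711 encoder uses a piecewise-linear
approximation of this curve with a specific bit layout, A-law uses a
different formula, and the helper quantize_8bit is a simplification invented
for this example.

```python
import math

MU = 255  # value used for 8-bit mu-law telephony


def mu_law_compress(x, mu=MU):
    """Map a sample in [-1.0, 1.0] onto a logarithmic scale in [-1.0, 1.0]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize_8bit(y):
    """Round a compressed value to a signed integer level in -127..127."""
    return round(y * 127)


for x in (0.01, 0.1, 1.0):
    code = quantize_8bit(mu_law_compress(x))
    restored = mu_law_expand(code / 127)
    print(f"{x:5.2f} -> code {code:4d} -> {restored:.4f}")
```

A sample at 1 percent of full scale still lands well up the code range, which
is the point of the logarithmic scale: quiet speech is not crushed into the
bottom few levels as it would be with linear 8-bit quantization.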
After being captured, the PAM samples are encoded using a coder/decoder
(codec). Codecs work using two main techniques: PCM, which encodes the
signal straight to bits, and CELP, which matches the waveform to a
predefined library (a codebook) and sends the code of the closest entry.
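The idea of matching a waveform to a predefined library and sending a code
can be illustrated with a toy lookup. This is not a CELP implementation: a
real CELP coder also sends linear-prediction filter parameters and gains,
and it searches the codebook by running each entry through a synthesis
filter. The four-entry CODEBOOK and the function names are hypothetical.

```python
# Hypothetical 4-entry codebook of short waveform shapes. Real codebooks
# are far larger and are not compared against the signal this directly.
CODEBOOK = [
    (0, 0, 0, 0),
    (10, 20, 10, 0),
    (-10, -20, -10, 0),
    (20, -20, 20, -20),
]


def squared_error(a, b):
    """Sum of squared differences between two equal-length segments."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def encode(segment, codebook=CODEBOOK):
    """Return the index of the codebook entry closest to the segment."""
    return min(range(len(codebook)),
               key=lambda i: squared_error(segment, codebook[i]))


def decode(index, codebook=CODEBOOK):
    """Look the index back up; the receiver holds the same codebook."""
    return codebook[index]


segment = (9, 18, 12, 1)
index = encode(segment)
print(index, decode(index))  # 1 (10, 20, 10, 0)
```

Only the index crosses the network, which is why codebook-based coders need
far less bandwidth than PCM, at the cost of an approximate reconstruction.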