2. New Functionality: Integration of Annexes for DTX Operation
Discontinuous Transmission (DTX) conserves bandwidth in voice communication systems by reducing the transmission rate during periods of silence. G.729 Annex C+ provides a unified framework for combining the DTX functionality of Annex B with the bit rates and operational modes of Annex D and Annex E, so that the bandwidth savings do not come at the expense of quality.
2.1 DTX Operation with Annex D
Integrating the Annex B DTX functionality with Annex D, which provides a 6.4 kbit/s coding rate, is straightforward. The core components of the Annex B DTX system, namely Voice Activity Detection (VAD), the Silence Insertion Descriptor (SID), and Comfort Noise Generation (CNG), are reused without modification. The one implementation detail that requires attention is keeping the postfilter parameters up to date. Specifically, the parameters of the phase dispersion module must continue to be updated during periods of discontinuous transmission, so that the transition back to active speech is smooth and free of artifacts.
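The point about updating state during DTX can be illustrated with a minimal sketch. It is not taken from the reference code: `PhaseDispersionState`, its fields, the frame-type strings, and `decode_frames` are hypothetical placeholders chosen only to show the pattern of updating the module state on every frame, including SID and untransmitted frames.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PhaseDispersionState:
    # Hypothetical state: placeholder fields, not the reference code's variables.
    prev_code_gain: float = 0.0
    pitch_gain_history: deque = field(
        default_factory=lambda: deque([0.0] * 5, maxlen=5)
    )

    def update(self, pitch_gain: float, code_gain: float) -> None:
        """Record the gains used for the frame that was just synthesized."""
        self.pitch_gain_history.append(pitch_gain)
        self.prev_code_gain = code_gain


def decode_frames(frames, state: PhaseDispersionState) -> int:
    """Walk a sequence of (frame_type, pitch_gain, code_gain) tuples.

    Excitation synthesis is omitted. The state is updated for every frame
    type, so it is never stale when active speech resumes.
    Returns the number of state updates performed.
    """
    updates = 0
    for frame_type, pitch_gain, code_gain in frames:
        if frame_type not in ("active", "sid", "untransmitted"):
            raise ValueError(f"unknown frame type: {frame_type}")
        # Assumption: during SID/untransmitted frames the gains come from
        # the comfort noise generator rather than from the bitstream.
        state.update(pitch_gain, code_gain)
        updates += 1
    return updates
```

A decoder that updated the state only in the `"active"` branch would resume speech with gains left over from before the silence, which is the kind of stale state the paragraph above warns against.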
2.2 DTX Operation with Annex E
Integrating the Annex B DTX functionality with Annex E, which adds a higher-quality 11.8 kbit/s rate, is slightly more involved. The operational flow is adapted to take advantage of the more elaborate analysis available in Annex E. The VAD is performed after the 10th-order forward-adaptive Linear Predictive Coding (LPC) analysis. The VAD function itself is also enhanced: it bases its decision on parameters derived from both the 10th-order LPC analysis and the backward-adaptive LPC analysis.
A critical adaptation is made for frames classified as “non-speech.” In these cases, the LPC mode is forced to forward adaptive, and the backward adaptive LPC analysis is skipped, optimizing processing during silent periods.
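The handling of non-speech frames can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: `LpcMode`, `select_lpc_mode`, and the `run_backward_analysis` callback (assumed to return `True` when backward-adaptive LPC is preferred for the frame) are hypothetical names.

```python
from enum import Enum
from typing import Callable


class LpcMode(Enum):
    FORWARD = "forward"
    BACKWARD = "backward"


def select_lpc_mode(
    vad_is_speech: bool,
    run_backward_analysis: Callable[[], bool],
) -> LpcMode:
    """Choose the LPC mode for one frame.

    For non-speech frames the mode is forced to forward adaptive and the
    (hypothetical) backward analysis callback is never invoked.
    """
    if not vad_is_speech:
        return LpcMode.FORWARD
    # Speech frame: run the backward analysis and let it pick the mode.
    backward_preferred = run_backward_analysis()
    return LpcMode.BACKWARD if backward_preferred else LpcMode.FORWARD
```

Passing the backward analysis as a callable makes the skipped work explicit: for a non-speech frame the function returns before the callback is ever evaluated.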
To accommodate the expanded application area of Annex E, a new music detection module was introduced. The main body of G.729 and Annex B had no strict performance requirements for music, whereas Annex E is expected to deliver robust quality for such signals. The module protects the quality of music segments by ensuring that they are coded at the full 11.8 kbit/s rate of Annex E. This prevents the VAD from misclassifying music as background noise, which would otherwise cause significant quality degradation.
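The effect of the music detector on the frame decision can be sketched in a few lines. The function name, its parameters, and the idea of expressing the outcome as an (active, rate) pair are assumptions made for illustration; the detector itself is not modeled here.

```python
from typing import Tuple

RATE_ANNEX_E_BPS = 11800  # 11.8 kbit/s, the Annex E rate


def apply_music_override(
    vad_is_speech: bool,
    music_detected: bool,
    requested_rate_bps: int,
) -> Tuple[bool, int]:
    """Return (frame_is_active, rate_bps) after the music check.

    Hypothetical helper: when music is detected, the frame is treated as
    active regardless of the VAD decision and is coded at 11.8 kbit/s.
    Otherwise the VAD decision and the requested rate pass through.
    """
    if music_detected:
        return True, RATE_ANNEX_E_BPS
    return vad_is_speech, requested_rate_bps
```

For example, a frame that the VAD labels as non-speech while music is detected yields `(True, 11800)`, so it is not replaced by comfort noise.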
This new music detection feature represents a key algorithmic addition within the Annex C+ framework, warranting a more detailed description.