WaveNet-VNN: Causal Active Noise Control
- The paper presents a hybrid model that integrates causal dilated convolutions with explicit quadratic VNN blocks to directly model nonlinear ANC responses.
- It employs a strict causal design with a 1024-sample receptive field and sample-wise streaming, ensuring zero algorithmic delay for real-time deployment.
- Benchmarking shows that WaveNet-VNN outperforms traditional Wiener and adaptive filters in NMSE and dBA metrics across various noise conditions.
WaveNet-VNN refers to a fully causal, end-to-end neural framework for Active Noise Control (ANC) that integrates a dilated-convolutional WaveNet backbone with explicit Volterra Neural Network (VNN) blocks. This hybrid approach directly models nonlinear system responses, particularly those arising in loudspeaker-based ANC, while ensuring strict causal operation necessary for real-time deployment. The architecture and its benchmarking are detailed in "WaveNet-Volterra Neural Networks for Active Noise Control: A Fully Causal Approach" (Bai et al., 6 Apr 2025).
1. Architecture and Computational Structure
The WaveNet-VNN model combines deep causal temporal modeling with explicit quadratic nonlinearity through the following architectural elements:
- Causal Dilated-Convolutional WaveNet Backbone:
- Layers: 30 residual blocks; the dilation factor doubles per layer through each cycle (1, 2, 4, …, 512).
- Kernel Size: 2, so each convolution operates on two consecutive time-steps.
- Channels: In each block, the hidden state is split into filter and gate branches (e.g., 64 channels each).
- Receptive Field: 1024 samples, strictly causal.
- Residual and Skip Connections: Present at each block to facilitate gradient flow over long temporal contexts.
- Volterra Neural Network (VNN) Blocks:
- Positioned after the WaveNet trunk.
- First-Order: A linear causal convolution (the first-order Volterra kernel).
- Second-Order: Four sets of pairwise products of causal convolution outputs, forming the explicit quadratic terms.
- Block Pipeline: The WaveNet trunk output feeds the first- and second-order VNN branches, whose outputs are summed to form the anti-noise signal.
This synergy enables the mapping of linear and quadratic system behaviors without introducing any noncausal look-ahead.
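As a concrete illustration of the backbone's causal structure, the sketch below implements a kernel-size-2 dilated convolution in NumPy and checks the receptive-field arithmetic for one dilation cycle (1, 2, 4, …, 512); the function name and weight layout are illustrative, not the repository's actual implementation:

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """Kernel-size-2 causal dilated convolution:
    y[n] = w[0] * x[n - dilation] + w[1] * x[n].
    Left-padding with zeros guarantees no future sample is touched."""
    pad = np.concatenate([np.zeros(dilation), x])
    return w[0] * pad[:len(x)] + w[1] * pad[dilation:dilation + len(x)]

# Receptive field of one dilation cycle with kernel size 2:
dilations = [2**i for i in range(10)]            # 1, 2, 4, ..., 512
rf = 1 + sum((2 - 1) * d for d in dilations)     # 1 + 1023 = 1024 samples
```

Each layer with dilation d extends the receptive field by (kernel_size - 1) * d, so one cycle through dilations 1..512 yields exactly the 1024-sample context quoted above.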
2. Mathematical Formalism
- Second-Order Volterra Series Expansion:

  $$y(n) = \sum_{k=0}^{M-1} h_1(k)\, x(n-k) + \sum_{k_1=0}^{M-1} \sum_{k_2=0}^{M-1} h_2(k_1, k_2)\, x(n-k_1)\, x(n-k_2)$$

- Causal Dilated Convolution (layer $\ell$ with dilation $d_\ell$, kernel size 2):

  $$z^{(\ell)}(n) = w_0^{(\ell)}\, z^{(\ell-1)}(n) + w_1^{(\ell)}\, z^{(\ell-1)}(n - d_\ell)$$

- Gated Activation (WaveNet block $\ell$):

  $$h^{(\ell)} = \tanh\!\left(W_f^{(\ell)} * z^{(\ell-1)}\right) \odot \sigma\!\left(W_g^{(\ell)} * z^{(\ell-1)}\right)$$

- Combined Output Mapping:

  $$y(n) = F_{\text{VNN}}\!\left(F_{\text{WaveNet}}\big(x(0), \ldots, x(n)\big)\right)$$

- Loss Function:

  $$\mathcal{L} = \tfrac{1}{2}\left(\text{NMSE}_{\text{dB}} + L_{\text{dBA}}\right) + \lambda \lVert \theta \rVert_2^2$$

  with

  $$\text{NMSE}_{\text{dB}} = 10 \log_{10}\!\left(\frac{\sum_n e^2(n)}{\sum_n d^2(n)}\right)$$

  where $\lambda$ is a small weight-decay coefficient, $e(n)$ is the residual error, and $d(n)$ the disturbance at the error microphone.
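As a minimal, non-authoritative sketch, the second-order Volterra expansion can be written out directly in NumPy; the memory length `M` and kernels `h1`, `h2` below are illustrative placeholders, not the paper's trained parameters:

```python
import numpy as np

def volterra2(x, h1, h2):
    """Causal second-order Volterra filter.
    h1: (M,) linear kernel; h2: (M, M) quadratic kernel (memory length M).
    y[n] = sum_k h1[k]*x[n-k] + sum_{k1,k2} h2[k1,k2]*x[n-k1]*x[n-k2]."""
    M = len(h1)
    y = np.zeros(len(x))
    for n in range(len(x)):
        # causal tap vector [x[n], x[n-1], ..., x[n-M+1]], zero before signal start
        taps = np.array([x[n - k] if n - k >= 0 else 0.0 for k in range(M)])
        y[n] = h1 @ taps + taps @ h2 @ taps
    return y
```

Setting `h2` to zero recovers a plain causal FIR filter, which makes the linear/quadratic split easy to sanity-check in isolation.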
3. Causality, Latency, and Real-Time Operation
WaveNet-VNN is designed for strict causality:
- Strictly Causal Convolutions: All computations are one-sided; only past and current samples $x(n-k),\ k \ge 0$, are used.
- No Look-ahead: Gated activations, residuals, and skip connections do not introduce future context.
- Algorithmic Delay: Zero, apart from the buffer required to fill the receptive field (1024 samples).
- Efficient Real-Time Inference: A circular buffer retains the last 1024 samples for each convolution; the model runs sample-by-sample, optimized for per-sample streaming.
- Hardware Considerations: Deployable on modern CPUs and DSPs; supports mixed-precision or int8 quantization for further speedup under real-time constraints.
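To make the streaming pattern concrete, here is a hedged sketch of per-sample inference with a circular buffer for a single causal FIR stage; the class name and tap convention are assumptions for illustration, and a real deployment would apply the same idea per convolution layer:

```python
import numpy as np

class StreamingFIR:
    """Per-sample causal FIR using a circular buffer (streaming-inference sketch)."""
    def __init__(self, taps):
        self.taps = np.asarray(taps, dtype=float)  # taps[0] multiplies the newest sample
        self.buf = np.zeros(len(self.taps))        # circular history buffer
        self.pos = 0                               # next write position

    def push(self, sample):
        """Consume one input sample, return one output sample (zero look-ahead)."""
        self.buf[self.pos] = sample
        self.pos = (self.pos + 1) % len(self.buf)
        # gather buffer contents newest-to-oldest so they align with the taps
        idx = (self.pos - 1 - np.arange(len(self.buf))) % len(self.buf)
        return float(self.taps @ self.buf[idx])
```

Streaming the whole signal through `push` reproduces the batch causal convolution exactly, which is the property that lets the model run with zero algorithmic delay.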
4. Training Procedures and System Setup
Data:
- Training: 26 hours from DEMAND and MS-SNSD, segmented into 3 s clips, 16 kHz sampling, normalized.
- Evaluation: Factory1, babble, engine, factory2 (NOISEX-92), entirely unseen during training.
- Loss Metric: Mean of NMSE (in dB) and A-weighted level (dBA).
- Optimizer: Adam with weight decay; the specific $\beta_1$, $\beta_2$, and learning-rate values follow the original paper.
- Batch: 16 three-second segments per update, trained for 30 epochs.
- Nonlinearity in Plant Modeling: Scaled Error Function (SEF), $f_{\mathrm{SEF}}(y) = \int_0^y \exp\!\big(-z^2/(2\eta^2)\big)\, dz$, where $\eta^2$ controls the severity of loudspeaker saturation.
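The SEF saturation commonly used in nonlinear-ANC plant modeling, $f(y) = \int_0^y \exp(-z^2/(2\eta^2))\,dz$, has a closed form via the error function; the sketch below assumes that standard definition (small $\eta^2$ gives strong saturation, large $\eta^2$ approaches the identity):

```python
import math

def sef(y, eta2):
    """Scaled Error Function nonlinearity:
    f(y) = integral_0^y exp(-z^2 / (2*eta2)) dz
         = sqrt(eta2) * sqrt(pi/2) * erf(y / sqrt(2*eta2)).
    Small eta2 -> hard saturation; eta2 -> infinity -> identity (linear plant)."""
    eta = math.sqrt(eta2)
    return eta * math.sqrt(math.pi / 2.0) * math.erf(y / (eta * math.sqrt(2.0)))
```

Evaluating the same input at several $\eta^2$ values is a quick way to generate the range of nonlinearity severities used in benchmarking.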
5. Benchmarking, Baselines, and Results
A rigorous evaluation compares WaveNet-VNN against both traditional adaptive and deep learning ANC baselines:
| Baseline | Babble: NMSE (dB) / dBA | Factory1: NMSE (dB) / dBA |
|---|---|---|
| Wiener (2048 taps) | –36.00 / –30.79 | –12.22 / –12.20 |
| WaveNet–VNN | –41.04 / –34.48 | –27.94 / –26.38 |
- Average Gain: +3.85 dBA and +5.91 dB (NMSE) over Wiener-2048, averaged across noise types and SEF nonlinearity severities.
- Key Baselines: TD-FxLMS, THF-FxLMS, FD-FxNLMS, ODW-FDFeLMS, Wiener (512 & 2048), SPD-ANC, CRN.
- Lessons: Earlier DNN-based ANC claims overstate superiority when compared only to low-order or simplified adaptive baselines. Fully optimized, high-order Wiener or adaptive filters remain strong, but the hybrid WaveNet–VNN architecture delivers state-of-the-art ANC performance, especially under strong plant nonlinearity.
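The NMSE figures in the table follow the usual ANC convention, $\mathrm{NMSE_{dB}} = 10\log_{10}(\sum_n e^2(n) / \sum_n d^2(n))$; a minimal sketch of that metric (function name is illustrative):

```python
import numpy as np

def nmse_db(error, disturbance):
    """Normalized mean-square error in dB: 10*log10(||e||^2 / ||d||^2).
    More negative values indicate stronger noise attenuation."""
    return 10.0 * np.log10(np.sum(error**2) / np.sum(disturbance**2))
```

For example, a residual whose amplitude is one tenth of the disturbance scores -20 dB NMSE.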
6. Deployment and Extensions
- Deployment Characteristics: Fully causal, zero algorithmic delay, sample-wise streaming implementation.
- Reference Implementation: Public codebase and pretrained models are available for reproducibility and further work (https://github.com/Lu-Baihh/WaveNet-VNNs-for-ANC.git).
- Practical Extensions: Multi-channel ANC, experiments in real acoustic environments, model compression and quantization for resource-constrained embedded devices are suggested avenues.
- Significance: The integration of deep dilated convolutions with explicit Volterra nonlinear mappings enables robust ANC under difficult nonlinearities without sacrificing real-time guarantees or introducing latency (Bai et al., 6 Apr 2025).