
WaveNet-VNN: Causal Active Noise Control

Updated 3 February 2026
  • The paper presents a hybrid model that integrates causal dilated convolutions with explicit quadratic VNN blocks to directly model nonlinear ANC responses.
  • It employs a strict causal design with a 1024-sample receptive field and sample-wise streaming, ensuring zero algorithmic delay for real-time deployment.
  • Benchmarking shows that WaveNet-VNN outperforms traditional Wiener and adaptive filters in NMSE and dBA metrics across various noise conditions.

WaveNet-VNN refers to a fully causal, end-to-end neural framework for Active Noise Control (ANC) that integrates a dilated-convolutional WaveNet backbone with explicit Volterra Neural Network (VNN) blocks. This hybrid approach directly models nonlinear system responses, particularly those arising in loudspeaker-based ANC, while ensuring strict causal operation necessary for real-time deployment. The architecture and its benchmarking are detailed in "WaveNet-Volterra Neural Networks for Active Noise Control: A Fully Causal Approach" (Bai et al., 6 Apr 2025).

1. Architecture and Computational Structure

The WaveNet-VNN model combines deep causal temporal modeling with explicit quadratic nonlinearity through the following architectural elements:

  • Causal Dilated-Convolutional WaveNet Backbone:
    • Layers: 30 residual blocks, with the dilation factor doubling per layer (1, 2, 4, …, 512).
    • Kernel Size: 2, so each convolution operates on two consecutive time-steps.
    • Channels: In each block, the hidden state is split into filter and gate branches (e.g., 64 channels each).
    • Receptive Field: R = 1 + \sum_{\ell=0}^{29} d_\ell \cdot (k-1) = 1024 samples, strictly causal.
    • Residual and Skip Connections: Present at each block to facilitate gradient flow over long temporal contexts.
  • Volterra Neural Network (VNN) Blocks:
    • Positioned after the WaveNet trunk.
    • First-Order: A linear causal convolution of length K_1.
    • Second-Order: Four sets of pairwise products from causal convolutions of length K_2, forming the explicit quadratic terms.
    • Block Pipeline:

    \text{causal input conv} \rightarrow [30 \times \text{WaveNet block}] \rightarrow [4 \times \tanh] \rightarrow [3 \times \text{causal conv}] \rightarrow \text{VNN block} \rightarrow \text{output}

This synergy enables the mapping of linear and quadratic system behaviors without introducing any noncausal look-ahead.
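The receptive-field arithmetic can be checked directly. A minimal sketch, assuming the 1024-sample figure corresponds to one doubling cycle of dilations 1 through 512 with kernel size 2 (the exact grouping of the 30 blocks into dilation cycles is an assumption here, not restated from the paper):

```python
def receptive_field(dilations, kernel_size=2):
    """Receptive field of a stack of causal dilated convolutions:
    R = 1 + sum_l d_l * (k - 1), with no look-ahead."""
    return 1 + sum(d * (kernel_size - 1) for d in dilations)

# One doubling cycle of dilations 1, 2, 4, ..., 512 with kernel size 2
dilations = [2 ** i for i in range(10)]
print(receptive_field(dilations))  # -> 1024
```

Because every convolution is strictly causal, the receptive field extends only into the past, which is what makes the buffer of R-1 previous samples sufficient for streaming inference.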

2. Mathematical Formalism

  • Second-Order Volterra Series Expansion:

y[n] = \sum_{k=0}^{K_1-1} h_1[k]\, x[n-k] + \sum_{k_1=0}^{K_2-1} \sum_{k_2=0}^{K_2-1} h_2[k_1,k_2]\, x[n-k_1]\, x[n-k_2] + \mathcal{O}(x^3)
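For reference, the expansion can be evaluated naively with nested sums; a minimal pure-Python sketch (the kernels h1 and h2 are illustrative placeholders, and the cubic-and-higher terms are dropped):

```python
def volterra2(x, h1, h2):
    """Second-order causal Volterra output y[n] for input sequence x.
    h1: length-K1 linear kernel; h2: K2-by-K2 quadratic kernel."""
    K1, K2 = len(h1), len(h2)
    y = []
    for n in range(len(x)):
        lin = sum(h1[k] * x[n - k] for k in range(K1) if n - k >= 0)
        quad = sum(h2[k1][k2] * x[n - k1] * x[n - k2]
                   for k1 in range(K2) for k2 in range(K2)
                   if n - k1 >= 0 and n - k2 >= 0)
        y.append(lin + quad)
    return y
```

The double sum makes the quadratic term O(K_2^2) per sample, which is why the VNN blocks realize it with a small number of causal convolutions and pairwise products rather than an explicit kernel matrix.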

  • Causal Dilated Convolution (layer \ell with dilation d_\ell):

(W *_{d_\ell} x)[n] = \sum_{\tau=0}^{k-1} W[\tau]\, x[n - d_\ell \tau]

  • Gated Activation (WaveNet block \ell):

z_\ell[n] = \tanh(W_{f,\ell} *_{d_\ell} x[n]) \odot \sigma(W_{g,\ell} *_{d_\ell} x[n])

  • Combined Output Mapping:

\hat{y}[n] = F_\mathrm{WaveNet}(x_{n-R+1 \dots n}; \Theta_W) + F_\mathrm{VNN}(x_{n-K_2+1 \dots n}; \Theta_V)

  • Loss Function:

L(\Theta) = \frac{1}{2}\,\mathrm{NMSE}(e,d) + \frac{1}{2}\,L_\mathrm{dBA}(e) + \lambda \|\Theta\|_2^2

with

\mathrm{NMSE} = 10\log_{10}\frac{\sum_n e[n]^2}{\sum_n d[n]^2}, \quad L_\mathrm{dBA} = 10\log_{10}\frac{\sum_n w[n]\, e[n]^2}{p_\mathrm{ref}^2}

where \lambda is a small weight-decay coefficient.
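The NMSE term can be sketched directly from its definition; a minimal pure-Python illustration, with e the residual error and d the disturbance (the A-weighting filter w[n] for the dBA term is omitted here):

```python
import math

def nmse_db(e, d):
    """NMSE = 10 log10( sum_n e[n]^2 / sum_n d[n]^2 ), in dB."""
    return 10.0 * math.log10(sum(v * v for v in e) / sum(v * v for v in d))

# Halving the residual amplitude relative to the disturbance
# corresponds to roughly -6.02 dB.
print(round(nmse_db([0.5, -0.5], [1.0, -1.0]), 2))  # -> -6.02
```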

3. Causality, Latency, and Real-Time Operation

WaveNet-VNN is designed for strict causality:

  • Strictly Causal Convolutions: All computations are one-sided; only the current and past samples x[n-\tau], \tau \geq 0, are used.

  • No Look-ahead: Gated activations, residuals, and skip connections do not introduce future context.

  • Algorithmic Delay: Zero, except for the buffer required for the receptive field (R-1 samples).

  • Efficient Real-Time Inference: A circular buffer retains the last 1024 samples for each convolution, and the output is produced sample-by-sample for per-sample streaming.

  • Hardware Considerations: Deployable on modern CPUs and DSPs; supports mixed-precision or int8 quantization for further speedup under real-time constraints.
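The per-sample streaming pattern can be sketched with a ring buffer per dilated convolution; a minimal single-channel illustration (a hypothetical helper, not the paper's implementation):

```python
from collections import deque

class StreamingCausalConv:
    """Single-channel causal dilated convolution evaluated one sample
    at a time; tap w[tau] multiplies x[n - d * tau]."""
    def __init__(self, weights, dilation):
        self.w = list(weights)
        self.d = dilation
        span = dilation * (len(self.w) - 1) + 1      # history needed
        self.buf = deque([0.0] * span, maxlen=span)  # zero-padded past

    def step(self, x_n):
        self.buf.append(x_n)  # newest sample sits at the right end
        return sum(w * self.buf[-1 - self.d * t] for t, w in enumerate(self.w))

conv = StreamingCausalConv([1.0, 1.0], dilation=2)
print([conv.step(x) for x in [1.0, 2.0, 3.0, 4.0]])  # -> [1.0, 2.0, 4.0, 6.0]
```

Because each layer only reads samples already in its buffer, the chain of such units introduces no algorithmic look-ahead, matching the zero-delay claim above.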

4. Training Procedures and System Setup

  • Data:

    • Training: 26 hours of audio from DEMAND and MS-SNSD, segmented into 3 s clips at a 16 kHz sampling rate and normalized.
    • Evaluation: Factory1, babble, engine, factory2 (NOISEX-92), entirely unseen during training.
  • Loss Metric: Mean of NMSE (in dB) and A-weighted level (dBA).
  • Optimizer: Adam with \beta_1 = 0.9, \beta_2 = 0.999, learning rate 1 \times 10^{-3}, weight decay \lambda = 10^{-6}.
  • Batch: 16 segments of three seconds each per update, 30 epochs.
  • Nonlinearity in Plant Modeling: Scaled Error Function (SEF)

f_\mathrm{SEF}(y) = \int_0^y \exp\!\left(-x^2 / (2\eta^2)\right) dx, \quad \eta^2 \in \{\infty,\, 0.5,\, 0.1\}
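The SEF integral has a closed form in terms of the error function, f_\mathrm{SEF}(y) = \eta \sqrt{\pi/2}\, \mathrm{erf}(y / (\sqrt{2}\eta)), with \eta^2 = \infty reducing to the identity (linear plant); a minimal sketch:

```python
import math

def sef(y, eta2):
    """Scaled Error Function saturation; eta2 = inf is the linear case."""
    if math.isinf(eta2):
        return y
    eta = math.sqrt(eta2)
    # Closed form of int_0^y exp(-x^2 / (2 eta^2)) dx via erf
    return eta * math.sqrt(math.pi / 2.0) * math.erf(y / (math.sqrt(2.0) * eta))

# Smaller eta^2 means harder saturation: the output is capped near
# eta * sqrt(pi / 2) regardless of input level.
```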

5. Benchmarking, Baselines, and Results

A rigorous evaluation compares WaveNet-VNN against both traditional adaptive and deep learning ANC baselines:

| Baseline | NMSE (dB) / dBA (Babble, \eta^2 = \infty) | NMSE (dB) / dBA (Factory1, \eta^2 = 0.1) |
| --- | --- | --- |
| Wiener (2048 taps) | –36.00 / –30.79 | –12.22 / –12.20 |
| WaveNet–VNN | –41.04 / –34.48 | –27.94 / –26.38 |
  • Average Gain (across noise types and \eta^2 settings): +3.85 dBA and +5.91 dB (NMSE) over Wiener-2048.
  • Key Baselines: TD-FxLMS, THF-FxLMS, FD-FxNLMS, ODW-FDFeLMS, Wiener (512 & 2048), SPD-ANC, CRN.
  • Lessons: Earlier DNN-based ANC claims overstate superiority when compared only to low-order or simplified adaptive baselines. Fully optimized, high-order Wiener or adaptive filters remain strong, but the hybrid WaveNet–VNN architecture delivers state-of-the-art ANC performance, especially under strong plant nonlinearity.

6. Deployment and Extensions

  • Deployment Characteristics: Fully causal, zero algorithmic delay, sample-wise streaming implementation.
  • Reference Implementation: Public codebase and pretrained models are available for reproducibility and further work (https://github.com/Lu-Baihh/WaveNet-VNNs-for-ANC.git).
  • Practical Extensions: Multi-channel ANC, experiments in real acoustic environments, model compression and quantization for resource-constrained embedded devices are suggested avenues.
  • Significance: The integration of deep dilated convolutions with explicit Volterra nonlinear mappings enables robust ANC under difficult nonlinearities without sacrificing real-time guarantees or introducing latency (Bai et al., 6 Apr 2025).
References (1)

  1. Bai et al., "WaveNet-Volterra Neural Networks for Active Noise Control: A Fully Causal Approach," 6 Apr 2025.
