
WaveNet-VNN: Causal Active Noise Control

Updated 3 February 2026
  • The paper presents a hybrid model that integrates causal dilated convolutions with explicit quadratic VNN blocks to directly model nonlinear ANC responses.
  • It employs a strict causal design with a 1024-sample receptive field and sample-wise streaming, ensuring zero algorithmic delay for real-time deployment.
  • Benchmarking shows that WaveNet-VNN outperforms traditional Wiener and adaptive filters in NMSE and dBA metrics across various noise conditions.

WaveNet-VNN refers to a fully causal, end-to-end neural framework for Active Noise Control (ANC) that integrates a dilated-convolutional WaveNet backbone with explicit Volterra Neural Network (VNN) blocks. This hybrid approach directly models nonlinear system responses, particularly those arising in loudspeaker-based ANC, while ensuring strict causal operation necessary for real-time deployment. The architecture and its benchmarking are detailed in "WaveNet-Volterra Neural Networks for Active Noise Control: A Fully Causal Approach" (Bai et al., 6 Apr 2025).

1. Architecture and Computational Structure

The WaveNet-VNN model combines deep causal temporal modeling with explicit quadratic nonlinearity through the following architectural elements:

  • Causal Dilated-Convolutional WaveNet Backbone:
    • Layers: 30 residual blocks, with the dilation factor doubling per layer (1, 2, 4, …, 512).
    • Kernel Size: 2, so each convolution operates on two consecutive time-steps.
    • Channels: In each block, the hidden state is split into filter and gate branches (e.g., 64 channels each).
    • Receptive Field: R = 1 + \sum_{\ell=0}^{29} d_\ell \cdot (k-1) = 1024 samples, strictly causal.
    • Residual and Skip Connections: Present at each block to facilitate gradient flow over long temporal contexts.
  • Volterra Neural Network (VNN) Blocks:
    • Positioned after the WaveNet trunk.
    • First-Order: A linear causal convolution of length K_1.
    • Second-Order: Four sets of pairwise products from causal convolutions of length K_2, forming the explicit quadratic terms.
    • Block Pipeline:

    \text{causal input conv} \rightarrow [30 \times \text{WaveNet block}] \rightarrow [4 \times \tanh] \rightarrow [3 \times \text{causal conv}] \rightarrow \text{VNN block} \rightarrow \text{output}

This synergy enables the mapping of linear and quadratic system behaviors without introducing any noncausal look-ahead.
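The receptive-field arithmetic can be checked directly. A minimal sketch, assuming the 1024-sample figure corresponds to one doubling cycle of dilations 1 through 512 with kernel size 2 (the exact grouping of the 30 blocks into dilation cycles is an assumption here, not restated from the paper):

```python
def receptive_field(dilations, kernel_size=2):
    """Receptive field of a stack of causal dilated convolutions:
    R = 1 + sum_l d_l * (k - 1), with no look-ahead."""
    return 1 + sum(d * (kernel_size - 1) for d in dilations)

# One doubling cycle of dilations 1, 2, 4, ..., 512 with kernel size 2
dilations = [2 ** i for i in range(10)]
print(receptive_field(dilations))  # -> 1024
```

Because every convolution is strictly causal, the receptive field extends only into the past, which is what makes the buffer of R-1 previous samples sufficient for streaming inference.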

2. Mathematical Formalism

  • Second-Order Volterra Series Expansion:

y[n] = \sum_{k=0}^{K_1-1} h_1[k]\, x[n-k] + \sum_{k_1=0}^{K_2-1} \sum_{k_2=0}^{K_2-1} h_2[k_1,k_2]\, x[n-k_1]\, x[n-k_2] + \mathcal{O}(x^3)
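For reference, the expansion can be evaluated naively with nested sums; a minimal pure-Python sketch (the kernels h1 and h2 are illustrative placeholders, and the cubic-and-higher terms are dropped):

```python
def volterra2(x, h1, h2):
    """Second-order causal Volterra output y[n] for input sequence x.
    h1: length-K1 linear kernel; h2: K2-by-K2 quadratic kernel."""
    K1, K2 = len(h1), len(h2)
    y = []
    for n in range(len(x)):
        lin = sum(h1[k] * x[n - k] for k in range(K1) if n - k >= 0)
        quad = sum(h2[k1][k2] * x[n - k1] * x[n - k2]
                   for k1 in range(K2) for k2 in range(K2)
                   if n - k1 >= 0 and n - k2 >= 0)
        y.append(lin + quad)
    return y
```

The double sum makes the quadratic term O(K_2^2) per sample, which is why the VNN blocks realize it with a small number of causal convolutions and pairwise products rather than an explicit kernel matrix.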

  • Causal Dilated Convolution (layer \ell with dilation d_\ell):

(W *_{d_\ell} x)[n] = \sum_{\tau=0}^{k-1} W[\tau]\, x[n - d_\ell \tau]

  • Gated Activation (WaveNet block \ell):

z_\ell[n] = \tanh(W_{f,\ell} *_{d_\ell} x[n]) \odot \sigma(W_{g,\ell} *_{d_\ell} x[n])

  • Combined Output Mapping:

\hat{y}[n] = F_\mathrm{WaveNet}(x_{n-R+1 \dots n}; \Theta_W) + F_\mathrm{VNN}(x_{n-K_2+1 \dots n}; \Theta_V)

  • Loss Function:

L(\Theta) = \frac{1}{2}\,\mathrm{NMSE}(e,d) + \frac{1}{2}\,L_\mathrm{dBA}(e) + \lambda \|\Theta\|_2^2

with

\mathrm{NMSE} = 10\log_{10}\frac{\sum_n e[n]^2}{\sum_n d[n]^2}, \quad L_\mathrm{dBA} = 10\log_{10}\frac{\sum_n w[n]\, e[n]^2}{p_\mathrm{ref}^2}

where \lambda is a small weight-decay coefficient.
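The NMSE term can be sketched directly from its definition; a minimal pure-Python illustration, with e the residual error and d the disturbance (the A-weighting filter w[n] for the dBA term is omitted here):

```python
import math

def nmse_db(e, d):
    """NMSE = 10 log10( sum_n e[n]^2 / sum_n d[n]^2 ), in dB."""
    return 10.0 * math.log10(sum(v * v for v in e) / sum(v * v for v in d))

# Halving the residual amplitude relative to the disturbance
# corresponds to roughly -6.02 dB.
print(round(nmse_db([0.5, -0.5], [1.0, -1.0]), 2))  # -> -6.02
```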

3. Causality, Latency, and Real-Time Operation

WaveNet-VNN is designed for strict causality:

  • Strictly Causal Convolutions: All computations are one-sided; only the current and past samples x[n-\tau], \tau \geq 0, are used.

  • No Look-ahead: Gated activations, residuals, and skip connections do not introduce future context.

  • Algorithmic Delay: Zero, except for the buffer required for the receptive field (R-1 samples).

  • Efficient Real-Time Inference: A circular buffer retains the last 1024 samples for each convolution, and the output is produced sample-by-sample for per-sample streaming.

  • Hardware Considerations: Deployable on modern CPUs and DSPs; supports mixed-precision or int8 quantization for further speedup under real-time constraints.
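The per-sample streaming pattern can be sketched with a ring buffer per dilated convolution; a minimal single-channel illustration (a hypothetical helper, not the paper's implementation):

```python
from collections import deque

class StreamingCausalConv:
    """Single-channel causal dilated convolution evaluated one sample
    at a time; tap w[tau] multiplies x[n - d * tau]."""
    def __init__(self, weights, dilation):
        self.w = list(weights)
        self.d = dilation
        span = dilation * (len(self.w) - 1) + 1      # history needed
        self.buf = deque([0.0] * span, maxlen=span)  # zero-padded past

    def step(self, x_n):
        self.buf.append(x_n)  # newest sample sits at the right end
        return sum(w * self.buf[-1 - self.d * t] for t, w in enumerate(self.w))

conv = StreamingCausalConv([1.0, 1.0], dilation=2)
print([conv.step(x) for x in [1.0, 2.0, 3.0, 4.0]])  # -> [1.0, 2.0, 4.0, 6.0]
```

Because each layer only reads samples already in its buffer, the chain of such units introduces no algorithmic look-ahead, matching the zero-delay claim above.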

4. Training Procedures and System Setup

  • Data:

    • Training: 26 hours of audio from DEMAND and MS-SNSD, segmented into 3 s clips at a 16 kHz sampling rate and normalized.
    • Evaluation: Factory1, babble, engine, factory2 (NOISEX-92), entirely unseen during training.
  • Loss Metric: Mean of NMSE (in dB) and A-weighted level (dBA).
  • Optimizer: Adam with \beta_1 = 0.9, \beta_2 = 0.999, learning rate 1 \times 10^{-3}, weight decay \lambda = 10^{-6}.
  • Batch: 16 segments of three seconds each per update, 30 epochs.
  • Nonlinearity in Plant Modeling: Scaled Error Function (SEF)

f_\mathrm{SEF}(y) = \int_0^y \exp\!\left(-x^2 / (2\eta^2)\right) dx, \quad \eta^2 \in \{\infty,\, 0.5,\, 0.1\}
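The SEF integral has a closed form in terms of the error function, f_\mathrm{SEF}(y) = \eta \sqrt{\pi/2}\, \mathrm{erf}(y / (\sqrt{2}\eta)), with \eta^2 = \infty reducing to the identity (linear plant); a minimal sketch:

```python
import math

def sef(y, eta2):
    """Scaled Error Function saturation; eta2 = inf is the linear case."""
    if math.isinf(eta2):
        return y
    eta = math.sqrt(eta2)
    # Closed form of int_0^y exp(-x^2 / (2 eta^2)) dx via erf
    return eta * math.sqrt(math.pi / 2.0) * math.erf(y / (math.sqrt(2.0) * eta))

# Smaller eta^2 means harder saturation: the output is capped near
# eta * sqrt(pi / 2) regardless of input level.
```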

5. Benchmarking, Baselines, and Results

A rigorous evaluation compares WaveNet-VNN against both traditional adaptive and deep learning ANC baselines:

| Baseline | NMSE (dB) / dBA (Babble, \eta^2 = \infty) | NMSE (dB) / dBA (Factory1, \eta^2 = 0.1) |
| --- | --- | --- |
| Wiener (2048 taps) | –36.00 / –30.79 | –12.22 / –12.20 |
| WaveNet–VNN | –41.04 / –34.48 | –27.94 / –26.38 |
  • Average Gain (across noise types and \eta^2 settings): +3.85 dBA and +5.91 dB (NMSE) over Wiener-2048.
  • Key Baselines: TD-FxLMS, THF-FxLMS, FD-FxNLMS, ODW-FDFeLMS, Wiener (512 & 2048), SPD-ANC, CRN.
  • Lessons: Earlier DNN-based ANC claims overstate superiority when compared only to low-order or simplified adaptive baselines. Fully optimized, high-order Wiener or adaptive filters remain strong, but the hybrid WaveNet–VNN architecture delivers state-of-the-art ANC performance, especially under strong plant nonlinearity.

6. Deployment and Extensions

  • Deployment Characteristics: Fully causal, zero algorithmic delay, sample-wise streaming implementation.
  • Reference Implementation: Public codebase and pretrained models are available for reproducibility and further work (https://github.com/Lu-Baihh/WaveNet-VNNs-for-ANC.git).
  • Practical Extensions: Multi-channel ANC, experiments in real acoustic environments, model compression and quantization for resource-constrained embedded devices are suggested avenues.
  • Significance: The integration of deep dilated convolutions with explicit Volterra nonlinear mappings enables robust ANC under difficult nonlinearities without sacrificing real-time guarantees or introducing latency (Bai et al., 6 Apr 2025).
References (1)

  1. Bai et al., "WaveNet-Volterra Neural Networks for Active Noise Control: A Fully Causal Approach," 6 Apr 2025.
