Power-of-two bin aliasing in TurboAngle

Establish whether the non-monotonic perplexity degradations observed at power-of-two angle bin counts (n = 2^k) in TurboAngle arise from algebraic aliasing between two structures: the uniform angle quantization grid on the unit circle, and the quadrant-like partitions induced by the butterfly structure of the normalized Fast Walsh–Hadamard Transform (FWHT) when it is preceded by a random ±1 diagonal rotation. Under this hypothesis, the aliasing makes quantization errors coherent rather than independent.

Background

The paper reports a non-monotonic effect in TinyLlama: certain power-of-two angle bin counts (e.g., n = 64) yield worse perplexity than nearby non-power-of-two bin counts, even though TurboAngle applies uniform angle quantization after a random-sign diagonal rotation and an FWHT.

The authors conjecture that this is due to an interaction between the quantization grid and the FWHT butterfly structure that aligns bin boundaries with structural quadrants, potentially making quantization errors coherent rather than independent. Proving or refuting this mechanism would clarify when and why power-of-two bin counts should be avoided.
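To make the conjectured mechanism concrete, the pipeline described above can be sketched in plain Python: a random ±1 diagonal, a normalized FWHT, and uniform angle quantization with n bins. This is a minimal sketch under stated assumptions, not the paper's implementation. In particular, pairing consecutive transformed coordinates into 2D points, snapping to bin centres, and the names `fwht_normalized` and `angle_errors` are all hypothetical choices made for illustration; the source does not specify them.

```python
import math
import random


def fwht_normalized(x):
    """Normalized fast Walsh-Hadamard transform of a power-of-two-length list."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    y = list(x)
    h = 1
    while h < n:  # log2(n) butterfly stages
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # makes the transform orthonormal
    return [v * scale for v in y]


def angle_errors(x, signs, n_bins):
    """Signed angle quantization error for each coordinate pair.

    ASSUMPTION: consecutive outputs (0,1), (2,3), ... form the 2D points whose
    angles are quantized, and each angle is snapped to its bin centre.
    """
    if len(x) != len(signs) or len(x) < 2:
        raise ValueError("x and signs must have equal length >= 2")
    two_pi = 2.0 * math.pi
    width = two_pi / n_bins
    # Random-sign diagonal rotation followed by the normalized FWHT.
    rotated = fwht_normalized([s * v for s, v in zip(signs, x)])
    errors = []
    for i in range(0, len(rotated), 2):
        theta = math.atan2(rotated[i + 1], rotated[i]) % two_pi
        idx = math.floor(theta / width) % n_bins
        centre = (idx + 0.5) * width
        # Wrap the difference into [-pi, pi) so errors near 0 / 2*pi stay small.
        errors.append((centre - theta + math.pi) % two_pi - math.pi)
    return errors


if __name__ == "__main__":
    rng = random.Random(0)
    dim = 64
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # placeholder input, not real KV data
    for n_bins in (63, 64, 65):
        errs = angle_errors(x, signs, n_bins)
        print(n_bins, max(abs(e) for e in errs))
```

One way to probe the conjecture with such a sketch would be to collect `angle_errors` over many real key or value vectors and compare statistics that separate coherent from independent errors (for example, the correlation between errors of different pairs) at n = 64 against neighbouring non-power-of-two counts. Which inputs would expose the effect, if any, is not stated in the source; i.i.d. Gaussian inputs like the placeholder above stay i.i.d. Gaussian under an orthonormal transform, so they may well show nothing.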

References

We conjecture this arises from algebraic aliasing between the quantization grid and the Hadamard butterfly structure, where $n = 2^k$ causes quantization boundaries to align with the quadrant structure produced by butterfly stages, producing coherent rather than independent errors.

TurboAngle: Near-Lossless KV Cache Compression via Uniform Angle Quantization (2603.27467 - Patel, 29 Mar 2026) in Section 6, Non-Monotone Behavior