Runtime overhead of the FWHT encode/decode path
Measure the runtime overhead that TurboAngle’s FWHT-based encode/decode path adds under realistic batch sizes and sequence lengths, and quantify its effect on inference latency and throughput.
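As a starting point, a measurement along these lines could be sketched as follows. This is a minimal CPU-only illustration, not TurboAngle's implementation: `fwht` is a generic unnormalized fast Walsh-Hadamard transform written in NumPy, `time_roundtrip` is a hypothetical helper, the head dimension of 128 and the batch/sequence grid are assumed values, and the angle quantization step itself is omitted. A realistic study would time the actual GPU kernels with proper device synchronization.

```python
import time
import numpy as np


def fwht(x: np.ndarray) -> np.ndarray:
    """Unnormalized fast Walsh-Hadamard transform along the last axis."""
    n = x.shape[-1]
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("last dimension must be a power of two")
    lead = x.shape[:-1]
    y = x.astype(np.float32, copy=True)
    h = 1
    while h < n:
        # View the last axis as (blocks, 2, h): the two halves of each block
        # are the butterfly partners at stride h.
        y = y.reshape(*lead, n // (2 * h), 2, h)
        a = y[..., 0, :] + y[..., 1, :]
        b = y[..., 0, :] - y[..., 1, :]
        y = np.stack((a, b), axis=-2).reshape(*lead, n)
        h *= 2
    return y


def time_roundtrip(batch: int, seq_len: int, head_dim: int = 128, repeats: int = 5) -> float:
    """Best-of-`repeats` wall time (seconds) for a forward plus inverse FWHT.

    The two transforms stand in for the encode and decode paths; the
    quantization between them is deliberately left out (assumption).
    """
    rng = np.random.default_rng(0)
    x = rng.standard_normal((batch, seq_len, head_dim), dtype=np.float32)
    fwht(x)  # warm-up run, excluded from timing
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        y = fwht(x)              # stand-in for encode
        _ = fwht(y) / head_dim   # stand-in for decode (inverse transform)
        best = min(best, time.perf_counter() - t0)
    return best


if __name__ == "__main__":
    # Assumed grid of batch sizes and sequence lengths; adjust to the workload.
    for batch in (1, 8):
        for seq_len in (512, 2048):
            t = time_roundtrip(batch, seq_len)
            vectors_per_s = batch * seq_len / t
            print(f"batch={batch} seq={seq_len}: {t * 1e3:.2f} ms, {vectors_per_s:,.0f} vectors/s")
```

Reporting both the per-call time and the vectors-per-second rate covers the latency and throughput sides of the question; comparing against the same loop without the transform would isolate the overhead itself.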
References
Runtime overhead of the FWHT encode/decode path has not been measured under realistic batch and sequence sizes.
— TurboAngle: Near-Lossless KV Cache Compression via Uniform Angle Quantization
(2603.27467 - Patel, 29 Mar 2026) in Conclusion — Limitations