Joint design and trade-off space for composing SSD with EAGLE and token-tree speculation
Investigate the joint design and performance trade-offs of integrating Speculative Speculative Decoding (SSD) with EAGLE-style draft models and token-tree speculative decoding. Specifically, determine how three mechanisms should be coordinated when SSD is composed with EAGLE and tree-based speculation: verification-outcome prediction (fan-out allocation), cache-aware sampling for residual control, and fallback strategies. The goal is to maximize end-to-end speedup while preserving the lossless correctness guarantee across batch sizes and sampling temperatures.
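For context on what "lossless" means here, the sketch below shows the standard single-token speculative sampling rule (accept a drafted token with probability min(1, p/q), otherwise resample from the normalized residual max(0, p - q)). It is not taken from the SSD paper and does not model fan-out allocation, caching, or token trees; the function names are hypothetical and chosen only for illustration.

```python
import random


def residual_distribution(p, q):
    """Normalized max(0, p - q): the distribution sampled after a rejection."""
    raw = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(raw)
    if total == 0.0:
        # p equals q, so a rejection can never occur; return p as a safe default.
        return list(p)
    return [r / total for r in raw]


def verify_token(token, p, q, rng):
    """Check one drafted token against the target distribution.

    p: target-model probabilities, q: draft-model probabilities (q[token] > 0).
    Returns (accepted, emitted_token).
    """
    accept_prob = min(1.0, p[token] / q[token])
    if rng.random() < accept_prob:
        return True, token
    # Rejected: emit a replacement drawn from the residual distribution.
    res = residual_distribution(p, q)
    emitted = rng.choices(range(len(p)), weights=res, k=1)[0]
    return False, emitted


def output_distribution(p, q):
    """Exact distribution of the emitted token when the draft is sampled from q."""
    # Token i is drafted and accepted with mass q[i] * min(1, p[i] / q[i]) = min(p[i], q[i]).
    accepted_mass = [min(pi, qi) for pi, qi in zip(p, q)]
    reject_mass = 1.0 - sum(accepted_mass)
    res = residual_distribution(p, q)
    return [a + reject_mass * r for a, r in zip(accepted_mass, res)]


if __name__ == "__main__":
    target = [0.5, 0.3, 0.2]  # illustrative numbers, not from any experiment
    draft = [0.2, 0.5, 0.3]
    rng = random.Random(0)
    drafted = rng.choices(range(3), weights=draft, k=1)[0]
    print(verify_token(drafted, target, draft, rng))
    print(output_distribution(target, draft))
```

`output_distribution` computes the emitted-token distribution in closed form, and it coincides with the target distribution regardless of the draft. Any composition of SSD with EAGLE or tree-based speculation would have to keep an analogous property for the combined procedure to remain lossless.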
References
Much remains open. SSD composes naturally with EAGLE and token-tree speculation (Appendix \ref{app:combine}); the joint design and tradeoff space is largely unexplored.
— Speculative Speculative Decoding
(2603.03251 - Kumar et al., 3 Mar 2026) in Conclusion and Limitations, Section 6