Multi-qubit Toffoli with exponentially fewer T gates
Published 8 Oct 2025 in quant-ph | (2510.07223v1)
Abstract: Prior work of Beverland et al. has shown that any exact Clifford+$T$ implementation of the $n$-qubit Toffoli gate must use at least $n$ $T$ gates. Here we show how to get away with exponentially fewer $T$ gates, at the cost of incurring a tiny $1/\mathrm{poly}(n)$ error that can be neglected in most practical situations. More precisely, the $n$-qubit Toffoli gate can be implemented to within error $\epsilon$ in the diamond distance by a randomly chosen Clifford+$T$ circuit with at most $O(\log(1/\epsilon))$ $T$ gates. We also give a matching $\Omega(\log(1/\epsilon))$ lower bound that establishes optimality, and we show that any purely unitary implementation achieving even constant error must use $\Omega(n)$ $T$ gates. We also extend our sampling technique to implement other Boolean functions. Finally, we describe upper and lower bounds on the $T$-count of Boolean functions in terms of non-adaptive parity decision tree complexity and its randomized analogue.
The paper demonstrates that approximate n-qubit Toffoli gates can be implemented with O(log(1/ϵ)) T gates, drastically reducing resource requirements compared to Ω(n) in exact circuits.
It leverages randomized fingerprinting and mixed Clifford+T models to achieve optimality, and generalizes techniques to approximating other Boolean functions via their Fourier 1-norm.
The work has significant implications for fault-tolerant quantum compilation, simulation, and resource estimation by dramatically lowering non-Clifford gate overhead.
Overview
This paper establishes that the $n$-qubit Toffoli gate, a central primitive in quantum algorithms, can be approximated to within error $\epsilon$ in diamond distance using only $O(\log(1/\epsilon))$ $T$ gates in the mixed Clifford+$T$ circuit model. This is a dramatic reduction from the previously established lower bound of $\Omega(n)$ $T$ gates for exact implementations. The authors prove a matching lower bound, which establishes optimality, and generalize their techniques to other Boolean functions, relating $T$-count to the Fourier 1-norm and to non-adaptive parity decision tree complexity.
Clifford+T Circuit Models and Error Metrics
The work distinguishes three models for Clifford+T circuits:
Unitary Clifford+T circuits: Standard circuits with T-count as the number of T gates used.
Mixed Clifford+T circuits: Probabilistic mixtures over unitary Clifford+T circuits, with the classical compiler sampling the circuit to run.
Adaptive Clifford+T circuits: Circuits with mid-circuit measurements and classical feed-forward.
Approximation is measured in diamond distance, which quantifies the worst-case distinguishability between quantum channels.
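For reference, one common way to write the diamond distance between two channels on the same system is sketched below. The symbols $\Phi$, $\Psi$, $\rho$ and the identity channel $\mathrm{id}$ on an ancilla are notation introduced here for illustration. Some authors put a factor of $1/2$ in front, and this summary does not state which normalization the paper uses.

```latex
\[
  \|\Phi - \Psi\|_\diamond
  \;=\; \max_{\rho}\,
  \bigl\| (\Phi \otimes \mathrm{id})(\rho) - (\Psi \otimes \mathrm{id})(\rho) \bigr\|_1 ,
\]
% The maximum ranges over density matrices \rho on the system together with
% an ancilla of the same dimension; \|\cdot\|_1 is the trace norm.
```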
Main Result: Approximate Toffoli with Logarithmic T-Count
The $n$-qubit Toffoli gate, $\mathrm{Toff}_n$, can be approximated by a mixed Clifford+$T$ circuit with $O(\log(1/\epsilon))$ $T$ gates. The construction leverages randomized fingerprinting: the $n$-bit OR function is approximated by the OR of $k$ random parity functions, each computable by Clifford gates. The only non-Clifford operation is a small $\mathrm{OR}_k$ gate, implemented as a $\mathrm{Toff}_{k+1}$ gate.
Algorithmic Construction
Sample $k = \lceil \log(1/\epsilon) \rceil$ random subsets $S_1, \ldots, S_k \subseteq [n-1]$.
The resulting mixed circuit is ϵ-close to Toffn in diamond distance.
This approach is advantageous for n≫log(1/ϵ), as the T-count becomes independent of n.
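The fingerprinting idea behind this construction can be illustrated classically. The sketch below is not the paper's circuit: it only simulates the classical function that the random parities compute. It assumes the control bits are given as a list `x`, and that AND is obtained from OR by negating the inputs and the output. The helper names `sample_subsets`, `fingerprint_or`, and `approx_and` are hypothetical. It checks the per-input failure probability, which is a weaker statement than a diamond-distance bound.

```python
import random


def sample_subsets(m, k, rng):
    """Draw k uniformly random subsets of range(m), each as a 0/1 indicator list."""
    return [[rng.randint(0, 1) for _ in range(m)] for _ in range(k)]


def fingerprint_or(x, subsets):
    """OR of the k parities sum_{i in S_j} x_i mod 2 (each parity is Clifford-computable)."""
    return int(any(sum(s * b for s, b in zip(S, x)) % 2 for S in subsets))


def approx_and(x, subsets):
    """Approximate AND(x) as NOT OR(NOT x), with OR replaced by its fingerprint."""
    flipped = [1 - b for b in x]
    return 1 - fingerprint_or(flipped, subsets)


if __name__ == "__main__":
    rng = random.Random(1)
    m, k = 20, 4  # m control bits, k random parities (illustrative values)
    x = [1] * m
    x[7] = 0  # AND(x) = 0, so any output of 1 is an error
    trials = 10000
    errors = sum(approx_and(x, sample_subsets(m, k, rng)) for _ in range(trials))
    print("empirical error rate:", errors / trials, "vs 2**-k =", 2 ** -k)
```

The error is one-sided. When every control bit is 1, all parities of the negated input vanish and the output is always correct. Otherwise each random parity detects the nonzero negated input with probability 1/2, so all k miss it with probability 2^-k, independent of the number of controls.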
Lower Bounds and Optimality
Any unitary Clifford+$T$ circuit (even with constant error) requires $\Omega(n)$ $T$ gates for $\mathrm{Toff}_n$.
Any mixed or adaptive Clifford+$T$ circuit requires $\Omega(\min\{n, \log(1/\epsilon)\})$ $T$ gates, matching the upper bound.
Generalization to Boolean Functions
The technique generalizes to any Boolean function $f$ with small Fourier 1-norm. The $T$-count to approximate $U_f$ (the unitary that computes $f$ reversibly) is $O(\|\hat{f}\|_1^2 \log(1/\epsilon))$, where $\|\hat{f}\|_1$ is the sum of absolute values of the Fourier coefficients of $f$.
Sampling-Based Approximation
Sample $k$ subsets $S_i$ from the distribution $p(S) = |\hat{f}(S)| / \|\hat{f}\|_1$.
Compute the sum of the sampled parities, each weighted by the sign of its Fourier coefficient, threshold the sum, and reversibly compute the result.
The $T$-count is $O(k)$, with $k = O(\|\hat{f}\|_1^2 \log(1/\epsilon))$.
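The sampling steps above can be sketched classically for a small function. This is an illustration under stated assumptions, not the paper's circuit. It assumes $f$ takes values in $\{+1, -1\}$, computes Fourier coefficients by brute force (feasible only for small $n$), and thresholds the signed sum at zero. The names `parity_char`, `fourier_coefficients`, `sample_estimator`, and the example function `and_pm` are hypothetical.

```python
import itertools
import random


def parity_char(S, x):
    """chi_S(x) = (-1)^(sum of x_i for i in S)."""
    return -1 if sum(x[i] for i in S) % 2 else 1


def fourier_coefficients(f, n):
    """Brute-force Fourier coefficients of f: {0,1}^n -> {+1,-1}, keyed by subset."""
    points = list(itertools.product([0, 1], repeat=n))
    subsets = [S for r in range(n + 1) for S in itertools.combinations(range(n), r)]
    return {S: sum(f(x) * parity_char(S, x) for x in points) / len(points)
            for S in subsets}


def sample_estimator(coeffs, k, rng):
    """Sample k subsets with probability |f^(S)| / ||f^||_1 and return
    (||f^||_1, approx_f), where approx_f thresholds the signed sum of parities."""
    subsets = list(coeffs)
    weights = [abs(coeffs[S]) for S in subsets]
    norm1 = sum(weights)
    chosen = rng.choices(subsets, weights=weights, k=k)
    signs = [1 if coeffs[S] >= 0 else -1 for S in chosen]

    def approx_f(x):
        total = sum(s * parity_char(S, x) for s, S in zip(signs, chosen))
        return 1 if total >= 0 else -1

    return norm1, approx_f


def and_pm(x):
    """Example function: AND in the +1/-1 convention (-1 means 'true')."""
    return -1 if all(x) else 1


if __name__ == "__main__":
    coeffs = fourier_coefficients(and_pm, 3)
    norm1, approx_f = sample_estimator(coeffs, 201, random.Random(0))
    print("Fourier 1-norm:", norm1)
    for x in itertools.product([0, 1], repeat=3):
        print(x, and_pm(x), approx_f(x))
```

Each sampled term, scaled by the 1-norm, has expectation $f(x)$. A standard concentration argument then suggests that on the order of $\|\hat{f}\|_1^2 \log(1/\epsilon)$ samples make the thresholded sum agree with $f(x)$ except with probability about $\epsilon$.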
Parity Decision Tree Complexity and T-Count
The T-count for implementing Uf is tightly related to the non-adaptive parity decision tree complexity (PDT) and its randomized analogue (RPDT):
$T_\epsilon(U_f) \ge \mathrm{PDT}^{\mathrm{na}}(f) - 1$ for unitary circuits.
$T_\epsilon(U_f) \ge \mathrm{RPDT}_\epsilon(f) - 1$ for mixed circuits.
The gate complexity of the decision tree in turn upper-bounds the $T$-count.
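For small functions, the deterministic non-adaptive quantity can be computed by brute force. The sketch below relies on a standard characterization that this summary does not state: the fewest parities that determine $f$ non-adaptively equals the dimension, over GF(2), of the span of the Fourier support of $f$. The function names `fourier_support`, `gf2_rank`, and `nonadaptive_pdt` are hypothetical. Subsets are encoded as bitmasks and $f$ is assumed to return 0 or 1.

```python
import itertools


def fourier_support(f, n):
    """Bitmasks S whose (unnormalized) Fourier coefficient of f is nonzero."""
    points = list(itertools.product([0, 1], repeat=n))
    support = []
    for mask in range(2 ** n):
        coeff = sum(
            f(x) * (-1) ** sum(x[i] for i in range(n) if (mask >> i) & 1)
            for x in points
        )
        if coeff != 0:  # exact, since f returns integers
            support.append(mask)
    return support


def gf2_rank(masks):
    """Rank over GF(2) of bit-vectors given as integers (XOR-basis elimination)."""
    basis = []
    for v in masks:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b from v if it is set
        if v:
            basis.append(v)
    return len(basis)


def nonadaptive_pdt(f, n):
    """Dimension of the span of the Fourier support of f."""
    return gf2_rank(fourier_support(f, n))


if __name__ == "__main__":
    print(nonadaptive_pdt(lambda x: int(all(x)), 4))                     # AND on 4 bits
    print(nonadaptive_pdt(lambda x: (x[0] ^ x[1]) & (x[2] ^ x[3]), 4))   # depends on 2 parities
```

Under this characterization, AND on $n$ bits has full dimension $n$, which agrees with the linear lower bound for unitary circuits stated above.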
Implications and Applications
Practical Quantum Compilation
For large n, multi-qubit Toffoli gates can be implemented with exponentially fewer T gates, significantly reducing resource requirements in fault-tolerant quantum computation.
The result is directly applicable to Grover's diffusion operator and other gates Clifford-equivalent to Toffoli.
Simulation and Learning
Circuits with large Toffoli gates, previously considered "high magic," can be efficiently simulated or learned when small error is acceptable.
The non-robustness of the Clifford hierarchy to error is highlighted: high-level gates can be approximated by low-level gates in the hierarchy.
Boolean Function Implementation
Functions such as $\mathrm{OR}_n$, Hamming weight threshold, codeword membership, and matrix equality can be implemented with $O(\log(1/\epsilon))$ $T$ gates.
Functions like $\mathrm{GT}_n$, $\mathrm{ADD}_n$, $\mathrm{MAJ}_n$, and $\mathrm{INC}_n$ require $\Omega(n)$ $T$ gates, even for constant error.
Theoretical Implications
The separation between exact and approximate T-count for Toffoli is exponential.
The connection between T-count and Fourier 1-norm, as well as parity decision tree complexity, provides new structural insights into quantum circuit synthesis.
The results suggest that error-tolerant quantum algorithms can be compiled with dramatically reduced non-Clifford resources.
Future Directions
Extending lower bounds for T-count in the presence of error for adaptive circuits remains open.
Further exploration of the relationship between communication complexity and T-count may yield new lower bounds for other classes of functions.
Investigation into hardware support for mixed and adaptive circuit models could enable practical deployment of these compilation strategies.
Conclusion
This work demonstrates that approximate implementation of multi-qubit Toffoli gates and other Boolean functions can be achieved with exponentially fewer T gates than previously thought, provided small error is acceptable. The results are optimal within the mixed and adaptive Clifford+T circuit models and have significant implications for quantum algorithm compilation, simulation, and resource estimation. The connection to Fourier analysis and decision tree complexity opens new avenues for both theoretical and practical advances in quantum circuit synthesis.