- The paper presents Qdislib, an open-source library that uses graph-based methods to partition large quantum circuits into manageable subcircuits.
- It reports near-linear speedups as the node count increases when executing Hardware Efficient Ansatz (HEA) circuits, and runs workloads across CPUs, GPUs, and QPUs.
- The integration with PyCOMPSs and the FindCut algorithm enables a scalable, hybrid quantum-classical workflow with minimized computational overhead.
This paper presents the development and evaluation of Qdislib, an open-source library designed to facilitate distributed quantum circuit cutting, integrated with hybrid quantum-classical high-performance computing systems. The library aims to address current quantum hardware limitations by efficiently partitioning large quantum circuits into smaller subcircuits, compatible with modern quantum platforms. It leverages a graph-based representation for scalable manipulation and execution across multiple computing resources, including CPUs, GPUs, and quantum processing units (QPUs).
Quantum Computing Fundamentals
Quantum computers compute with qubits, which exploit properties such as superposition and entanglement. Qubits are manipulated within quantum circuits by unitary operations represented as quantum gates. Simulating quantum systems on classical computers is computationally intensive because the state of n qubits is described by 2^n complex amplitudes, so the vectors and matrices involved grow exponentially with the qubit count [nielsen2000quantum]. This cost is a central obstacle when studying circuits at the scale relevant to quantum advantage and calls for more sophisticated simulation techniques.
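To make the exponential cost concrete, the following minimal sketch estimates the memory needed to hold a dense state vector. It assumes one double-precision complex amplitude (16 bytes) per basis state, which is a common simulator default and not a figure taken from the paper; the function name `statevector_bytes` is illustrative.

```python
def statevector_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory for a dense state vector holding 2**num_qubits amplitudes.

    Assumption: 16 bytes per amplitude (complex128); not a value from the paper.
    """
    if num_qubits < 0:
        raise ValueError("num_qubits must be non-negative")
    return (2 ** num_qubits) * bytes_per_amplitude


# Print the requirement in GiB for a few circuit widths.
for n in (20, 30, 40):
    gib = statevector_bytes(n) / 2**30
    print(f"{n} qubits: {gib:,.3f} GiB")
```

Each additional qubit doubles the requirement, which is why splitting a circuit into narrower subcircuits can bring simulation back within reach of a single node.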
Quantum Circuit Cutting
Circuit cutting, in the form of wire cutting and gate cutting, is a key technique for running larger quantum computations on limited hardware. Wire cutting severs a qubit wire so that the circuit separates into independent subcircuits (Figure 1). Gate cutting decomposes an entangling gate into local operations, at the cost of additional measurements to reconstruct the result (Figure 2) [mitarai2021constructing]. Both methods scale poorly with the number of cuts: the number of subcircuits to execute grows exponentially with the cut count, so the overhead of cutting must be weighed against its benefit.
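Wire cutting is commonly justified by the fact that any single-qubit state can be expanded in the Pauli basis, rho = 1/2 * sum over P in {I, X, Y, Z} of Tr(P rho) P. The upstream subcircuit supplies the traces as measured expectation values, and the downstream subcircuit is re-initialized from eigenstates of the same operators. This is the standard textbook identity rather than a formula quoted from the paper, and the sketch below only checks it numerically for one qubit; all helper names are illustrative.

```python
# Single-qubit Pauli matrices as nested lists (pure Python, no NumPy).
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}


def trace_of_product(a, b):
    """Tr(a @ b) for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def rebuild_from_pauli_terms(rho):
    """Reassemble rho as 1/2 * sum_P Tr(P rho) P."""
    rebuilt = [[0j, 0j], [0j, 0j]]
    for pauli in PAULIS.values():
        coeff = trace_of_product(pauli, rho) / 2
        for i in range(2):
            for j in range(2):
                rebuilt[i][j] += coeff * pauli[i][j]
    return rebuilt


def matrices_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


# Example state: the +1 eigenstate of Y.
rho = [[0.5, -0.5j], [0.5j, 0.5]]
print(matrices_close(rebuild_from_pauli_terms(rho), rho))
```

The four terms contributed by each cut wire are one source of the growth in subcircuit variants as cuts accumulate.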
Figure 1: Example of a partitioned quantum circuit.
Figure 2: Decomposition for cutting a Controlled-Z gate. Six subcircuits are required to reconstruct this gate.
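The growth in workload with the number of cuts can be estimated with a short calculation. The sketch below assumes every cut gate is a Controlled-Z with the six-subcircuit decomposition of Figure 2, so k gate cuts give 6^k term combinations to evaluate and recombine. The per-wire-cut factor is left as a required parameter because this summary does not state it. Function and parameter names are illustrative.

```python
def reconstruction_terms(gate_cuts, wire_cuts=0, terms_per_gate_cut=6,
                         terms_per_wire_cut=None):
    """Number of term combinations needed to reconstruct a cut circuit.

    Assumptions: six terms per cut CZ gate (Figure 2); the wire-cut factor
    must be supplied by the caller because the source does not give it.
    """
    if gate_cuts < 0 or wire_cuts < 0:
        raise ValueError("cut counts must be non-negative")
    if wire_cuts and terms_per_wire_cut is None:
        raise ValueError("terms_per_wire_cut is required when wire_cuts > 0")
    total = terms_per_gate_cut ** gate_cuts
    if wire_cuts:
        total *= terms_per_wire_cut ** wire_cuts
    return total


# Term combinations for one to four CZ gate cuts.
for k in range(1, 5):
    print(k, reconstruction_terms(k))
```

For example, four CZ gate cuts give 6^4 = 1296 combinations. Each combination can be evaluated independently, which is what makes the workload amenable to distributed execution.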
Qdislib Overview
Qdislib employs a graph-based approach to represent quantum circuits as Directed Acyclic Graphs (DAGs), facilitating transformation and execution across various computing environments, including both simulation and real quantum hardware. It implements wire and gate cutting techniques through a streamlined workflow (Figure 3), and integrates with PyCOMPSs for parallel execution across CPUs, GPUs, and QPUs [tejedor2017pycompss].
Figure 3: Circuit Cutting Qdislib workflow.
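As a rough illustration of the graph-based representation, the sketch below builds a DAG from a gate list in pure Python: each gate is a node, and an edge links consecutive gates acting on the same qubit. In this view a wire cut removes an edge, and the remaining connected components are the subcircuits. This is not Qdislib's API; `circuit_to_dag`, `components`, and the gate-tuple format are hypothetical.

```python
def circuit_to_dag(gates):
    """Return edges (src_gate, dst_gate, qubit) for a list of (name, qubits)."""
    last_gate_on_qubit = {}
    edges = []
    for index, (_name, qubits) in enumerate(gates):
        for qubit in qubits:
            if qubit in last_gate_on_qubit:
                edges.append((last_gate_on_qubit[qubit], index, qubit))
            last_gate_on_qubit[qubit] = index
    return edges


def components(num_gates, edges):
    """Group gate indices into weakly connected components (union-find)."""
    parent = list(range(num_gates))

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for src, dst, _qubit in edges:
        parent[find(src)] = find(dst)

    groups = {}
    for node in range(num_gates):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Hypothetical 3-qubit circuit, gates listed in program order.
gates = [("h", (0,)), ("cz", (0, 1)), ("cz", (1, 2)), ("rz", (2,))]
edges = circuit_to_dag(gates)

# Wire cut: drop the qubit-1 wire between the two CZ gates.
cut_edges = [edge for edge in edges if edge != (1, 2, 1)]

print(components(len(gates), edges))
print(components(len(gates), cut_edges))
```

Before the cut all four gates form a single component; after it, the gates split into two groups that can be executed separately.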
FindCut Algorithm
The FindCut algorithm automates partitioning by identifying optimal cutting sites under user-defined constraints, such as the maximum qubit count per subcircuit (Figure 4). It employs well-known graph partitioning techniques such as Kernighan-Lin and METIS, and weighs three objectives: minimizing the number of cuts, maximizing the number of components, and minimizing the number of qubits in each subcircuit.
Figure 4: FindCut workflow in Qdislib.
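The kind of search FindCut automates can be pictured on a toy example. The sketch below is not FindCut and does not use Kernighan-Lin or METIS: it swaps in an exhaustive search, practical only for small circuits, that splits the qubits into two groups no larger than a given limit while cutting as few two-qubit gates as possible. The function name, arguments, and the example circuit are all hypothetical.

```python
from itertools import combinations


def best_bisection(num_qubits, two_qubit_gates, max_qubits):
    """Exhaustively find the two-way qubit split that cuts the fewest gates.

    Brute-force stand-in for Kernighan-Lin/METIS.
    Returns (cut_count, left_qubits, right_qubits), or None if no split
    keeps both sides within max_qubits.
    """
    best = None
    smallest = max(1, num_qubits - max_qubits)
    largest = min(max_qubits, num_qubits - 1)
    for size in range(smallest, largest + 1):
        for group in combinations(range(num_qubits), size):
            if 0 not in group:
                continue  # pin qubit 0 to the left side to skip mirror splits
            left = set(group)
            # A gate is cut when its two qubits land on different sides.
            cut = sum((a in left) != (b in left) for a, b in two_qubit_gates)
            if best is None or cut < best[0]:
                right = set(range(num_qubits)) - left
                best = (cut, sorted(left), sorted(right))
    return best


# Hypothetical 6-qubit circuit: two layers of nearest-neighbour CZ gates.
layer = [(q, q + 1) for q in range(5)]
print(best_bisection(6, layer * 2, max_qubits=3))
```

Heuristics such as Kernighan-Lin and METIS exist precisely because this exhaustive search grows combinatorially with the qubit count.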
Evaluation
Qdislib's performance was evaluated using Hardware Efficient Ansatz (HEA) circuits, demonstrating scalability across distributed CPU and GPU environments. Speedups were close to linear as the node count increased, and the library was more efficient on larger circuits, which is attributed to how execution cost scales with subcircuit size (Figures 5-10). A hybrid execution paradigm combining CPU simulations with cloud-based QPU executions further illustrated the orchestration of heterogeneous computing resources.
Figure 5: Hybrid execution schema.
Figure 6: Execution time for HEA circuits with 4 cuts.
Figure 7: Speedup for the 96-qubit HEA circuit with 4 cuts.
Figure 8: Execution time for HEA circuits varying depth.
Figure 9: Execution time for HEA circuits varying cuts.
Figure 10: Execution times for 64-qubit and 96-qubit circuits with 4 cuts on CPU (112 cores) versus GPU (4 GPUs).
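Speedup figures of this kind are usually derived from execution times with two simple ratios, sketched below. The timings in the example are placeholders chosen only to exercise the functions and are not measurements from the paper; the function names are illustrative.

```python
def speedup(baseline_seconds, parallel_seconds):
    """Ratio of the baseline time to the time on more nodes."""
    if baseline_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / parallel_seconds


def parallel_efficiency(baseline_seconds, parallel_seconds,
                        baseline_nodes, nodes):
    """Achieved speedup divided by the ideal (linear) speedup."""
    ideal = nodes / baseline_nodes
    return speedup(baseline_seconds, parallel_seconds) / ideal


# Placeholder timings (NOT from the paper): seconds on 1, 2, 4 and 8 nodes.
timings = {1: 800.0, 2: 410.0, 4: 215.0, 8: 118.0}
for nodes, seconds in timings.items():
    s = speedup(timings[1], seconds)
    e = parallel_efficiency(timings[1], seconds, 1, nodes)
    print(f"{nodes} nodes: speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency close to 1.0 across node counts corresponds to the near-linear scaling described above.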
Previous approaches, such as FitCut [kan2024scalable], have also tackled scalable quantum circuit partitioning, but rely on static scheduling algorithms rather than the dynamic scheduling that PyCOMPSs enables. Other methods, including CutQC [tang2021cutqc], focus on parallelizing the reconstruction phase. Qdislib differs in integrating heterogeneous computing platforms and in its flexible graph-based approach.
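The difference between static and dynamic scheduling can be sketched with a generic task pool. The example below uses Python's standard `concurrent.futures` purely as a stand-in and is not PyCOMPSs code: tasks of uneven cost are submitted up front, and whichever worker becomes idle takes the next pending task instead of following a precomputed assignment. `run_subcircuit` is a hypothetical placeholder for simulating or executing one subcircuit.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_subcircuit(task_id, cost):
    """Placeholder for one subcircuit execution.

    Assumption: `cost` stands in for the amount of work; a real task would
    call a simulator or a QPU backend here.
    """
    return task_id, sum(range(cost))


def run_dynamically(costs, workers=4):
    """Submit all tasks and gather results as workers finish them."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_subcircuit, i, c)
                   for i, c in enumerate(costs)]
        # Results arrive in completion order, not submission order.
        for future in as_completed(futures):
            task_id, value = future.result()
            results[task_id] = value
    return results


results = run_dynamically([10, 1000, 10, 100])
print(sorted(results))
```

With subcircuits of uneven cost, this style of scheduling avoids leaving workers idle while a fixed assignment waits on its slowest task.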
Conclusions
Qdislib makes larger quantum circuits tractable by combining efficient circuit cutting with orchestration of hybrid quantum-classical executions. Its ability to manage distributed resources underscores its potential for broader applications in the quantum computing domain [Web:COMPSs]. As quantum hardware evolves, scalable frameworks of this kind will be important for pursuing quantum advantage and supporting quantum algorithm development. Future directions include expanding the library's compatibility with additional quantum computing tools and refining the cutting protocols to reduce computational overhead.