The CutCircuit class - Walkthrough

This walkthrough traces the path through the key methods of CutCircuit, and explains the internal attributes and data structures as they are populated and consumed. It follows the example below.

from cutqc2.library.sample import sample_circuit
from cutqc2.core.cut_circuit import CutCircuit


if __name__ == "__main__":
    cut_circuit = CutCircuit(sample_circuit)

    cut_circuit.cut(
        max_subcircuit_width=3,  # Max qubits per subcircuit
        max_cuts=1,  # Max total cuts in the circuit
        num_subcircuits=[2],  # Desired number of subcircuits to try
    )
    print(cut_circuit)

    for subcircuit in cut_circuit:
        print(subcircuit)
        print()

    cut_circuit.run_subcircuits()
    
    cut_circuit.to_file("tutorial.zarr")
    cut_circuit = CutCircuit.from_file("tutorial.zarr")

    cut_circuit.postprocess(capacity=3, max_recursion=9)

    probabilities = cut_circuit.get_probabilities()

    error = cut_circuit.verify(probabilities, raise_error=False)
    print(f"Verification error: {error}")

    cut_circuit.plot(plot_ground_truth=True)

1) Initialization

Constructor: CutCircuit(circuit | circuit_qasm3)

  • Inputs:

    • circuit: a qiskit.QuantumCircuit; if None, a qasm3 str representation of the circuit using the the circuit_qasm3 argument must be provided.

  • cut_circuit = CutCircuit(circuit) validates the circuit (single component, no classical bits, no barriers, ≤ 2-qubit ops). and initializes the following attributes:

    • self.circuit: a copy of the input circuit.

    • self.circuit_with_cut_gates: the input circuit with placeholder cut gates, useful for visualization. This is initially a blank QuantumCircuit with the same quantum registers, and is later populated when cut() is called.

    • self.num_cuts: 0 (number of cuts inserted - initially zero)

    • self.subcircuits: [] (list of QuantumCircuit subcircuits - initially empty)

    • self.subcircuit_dagedges: [] (list of lists of DAGEdge that define subcircuits). DAGEdge objects are abstractions that represent a two-qubit gate and its associated wire and gate indices.

    • self.complete_path_map: {} (wire_index → [(subcircuit_i, subcircuit_wire_index), …]). This maps each original wire to the sequence of subcircuits and local qubits it traverses.

    • self.dynamic_definition: None. This will later hold a DynamicDefinition object that manages the recursive reconstruction process.

Some other internal attributes that are initialized are:

  • self.inter_wire_dag: a DAG containing only two-qubit gates (for cut search).

  • self.inter_wire_dag_metadata: metadata for self.inter_wire_dag:

    • n_vertices: number of two-qubit vertices

    • edges: edges among those vertices

    • id_to_dag_edge: mapping from integer vertex ids to DAGEdge objects.

2) Cutting the circuit

Method: cut(max_subcircuit_width, max_cuts, num_subcircuits, subcircuits=None)

  • This step calls find_cuts() to search a feasible partition:

    • Internally, it uses self.inter_wire_dag_metadata and the MIPCutSearcher class, which is a MIP solver to find a partition of the two-qubit gates into subcircuits that respect certain constraints.

    • Returns list[list[DAGEdge]] where each inner list is the list of DAGEdge objects (two-qubit gates) assigned to that subcircuit.

  • Once the gates are assigned to subcircuits, the cut() method:

    • Traverses the original circuit topologically and constructs, for each subcircuit:

      • subcircuit_instructions[subcircuit_i]: list of Instructions mapped to that subcircuit and local wire indices.

      • When the active subcircuit on a wire changes, inserts a WireCutGate() into self.circuit_with_cut_gates and increments self.num_cuts.

    • Once parsed, it builds actual QuantumCircuit objects for each subcircuit from subcircuit_instructions.

    • complete_path_map[wire_index] [(subcircuit_i, subcircuit_wire_index), ...]: the path a qubit takes across subcircuits.

  • Towards the end of cut():

    • populate_compute_graph():

      • This step creates self.compute_graph (nodes and edges):

        • For each subcircuit j: node attributes include

          • effective: the number of effective qubits (reduced by outgoing cuts)

          • rho and O counts (incoming/outgoing cut roles)

        • For each wire path in complete_path_map, adds edges between adjacent (upstream, downstream) subcircuits with attributes:

          • O_qubit: local outgoing qubit index in upstream subcircuit

          • rho_qubit: local incoming qubit index in downstream subcircuit

    • populate_subcircuit_entries():

      • For each subcircuit node, this step enumerates incident edges and I/X/Y/Z choices, and builds:

        • self.subcircuit_entries[subcircuit_idx]: mapping of (initialization_labels, measurement_labels) to list of coefficient-weighted terms;

        • self.subcircuit_instances[subcircuit_idx]: deduplicated list of all (initialization, measurement) tuples needed to run for that subcircuit.

      • Why this matters:

        • This defines what experiments must be run per subcircuit to gather probability vectors needed for reconstruction.

Compute Graph

See the Circuit-Cutting Implementation Details tutorial for more details on the Computation Graph.

Result:

  • self.subcircuits: list of QuantumCircuit subcircuits.

  • self.circuit_with_cut_gates: the original circuit with // markers showing cuts.

  • self.complete_path_map: how original wires traverse subcircuits and local qubits.

What you’ll see:

  • print(cut_circuit) now prints self.circuit_with_cut_gates and shows // where cuts were inserted.

  • Iterating over cut_circuit using for subcircuit in cut_circuit: iterates through each subcircuit, which can then be printed out.

3) Executing subcircuits and packing probabilities

Method: run_subcircuits(subcircuits=None, backend="statevector_simulator")

  • Runs each subcircuit’s instances:

    • Uses utility functions run_subcircuit_instances to execute and attribute_shots to map measured results to entries. This step fills:

      • self.subcircuit_entry_probs[subcircuit_idx][(init, meas)] = 2^k-length probability vector

      • self.subcircuit_packed_probs[subcircuit_idx] = get_packed_probabilities(subcircuit_idx)

    • The method: get_packed_probabilities(subcircuit_i, qubit_spec=None), for each subcircuit:

      • Builds a dense tensor with one axis of length 4 per incident edge (I/X/Y/Z), and a final axis of length 2^k where k is the effective qubits for that subcircuit.

      • If qubit_spec is provided for that subcircuit, applies merge_prob_vector to reduce the last axis. (This functionality is used in compute_probabilities() later).

After run_subcircuits():

  • self.subcircuit_entry_probs and self.subcircuit_packed_probs are populated and ready for the reconstruction phase.

4) Save and load

Method: to_file(filepath | None, ...)

  • Saves the cut circuit and reconstruction artifacts to .zarr. If filepath is None, uses a default scheme including qubit count, cut count, subcircuit count, and a timestamp.

Method: from_file(filepath)

  • Loads a saved CutCircuit from .zarr, rehydrating its state for postprocessing or plotting.

5) Reconstruction setup

When you call postprocess(capacity, max_recursion), where capacity is the max active qubits allowed at any step, and max_recursion is the max depth of recursive refinement:

  • It computes the flow structure across all subcircuits, and maps incoming positions to outgoing positions in a flattened layout used during aggregation.

  • Builds self.dynamic_definition = DynamicDefinition(num_qubits, capacity, prob_fn=self.compute_probabilities)

  • Calls self.dynamic_definition.run(max_recursion=...)

    • This triggers the first call to compute_probabilities(qubit_spec=None) for an all-active (up to the desired capacity) initial qubit spec, and then recursive bin refinement, calling back to compute_probabilities with refined qubit specs in each recursion.

What qubit_spec means here

The term qubit_spec is used throughout the codebase. This is a string over {"A","M","0","1"} across all effective qubits (in an internal, consistent order). "A" keeps a qubit active (resolved), "M" allows merging (marginalizing), and “0”/”1” fixes it to that state.

6) Computing probabilities (MPI-aware core)

Method: compute_probabilities(qubit_spec=None)

  • This method breaks down the problem:

    • Partitions the global qubit_spec into per-subcircuit qubit spec.

    • Builds a local subcircuit_packed_probs dictionary for all subcircuits, applying any merging implied by the per-subcircuit qubit spec.

    • Defines the total “work” as the Cartesian product across all input cut-edge Pauli choices: total_work = n_basis ** sum(self.in_degrees) with n_basis=4.

  • Parallel execution:

    • Rank 0 operates as a scheduler; workers (ranks > 0) receive chunks of initialization tuples and return partial probability vectors.

    • Scheduler (Rank 0):

      • Slices the iterator of work into balanced chunks.

      • Sends [active_qubits, subcircuit_packed_probs, work_chunk] to each worker with MPI_WORK_TAG.

      • Receives partial results with MPI_RESULT_TAG and accumulates them.

      • Sends MPI_DONE_TAG to each worker when no more chunks remain.

    • Worker (Rank > 0):

      • Receives a message; if MPI_DONE_TAG, breaks; otherwise, calls _compute_probabilities(...) on the chunk and replies with the partial result.

_compute_probabilities

Method: _compute_probabilities(active_qubits, subcircuit_packed_probs, initializations_list, ...)

  • This is the core GPU-aware computational kernel.

  • This step computes the partial probability vector for a given list of initialization tuples.

  • For each initialization tuple, it constructs the per-subcircuit index (init_slice, meas_slice, Ellipsis) to pull the probability vector from subcircuit_packed_probs.

  • Finally, it combines subcircuits left-to-right using vector_kron (a Python wrapper over a CUDA kernel) to form a single 2^active_qubits vector.

At the end:

  • Rank 0 divides by 2**self.num_cuts (normalization of the stitched distribution), broadcasts to all ranks, and returns the vector.

Why this design

  • A single call to compute_probabilities can be very expensive; distributing embarrassingly parallel initialization work scales linearly across ranks.

  • Since each rank calls a CUDA kernel, this step can utilize multiple GPUs across nodes.

7) Plotting and verifying probabilities

Method: get_probabilities(full_states=None, quasi=False)

  • If full_states is None, creates all integers [0, ..., 2^n - 1].

  • Computes the reconstruction bit permutation:

    • perm = self.reconstruction_flat_qubit_order()

      • Internally:

        • Builds a flat list (by greedy subcircuit order) of original qubit indices that the downstream reconstruction yields.

        • Converts it into a permutation vector so we can permute bits correctly. The permuted indices are stored as permuted_indices.

  • Calls self.dynamic_definition.probabilities(full_states=permuted_indices) to sum all bins into requested output indices; optionally converts quasi to real probabilities.

Method: verify(probabilities, backend="statevector_simulator", atol=1e-10, raise_error=True)

  • Computes ground truth via evaluate_circ(self.circuit, backend) and returns a normalized MSE error. Raises an Exception (optional) if error is above tolerance.

Method: plot(plot_ground_truth=False, full_states=None, output_file=None)

  • Plots reconstructed probabilities, optionally overlaying ground truth.

Note

Verifying probabilities is only possible for small circuits where the full statevector can be simulated. Plotting is possible for larger circuits, as long as full_states is judiciously selected and not too large.