CutQC2 zarr file format

All relevant information produced during CutQC2 processing (the original circuit specification, cut locations, subcircuit probability values, and the final reconstructed probabilities) is stored in a single zarr file. A zarr file is a directory containing multiple JSON files and binary data files. It can be stored on disk or in cloud storage, supports parallel read and write operations, and allows efficient access to subsets of the data without loading everything into memory. It can be accessed from Python, C++ and Rust, among other languages.

For a detailed description of the format, see the Zarr documentation.

Creating the zarr file

The supremacy_6qubit.sh script in the examples/scripts folder contains a command that generates and cuts a 6-qubit supremacy circuit.

[1]:
!cutqc2 cut \
  --file supremacy_6qubit.qasm3 \
  --max-subcircuit-width 5 \
  --max-subcircuit-cuts 10 \
  --subcircuit-size-imbalance 2 \
  --max-cuts 10 \
  --num-subcircuits 3 \
  --output-file supremacy_6qubit.zarr
(INFO) (base_tasks.py) (18-Sep-25 09:42:17) Pass: UnrollCustomDefinitions - 0.10800 (ms)
(INFO) (base_tasks.py) (18-Sep-25 09:42:17) Pass: BasisTranslator - 0.03314 (ms)
(INFO) (cut_circuit.py) (18-Sep-25 09:42:17) Trying with 3 subcircuits
Set parameter Username
(INFO) (cutter.py) (18-Sep-25 09:42:17) Set parameter Username
Set parameter LicenseID to value 2646086
(INFO) (cutter.py) (18-Sep-25 09:42:17) Set parameter LicenseID to value 2646086
Academic license - for non-commercial use only - expires 2026-04-01
(INFO) (cutter.py) (18-Sep-25 09:42:17) Academic license - for non-commercial use only - expires 2026-04-01
(INFO) (cut_circuit.py) (18-Sep-25 09:42:17) Running subcircuit 0 on backend: statevector_simulator
(INFO) (cut_circuit.py) (18-Sep-25 09:42:17) Running subcircuit 1 on backend: statevector_simulator
(INFO) (cut_circuit.py) (18-Sep-25 09:42:17) Running subcircuit 2 on backend: statevector_simulator
/media/vineetb/delta/projects/cutqc2/.venv/lib/python3.12/site-packages/zarr/core/dtype/npy/structured.py:318: UnstableSpecificationWarning: The data type (Structured(fields=(('subcircuit', Int32(endianness='little')), ('qubit', Int32(endianness='little'))))) does not have a Zarr V3 specification. That means that the representation of arrays saved with this data type may change without warning in a future version of Zarr Python. Arrays stored with this data type may be unreadable by other Zarr libraries. Use this data type at your own risk! Check https://github.com/zarr-developers/zarr-extensions/tree/main/data-types for the status of data type specifications for Zarr V3.
  v3_unstable_dtype_warning(self)
/media/vineetb/delta/projects/cutqc2/.venv/lib/python3.12/site-packages/zarr/core/dtype/npy/string.py:248: UnstableSpecificationWarning: The data type (FixedLengthUTF32(length=20, endianness='little')) does not have a Zarr V3 specification. That means that the representation of arrays saved with this data type may change without warning in a future version of Zarr Python. Arrays stored with this data type may be unreadable by other Zarr libraries. Use this data type at your own risk! Check https://github.com/zarr-developers/zarr-extensions/tree/main/data-types for the status of data type specifications for Zarr V3.
  v3_unstable_dtype_warning(self)

Run the above command to generate the supremacy_6qubit.zarr “file”.

Inspecting the zarr file

In a .zarr file (which is actually a folder), metadata is stored in zarr.json files at various levels of the directory structure.

[2]:
!cat supremacy_6qubit.zarr/zarr.json
{
  "attributes": {
    "version": "0.0.7",
    "circuit_qasm": "OPENQASM 3.0;\ninclude \"stdgates.inc\";\nqubit[6] q;\nh q[0];\nh q[1];\nh q[2];\nh q[3];\nh q[4];\nh q[5];\ncz q[0], q[1];\ncz q[4], q[5];\nt q[2];\nt q[3];\ncz q[2], q[4];\nry(pi/2) q[0];\nry(pi/2) q[1];\nry(pi/2) q[5];\nt q[0];\nt q[1];\nt q[5];\nry(pi/2) q[2];\nry(pi/2) q[4];\ncz q[0], q[2];\nt q[4];\ncz q[2], q[3];\nrx(pi/2) q[0];\ncz q[3], q[5];\nt q[0];\nrx(pi/2) q[2];\nt q[2];\nry(pi/2) q[3];\nry(pi/2) q[5];\ncz q[1], q[3];\nt q[5];\nh q[0];\nh q[1];\nh q[2];\nh q[3];\nh q[4];\nh q[5];\n"
  },
  "zarr_format": 3,
  "consolidated_metadata": null,
  "node_type": "group"
}

.zarr files are easily accessed and manipulated in Python, but to keep this discussion language-agnostic, we will use the jq tool to inspect and explain the contents of the zarr.json files. jq is a command-line tool commonly available on Linux and macOS systems that lets us extract arbitrary fields from JSON data.
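For readers who prefer Python, the directory layout can also be explored with the standard library alone. The following is a minimal sketch and not part of CutQC2: list_nodes is a hypothetical helper, and the small example.zarr store it builds is a stand-in so that the code runs without any real CutQC2 output. It relies only on the Zarr v3 convention that every node has a zarr.json file with a node_type field.

```python
import json
import tempfile
from pathlib import Path


def list_nodes(store: Path) -> dict[str, str]:
    """Map each node path (relative to the store root) to its node_type."""
    nodes = {}
    for meta_file in sorted(store.rglob("zarr.json")):
        meta = json.loads(meta_file.read_text())
        rel = meta_file.parent.relative_to(store).as_posix()
        key = "/" if rel == "." else "/" + rel
        nodes[key] = meta.get("node_type", "unknown")
    return nodes


# Stand-in store (an assumption for illustration only). Point `store` at a
# real supremacy_6qubit.zarr directory to inspect actual CutQC2 output.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "example.zarr"
    folders = [store, store / "subcircuits", store / "subcircuits" / "0"]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
        group_meta = {"zarr_format": 3, "node_type": "group", "attributes": {}}
        (folder / "zarr.json").write_text(json.dumps(group_meta))
    nodes = list_nodes(store)

for path, node_type in nodes.items():
    print(path, node_type)
```

Reading the JSON directly keeps the sketch free of third-party dependencies; in practice a Zarr library would usually be the more convenient way to open the store.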

Top-level metadata

The root zarr.json file contains metadata about the zarr file itself, including the version number of CutQC2 used to create it.

[3]:
!jq -r ".attributes.version" supremacy_6qubit.zarr/zarr.json
0.0.7

Note that a .zarr file created by one version of CutQC2 may not be readable by another version, at least until the project reaches version 1.x.
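A reader can guard against this by checking the stored version before touching any data. The sketch below is one possible policy and an assumption, not CutQC2's actual behavior: it requires an exact match while either version is 0.x and only the same major version afterwards. The helpers parse_version and is_readable are hypothetical, and the store is a stand-in created in a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a purely numeric dotted version such as '0.0.7' into (0, 0, 7)."""
    return tuple(int(part) for part in text.split("."))


def is_readable(store: Path, reader_version: str) -> bool:
    """Assumed policy: exact match before 1.x, same major version after."""
    meta = json.loads((store / "zarr.json").read_text())
    stored = parse_version(meta["attributes"]["version"])
    reader = parse_version(reader_version)
    if stored[0] == 0 or reader[0] == 0:
        return stored == reader
    return stored[0] == reader[0]


# Stand-in root metadata carrying only the version attribute.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "example.zarr"
    store.mkdir()
    root_meta = {
        "zarr_format": 3,
        "node_type": "group",
        "attributes": {"version": "0.0.7"},
    }
    (store / "zarr.json").write_text(json.dumps(root_meta))
    same = is_readable(store, "0.0.7")
    newer = is_readable(store, "0.0.8")

print(same, newer)
```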

Similarly, the original circuit in QASM format is stored in the circuit_qasm attribute.

[4]:
!jq -r ".attributes.circuit_qasm" supremacy_6qubit.zarr/zarr.json | head -n 5
OPENQASM 3.0;
include "stdgates.inc";
qubit[6] q;
h q[0];
h q[1];

Subcircuits

Number of subcircuits

[5]:
!jq -r ".attributes.n" supremacy_6qubit.zarr/subcircuits/zarr.json
3

Subcircuits are numbered from 0 to n-1, where n is the number of subcircuits.

Subcircuit metadata

The OpenQASM 3 representation of each subcircuit is stored as the qasm attribute in the zarr.json file of its numbered folder.

[6]:
!jq -r ".attributes.qasm" supremacy_6qubit.zarr/subcircuits/0/zarr.json | head -n 5
OPENQASM 3.0;
include "stdgates.inc";
qubit[4] q;
h q[0];
h q[1];
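The numbering scheme makes it straightforward to collect every subcircuit in a loop: read n from subcircuits/zarr.json, then visit folders 0 to n-1. The following standard-library sketch illustrates this under stated assumptions: subcircuit_qasm is a hypothetical helper, and the store it reads is a stand-in with two placeholder subcircuits rather than real CutQC2 output.

```python
import json
import tempfile
from pathlib import Path


def subcircuit_qasm(store: Path) -> list[str]:
    """Return the qasm attribute of subcircuits 0..n-1, in order."""
    root = store / "subcircuits"
    n = json.loads((root / "zarr.json").read_text())["attributes"]["n"]
    programs = []
    for i in range(n):
        meta = json.loads((root / str(i) / "zarr.json").read_text())
        programs.append(meta["attributes"]["qasm"])
    return programs


def write_group(folder: Path, attributes: dict) -> None:
    """Write a minimal Zarr v3 group metadata file (stand-in data only)."""
    folder.mkdir(parents=True, exist_ok=True)
    meta = {"zarr_format": 3, "node_type": "group", "attributes": attributes}
    (folder / "zarr.json").write_text(json.dumps(meta))


# Stand-in store with two placeholder subcircuits.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "example.zarr"
    write_group(store / "subcircuits", {"n": 2})
    for i in range(2):
        qasm = f"OPENQASM 3.0;\nqubit[{i + 1}] q;\nh q[0];\n"
        write_group(store / "subcircuits" / str(i), {"qasm": qasm})
    programs = subcircuit_qasm(store)

for i, program in enumerate(programs):
    print(i, program.splitlines()[1])
```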

Subcircuit probabilities

The probabilities for each subcircuit are stored in the packed_probabilities array in its numbered folder. Let’s inspect the shape of the probability values for subcircuit 0.

[7]:
!jq -r ".shape" supremacy_6qubit.zarr/subcircuits/0/packed_probabilities/zarr.json
[
  4,
  4,
  4,
  8
]
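The shape also tells us how much data the array holds. As a minimal sketch, the code below reads shape from an array's zarr.json and computes the total number of elements, together with the number of chunks implied by the Zarr v3 chunk_grid field. The array_summary helper is hypothetical, and the metadata is a stand-in: the shape matches the output above, but the chunk_shape and data_type values are invented for illustration and are not taken from a real CutQC2 file.

```python
import json
import math
import tempfile
from pathlib import Path


def array_summary(array_dir: Path) -> dict:
    """Summarize shape, element count and chunk count of a Zarr v3 array."""
    meta = json.loads((array_dir / "zarr.json").read_text())
    shape = meta["shape"]
    chunk_shape = meta["chunk_grid"]["configuration"]["chunk_shape"]
    chunks_per_axis = [math.ceil(s / c) for s, c in zip(shape, chunk_shape)]
    return {
        "shape": shape,
        "elements": math.prod(shape),
        "chunks": math.prod(chunks_per_axis),
    }


# Stand-in array metadata; chunk_shape and data_type are assumptions.
with tempfile.TemporaryDirectory() as tmp:
    array_dir = Path(tmp) / "packed_probabilities"
    array_dir.mkdir()
    array_meta = {
        "zarr_format": 3,
        "node_type": "array",
        "shape": [4, 4, 4, 8],
        "data_type": "float64",
        "chunk_grid": {
            "name": "regular",
            "configuration": {"chunk_shape": [2, 4, 4, 8]},
        },
    }
    (array_dir / "zarr.json").write_text(json.dumps(array_meta))
    summary = array_summary(array_dir)

print(summary)
```

For the shape shown above, the element count is 4 × 4 × 4 × 8 = 512. Because the data is stored chunk by chunk, a reader that needs only part of the array can fetch just the chunks covering that part.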