Every quantum software stack today treats the QPU as an accelerator — a coprocessor slotted into a classical architecture. The classical system remains the skeleton. The QPU is attached where convenient.
QUASI inverts this. The QPU is the centre. The entire software architecture is designed from one question: what does the QPU need to be maximally productive?
CPUs, compilers, GPUs — they handle stabilising, statefulness, execution. They serve the QPU. Not the other way around.
The QPU is not faster. It is qualitatively different.
It draws on resources no classical system has: superposition, entanglement, interference. But it is fragile. It needs context. It needs translation. It is like a wildly productive mind that needs fifteen people in the studio to make sure every utterance translates into consequential action.
This is not how the industry builds quantum software. The industry starts with what exists — Python SDKs, classical schedulers, runtime error mitigation — and bolts QPU access on top. It is pragmatic. It ships.
QUASI starts from the goal and works backward. What does a system look like that was designed for quantum-first computing? Not adapted from classical. Not incrementally extended. Designed. That means: physics-native input, compile-time noise rejection, enforced hardware abstraction, zero classical dependencies in the critical path.
This page compares QUASI's architecture with existing quantum computing stacks, layer by layer. Each comparison follows the same structure: compilation, hardware abstraction, task orchestration, workflow representation, error handling.
Based on Seelam et al., “Reference Architecture of a Quantum-Centric Supercomputer,” arXiv:2603.10970, March 2026. Authored by IBM Quantum leadership (Gambetta, Chow, Sheldon, et al.).
IBM proposes a three-phase roadmap for integrating QPUs into HPC clusters alongside GPUs and CPUs. Key abstractions: Tensor Compute Graph (TCG), Quantum Systems API (QSA), QRMI (Slurm plugin), and three coupling modes (batch, near-time, real-time).
| | IBM QCSC | QUASI |
|---|---|---|
| Language | Python (Qiskit SDK) | Rust (Afana) |
| Input | Qiskit circuit objects | Ehrenfest CBOR binary |
| Output | Vendor-specific ISA | OpenQASM 2.0 / 3.0 |
| Noise | Runtime mitigation (TEM, Pauli propagation) | Compile-time rejection (type-level noise constraints) |
| Dependencies | Qiskit, NumPy, SciPy, vendor plugins | ciborium, quizx, serde, clap — no Python |
| Deployment | Python environment + pip | Single static Rust binary |
Key difference: IBM mitigates noise after execution. Afana rejects programs that violate their noise budget before compilation completes. A rejected program consumes no QPU time, so doomed circuits never reach the hardware queue.
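Key difference: IBM’s compilation depends on Qiskit, a Python framework with heavy transitive dependencies. Afana ships as a single static Rust binary: its crate dependencies (ciborium, quizx, serde, clap) are linked in at build time, so the target node needs no Python environment and no package installation. In HPC environments, dependency-free deployment matters.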
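The compile-time rejection described above can be illustrated with a minimal sketch. It assumes a deliberately crude noise model in which every gate succeeds independently; the names `BackendNoise`, `estimate_fidelity`, `emit_if_feasible`, and the error rates are hypothetical and are not Afana's actual API (Afana itself is written in Rust; Python is used here only for brevity).

```python
from dataclasses import dataclass


class NoiseBudgetError(Exception):
    """Raised when a circuit cannot meet its declared noise budget."""


@dataclass(frozen=True)
class BackendNoise:
    # Hypothetical average error rates per gate type for one backend.
    one_qubit_error: float
    two_qubit_error: float


def estimate_fidelity(n_1q: int, n_2q: int, noise: BackendNoise) -> float:
    """Crude estimate: each gate succeeds independently of the others."""
    return (1 - noise.one_qubit_error) ** n_1q * (1 - noise.two_qubit_error) ** n_2q


def emit_if_feasible(n_1q: int, n_2q: int, min_fidelity: float, noise: BackendNoise) -> str:
    """Return compiled output only if the estimate meets the budget."""
    fidelity = estimate_fidelity(n_1q, n_2q, noise)
    if fidelity < min_fidelity:
        # Refuse to emit anything: the program never reaches a QPU.
        raise NoiseBudgetError(
            f"estimated fidelity {fidelity:.3f} is below the budget {min_fidelity:.3f}"
        )
    return "OPENQASM 3.0;"  # placeholder for the real compiled program
```

The point of the design is where the check sits: the failure is a compiler error, not a low-quality result discovered after a job has run.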
| | IBM QCSC | QUASI |
|---|---|---|
| Name | Quantum Systems API (QSA) | HAL Contract |
| Interface | Unspecified (“potentially vendor-portable”) | POST /hal/jobs (REST, vendor-neutral) |
| Neutrality | Aspirational | Enforced — compiler cannot address hardware directly |
| Enforcement | Architectural intent | CI boundary check (compiler-boundary job) |
IBM: Qiskit → (vendor transpiler) → QPU
QUASI: Ehrenfest → Afana → OpenQASM → HAL Contract → driver → QPU
Key difference: QSA is described as “potentially vendor-portable.” HAL Contract is vendor-portable by construction — there is no code path from Afana to any hardware API. Adding a new vendor requires only a new HAL driver, not compiler changes.
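As a rough illustration of what a single vendor-neutral entry point looks like from the client side, the sketch below builds (but does not send) a request to `POST /hal/jobs`. Only the path comes from the comparison above. The host name, the JSON field names (`program`, `format`, `shots`), and the helper `build_hal_job_request` are assumptions made for this example, not the documented HAL Contract schema.

```python
import json
import urllib.request


def build_hal_job_request(base_url: str, qasm: str, shots: int) -> urllib.request.Request:
    """Build, without sending, a job submission for POST /hal/jobs."""
    # Field names below are illustrative assumptions, not the real schema.
    body = json.dumps({"program": qasm, "format": "openqasm3", "shots": shots}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/hal/jobs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_hal_job_request("https://hal.example.org", "OPENQASM 3.0;", shots=1000)
```

Nothing in the request names a vendor: the choice of hardware is resolved behind the endpoint, which is what keeps the compiler free of hardware APIs.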
| | IBM QCSC | QUASI |
|---|---|---|
| Scheduler | Slurm + QRMI plugin | ActivityPub (federated) |
| Model | Centralised job queue | Decentralised activity stream |
| Lifecycle | Slurm job states | quasi:Propose → quasi:Claim → quasi:Complete |
| Multi-site | Slurm federation (complex) | ActivityPub federation (native) |
| Audit | Slurm accounting DB | Immutable ActivityPub ledger |
Key difference: QCSC bolts QPUs onto Slurm — a centralised scheduler designed for homogeneous compute nodes. QUASI uses ActivityPub, a W3C-standard federation protocol that scales naturally across institutional boundaries. For the multi-institution quantum HPC that QCSC Phase 3 envisions, federated orchestration is architecturally simpler than Slurm federation.
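The task lifecycle from the table (quasi:Propose → quasi:Claim → quasi:Complete) can be sketched as an append-only log with ordered transitions. The three activity types come from the comparison above, and `type`, `actor`, and `object` are standard ActivityStreams property names. The function `append_activity`, the strict one-step ordering rule, and the example identifiers are assumptions for illustration, not QUASI's actual implementation.

```python
# Allowed next activity for each current state of a task.
NEXT_ACTIVITY = {
    None: "quasi:Propose",
    "quasi:Propose": "quasi:Claim",
    "quasi:Claim": "quasi:Complete",
}


def append_activity(log: list, activity_type: str, actor: str, task_id: str) -> list:
    """Return a new log with one more activity, or raise if it is out of order."""
    last = log[-1]["type"] if log else None
    expected = NEXT_ACTIVITY.get(last)
    if activity_type != expected:
        raise ValueError(f"expected {expected}, got {activity_type}")
    activity = {"type": activity_type, "actor": actor, "object": task_id}
    return log + [activity]  # append-only: earlier entries are never modified


log = append_activity([], "quasi:Propose", "https://site-a.example/actor", "task-1")
log = append_activity(log, "quasi:Claim", "https://site-b.example/actor", "task-1")
log = append_activity(log, "quasi:Complete", "https://site-b.example/actor", "task-1")
```

In this example the proposing and claiming actors live on different sites, which is the cross-institution case the federated model is aimed at.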
| | IBM QCSC | QUASI |
|---|---|---|
| Abstraction | Tensor Compute Graph (TCG) | Ehrenfest program (CBOR) |
| Scope | Full hybrid workflow (quantum + classical) | Physics-level problem specification |
| Granularity | Operation-level DAG | Hamiltonians, observables, noise constraints |
Key difference: TCG describes how to compute (operation-level DAG). Ehrenfest describes what to compute (physics-level). Afana derives the circuit from the physics specification. The same Ehrenfest program compiles to different circuits for different backends without user intervention.
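To make the "what" versus "how" distinction concrete, here is a minimal sketch of a physics-level specification: a Hamiltonian as a weighted sum of Pauli strings, the observables to measure, and a noise constraint. The class names, field names, and the transverse-field Ising example are hypothetical; the real Ehrenfest schema and its CBOR encoding are not reproduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PauliTerm:
    coefficient: float
    paulis: str  # one letter per qubit, e.g. "ZZI"


@dataclass(frozen=True)
class PhysicsProgram:
    hamiltonian: Tuple[PauliTerm, ...]  # H is the weighted sum of these terms
    observables: Tuple[str, ...]        # Pauli strings to measure
    min_fidelity: float                 # noise constraint for the result

    def num_qubits(self) -> int:
        """Check that all Pauli strings have the same width and return it."""
        widths = {len(t.paulis) for t in self.hamiltonian}
        widths |= {len(o) for o in self.observables}
        if len(widths) != 1:
            raise ValueError("all Pauli strings must act on the same number of qubits")
        return widths.pop()


# Three-site transverse-field Ising chain: couplings on ZZ, field on X.
program = PhysicsProgram(
    hamiltonian=(
        PauliTerm(-1.0, "ZZI"), PauliTerm(-1.0, "IZZ"),
        PauliTerm(-0.5, "XII"), PauliTerm(-0.5, "IXI"), PauliTerm(-0.5, "IIX"),
    ),
    observables=("ZII",),
    min_fidelity=0.9,
)
```

Note what is absent: no gates, no qubit mapping, no ordering of operations. Those are the decisions an operation-level DAG would already contain and that a physics-level compiler is free to make per backend.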
| | IBM QCSC | QUASI |
|---|---|---|
| Strategy | Runtime error mitigation | Compile-time noise rejection + optional runtime mitigation |
| Techniques | TEM, Pauli propagation, hierarchical QEC | Type-level noise budget in Ehrenfest spec |
| QEC | Inner codes (FPGA) + outer codes (GPU) | Delegated to HAL drivers |
IBM’s hierarchical QEC design (inner FPGA decoding + outer GPU decoding) is well-engineered. QUASI currently delegates QEC entirely to HAL drivers, which is architecturally correct but means the HAL Contract API may need to surface QEC metadata as backends mature.
Coupling modes. IBM defines three coupling modes with clear latency requirements: batch (loose), near-time (iterative VQE/SQD), and real-time (microsecond QEC feedback). QUASI’s HAL Contract currently supports only batch mode. Near-time and real-time coupling are not yet addressed.
GPU co-processing. QCSC treats GPUs as first-class compute alongside QPUs. Tensor network error mitigation and outer QEC decoding run on co-located GPUs. QUASI has no explicit GPU compute layer.
Calibration integration. IBM stresses that real-time calibration data must flow into the compiler for optimal qubit selection. QUASI correctly places calibration in HAL drivers, but does not yet expose calibration metadata to inform compilation decisions.
Vendor neutrality (enforced, not aspirational). IBM’s QSA is “potentially vendor-portable.” QUASI’s HAL Contract is the only path to hardware, enforced by CI. There is no way to bypass it.
Compile-time noise rejection. If a program’s noise budget is infeasible for the target backend, Afana refuses to emit QASM. This eliminates wasted QPU time on circuits that would produce meaningless results.
Dependency-free deployment. Afana is a single Rust binary. Deploying it to an HPC node requires copying one file. IBM’s stack requires Python, Qiskit, NumPy, SciPy, and vendor-specific transpiler plugins.
Federated task management. ActivityPub is a W3C standard with mature implementations. Slurm federation is complex, fragile, and designed for tightly-coupled clusters.
Physics-level abstraction. Ehrenfest programs describe physics, not circuits. The compiler derives a circuit for each backend. IBM’s TCG operates at the operation level: it describes how to compute, not what to compute.
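The text above says the HAL boundary is enforced by a CI check but does not describe how that check works. One plausible approach, sketched here purely as an assumption, is to scan the compiler's Rust sources for imports of vendor SDK crates and fail the build if any are found. The crate names in `FORBIDDEN`, the file paths, and the function `boundary_violations` are all hypothetical.

```python
import re

# Hypothetical vendor SDK crates that compiler code must never reference.
FORBIDDEN = ("vendor_sdk_a", "vendor_sdk_b")

# Matches the crate name in `use foo::...`, `pub use foo::...`, `extern crate foo`.
USE_PATTERN = re.compile(
    r"^\s*(?:pub\s+)?(?:use|extern\s+crate)\s+([A-Za-z_][A-Za-z0-9_]*)",
    re.MULTILINE,
)


def boundary_violations(sources: dict) -> list:
    """Return (path, crate) pairs where compiler source imports a forbidden crate."""
    found = []
    for path, text in sorted(sources.items()):
        for crate in USE_PATTERN.findall(text):
            if crate in FORBIDDEN:
                found.append((path, crate))
    return found


sources = {
    "src/emit.rs": "use serde::Serialize;\n",
    "src/bad.rs": "use vendor_sdk_a::Client;\n",
}
violations = boundary_violations(sources)
```

A CI job would exit non-zero whenever `violations` is non-empty. A textual scan like this is only a first line of defence; checking the dependency manifest as well would be stricter.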
This paper validates the problem space QUASI operates in. IBM’s conclusion — that quantum computing needs a real systems architecture, not ad-hoc scripting — aligns exactly with QUASI’s thesis.
However, IBM’s solution is shaped by IBM’s constraints: Qiskit lock-in, centralised scheduling that reflects their HPC business, runtime mitigation that compensates for the noisy hardware they ship today, and vendor neutrality deferred to future phases.
QUASI’s architecture already enforces what QCSC only proposes.
The structural decisions — Rust compiler, HAL Contract boundary, ActivityPub orchestration, compile-time noise rejection — are not incremental improvements. They are fundamentally different design choices that become more valuable as quantum computing scales beyond single-vendor, single-site deployments.
Source: Seelam et al., “Reference Architecture of a Quantum-Centric Supercomputer,” arXiv:2603.10970, March 2026.
CUDA-Q (formerly CUDA Quantum) is NVIDIA’s open-source platform for hybrid quantum-classical computing. It provides a compiler toolchain (nvq++), GPU-accelerated simulation via cuQuantum, and a vendor-neutral backend abstraction. Apache 2.0 licensed.
The architecture is monolithic-compiler-centric: an MLIR-based pipeline (Quake dialect → QIR → target code) is the central element. There is no separate orchestration service or workflow engine — task scheduling is implicit through async execution primitives.
| | CUDA-Q | QUASI |
|---|---|---|
| Language | C++ (__qpu__) / Python (@cudaq.kernel) | Rust (Afana) |
| IR | Quake (MLIR) → QIR (LLVM) | Ehrenfest CBOR binary |
| Output | Target-specific native code | OpenQASM 2.0 / 3.0 |
| Noise | Compile-time type safety (non-copyable qubits); noise as simulation only | Compile-time rejection (type-level noise constraints) |
| Dependencies | LLVM/MLIR, cuQuantum, CUDA toolkit, Python | ciborium, quizx, serde, clap — no Python, no LLVM |
| Deployment | pip / conda / Docker / C++ installer | Single static Rust binary |
Key difference: CUDA-Q compiles at the circuit level — kernels describe gate sequences. Ehrenfest compiles at the physics level — programs describe Hamiltonians and observables. CUDA-Q users must understand hardware-specific circuit construction; Ehrenfest users specify physics and let Afana derive the circuit.
Key difference: CUDA-Q’s Quake dialect enforces quantum type safety (non-copyable qubits, value semantics) — a genuine compile-time guarantee. But it does not reject circuits based on noise feasibility. Afana goes further: if the noise budget is infeasible, compilation fails.
| | CUDA-Q | QUASI |
|---|---|---|
| Abstraction | quantum_platform class | HAL Contract |
| Interface | C++ API with integer QPU IDs | POST /hal/jobs (REST) |
| Neutrality | Vendor-neutral (~75% of public QPUs) | Vendor-neutral, enforced by CI |
| Enforcement | Runtime target selection | Compile-time boundary — no hardware API in compiler |
Key difference: Both achieve vendor neutrality, but through different mechanisms. CUDA-Q uses a monolithic compiler that emits target-specific code internally. QUASI enforces a hard architectural boundary — the compiler produces vendor-neutral OpenQASM; HAL drivers translate to hardware. Adding a backend to CUDA-Q means modifying the compiler pipeline. Adding a backend to QUASI means writing a HAL driver.
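The claim that a new backend needs only a new driver can be sketched with a small driver registry. Everything here is illustrative: the `Driver` type, `register_driver`, the backend names, and the string outputs are assumptions, and a real HAL driver would submit jobs to hardware rather than return text.

```python
from typing import Callable, Dict

# A driver is modelled as a function from vendor-neutral OpenQASM text to
# whatever the backend consumes.
Driver = Callable[[str], str]

DRIVERS: Dict[str, Driver] = {}


def register_driver(name: str) -> Callable[[Driver], Driver]:
    """Decorator that adds a driver to the table under a backend name."""
    def decorator(fn: Driver) -> Driver:
        DRIVERS[name] = fn
        return fn
    return decorator


def compile_program() -> str:
    """Stand-in for the compiler: it knows nothing about any backend."""
    return 'OPENQASM 3.0;\ninclude "stdgates.inc";\nqubit[2] q;\nh q[0];\ncx q[0], q[1];'


@register_driver("simulator")
def simulator_driver(qasm: str) -> str:
    return "sim-job:" + qasm


# Adding a backend touches only the driver table, never compile_program().
@register_driver("new_vendor")
def new_vendor_driver(qasm: str) -> str:
    return "new-vendor-native:" + qasm.replace("\n", " ")


def run(backend: str) -> str:
    """Compile once, then hand the same output to the chosen driver."""
    return DRIVERS[backend](compile_program())
```

The compiler output is identical for every backend; only the last step differs, which is the property the hard boundary is meant to guarantee.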
| | CUDA-Q | QUASI |
|---|---|---|
| Model | Async execution primitives (sample_async, observe_async) | ActivityPub federation |
| Multi-QPU | Integer QPU IDs, mqpu mode | Federated activity stream |
| Multi-site | MPI for multi-node GPU; no native multi-site QPU | ActivityPub federation (native) |
| HPC | Deep — CUDA, OpenMP, OpenACC, MPI | Minimal — classical compute outside scope |
Key difference: CUDA-Q has no orchestration layer — task dispatch is implicit through function calls and async primitives. This works for single-site GPU clusters but has no answer for multi-institution quantum computing. QUASI’s ActivityPub layer is designed for exactly this scenario.
| | CUDA-Q | QUASI |
|---|---|---|
| Simulation | cuQuantum: state vector, tensor network, MPS — multi-GPU, multi-node | No GPU simulation layer |
| Classical co-processing | Native CUDA/OpenMP/OpenACC for host code | Outside scope |
| Performance | Up to 180x over CPU simulators; 300x+ multi-GPU | N/A |
GPU integration is CUDA-Q’s strongest differentiator and QUASI’s largest gap. QUASI has no GPU compute layer. For workloads that require large-scale simulation (algorithm development, error mitigation research), CUDA-Q is currently without competition. QUASI’s HAL drivers could in principle delegate to cuQuantum-based simulators, but this integration does not exist today.
CUDA-Q is the most engineering-mature platform in this comparison. The MLIR-based compiler, cuQuantum simulation stack, and broad hardware support represent years of investment by the world’s largest GPU company.
The architectural difference is structural: CUDA-Q is a compiler platform that treats quantum kernels as functions within classical GPU programs. QUASI is an operating system that treats the QPU as the central resource and organises everything else around it.
CUDA-Q asks: how do we add QPUs to our GPU platform?
QUASI asks: how do we build a system that serves the QPU?
These are complementary, not competing questions. CUDA-Q’s GPU simulation and classical co-processing are capabilities QUASI will need. QUASI’s physics-level abstraction, compile-time noise rejection, and federated orchestration are capabilities CUDA-Q does not attempt.
Pasqal builds neutral-atom quantum computers and an open-source software stack centred on Pulser (pulse-level control) and Qadence (digital-analog quantum programming). The stack is Python-based and deliberately hardware-near.
Pasqal’s architecture is unique in this comparison: it is the only stack that treats analog quantum simulation — continuous Hamiltonian evolution — as a first-class computational paradigm alongside digital gates.
| | Pasqal | QUASI |
|---|---|---|
| Paradigms | Analog, digital, digital-analog (DAQC) | Physics-level specification (compiled to gates) |
| Abstraction | Pulse-level (Pulser) / block-level (Qadence) | Hamiltonian-level (Ehrenfest) |
| Language | Python | CBOR binary (Ehrenfest), compiled by Rust (Afana) |
| Hardware exposure | Deliberate — Device objects encode exact QPU constraints | Hidden — HAL Contract abstracts hardware away |
Key difference: Pasqal and QUASI share a physics-first intuition — both believe quantum programs should describe physics, not gate sequences. But they draw opposite conclusions. Pasqal exposes the hardware physics (atom positions, laser parameters, Rydberg interactions) so users can exploit them. QUASI hides the hardware physics behind HAL Contract so programs remain portable.
| | Pasqal | QUASI |
|---|---|---|
| Pipeline | Pulser: minimal — programs are already hardware-level. Qadence 2: IR-based (in development) | Ehrenfest → Afana → OpenQASM → HAL Contract |
| Output | Pulse sequences (JSON-serialised) | OpenQASM 2.0 / 3.0 |
| Noise | No compile-time noise handling | Compile-time noise budget rejection |
| Dependencies | Python, NumPy, Pulser, PyTorch (Qadence) | Zero Python dependencies |
Key difference: Pulser programs are already at hardware level — there is almost no compilation in the traditional sense. This is maximally efficient for Pasqal hardware but means programs are not portable. QUASI’s multi-stage compilation pipeline adds overhead but produces vendor-neutral output.
| | Pasqal | QUASI |
|---|---|---|
| Approach | Hardware-near — programs are written for a specific device | Hardware-agnostic — programs compiled to any device |
| Vendor scope | Pasqal hardware only | Any HAL Contract-compatible backend |
| Abstraction | Device objects with exact physical constraints | HAL Contract REST API |
Key difference: Pasqal deliberately does not abstract away the hardware. This is a valid design choice for a hardware company — their competitive advantage is in the physics of neutral atoms, and exposing it lets users exploit features like native multi-qubit gates and arbitrary 2D/3D atom topologies. QUASI’s choice is the opposite: portability over hardware-specific optimisation.
| | Pasqal | QUASI |
|---|---|---|
| Scheduler | Cloud SDK (job submission) / SLURM (HPC) | ActivityPub (federated) |
| Cloud access | Pasqal Cloud, Azure Quantum, Google Cloud, OVHcloud | Self-hosted, federated |
| HPC | SLURM-integrated (GENCI/CEA, Jülich) | ActivityPub federation |
Pasqal occupies a unique position: a hardware company with a genuinely open software stack and a physics-first programming model. The analog/DAQC paradigm is architecturally interesting because it sidesteps the gate-model abstraction entirely — the Hamiltonian is the program.
Pasqal and QUASI share the deepest intuition: quantum programs should describe physics.
They diverge on the consequence. Pasqal says: expose the specific physics of our hardware. QUASI says: describe the physics abstractly and let the compiler map it to any hardware. If neutral-atom QPUs win, Pasqal’s approach optimises harder. If the ecosystem fragments across modalities, QUASI’s abstraction pays off.
A HAL Contract driver for Pasqal hardware — translating Ehrenfest physics specifications to Pulser sequences — would be a natural integration point. The physics-to-physics mapping (Ehrenfest Hamiltonians → Rydberg Hamiltonians) is closer than the physics-to-gates mapping most backends require.
QOS is a quantum operating system built at TU Munich (TUM-DSE group), published at USENIX OSDI 2025. Its central abstraction is the “Qernel” — a DAG-based intermediate representation for quantum circuits with static and dynamic properties.
QOS is the most directly comparable system to QUASI: both position themselves as OS-level abstractions above circuit SDKs. But the approaches are fundamentally different.
| | QOS / Qernel | QUASI |
|---|---|---|
| Level | OS layer on top of Qiskit | Full-stack OS with own language and compiler |
| Foundation | Built on Qiskit / Python | Built from scratch in Rust |
| IR | Qernel IR (DAG of gates) | Ehrenfest (CBOR binary, physics-level) |
| Scope | Multi-job resource management | Full compilation + orchestration + execution |
Key difference: QOS optimises within the existing Qiskit ecosystem. QUASI builds a new ecosystem. QOS accepts Qiskit circuits as input and makes them run better on IBM hardware. QUASI accepts physics specifications and compiles them to any hardware.
| | QOS / Qernel | QUASI |
|---|---|---|
| Backends | IBM only (27-qubit Falcon QPUs) | Any HAL Contract-compatible backend (9 today) |
| Validation | 7,000+ real quantum runs, 70,000+ benchmark instances | HAL Contract tested across 9 backends |
| Vendor lock-in | Complete (Qiskit + IBM hardware) | None (enforced by CI) |
| | QOS / Qernel | QUASI |
|---|---|---|
| Multi-programming | Native — co-locates circuits on same QPU | Not addressed |
| Scheduling | Fidelity vs. latency optimisation | ActivityPub task claiming |
| Circuit cutting | Gate cutting, wire cutting, qubit reuse | Not addressed |
| Fidelity estimation | ML-based, up to 99% prediction accuracy | Compile-time noise budget |
| Noise handling | Runtime mitigation (cutting, freezing, scheduling) | Compile-time rejection |
Key difference: QOS excels at NISQ-era optimisation — making noisy circuits run better on noisy hardware through cutting, multi-programming, and fidelity-aware scheduling. QUASI takes the opposite approach: reject infeasible programs at compile time rather than optimising them at runtime. As hardware improves, compile-time rejection becomes more valuable; as hardware remains noisy, QOS’s runtime optimisation is more practical.
Real-hardware validation. QOS has been validated with 7,000+ real quantum runs on IBM hardware, demonstrating 2.6–456.5x higher fidelity and up to 9.6x better resource utilisation. QUASI’s architecture has not been validated at this scale on real QPUs.
Multi-programming. Co-locating compatible circuits on the same QPU is a practical capability for cloud-scale quantum computing. QUASI does not address this.
Peer review. Published at OSDI 2025, one of the top systems conferences. This is a level of academic validation QUASI does not yet have.
Hardware independence. QOS is locked to IBM/Qiskit. QUASI’s HAL Contract supports 9 backends today with no vendor dependency in the compiler.
Own language and compiler. QOS accepts Qiskit circuits — it cannot change the input abstraction. QUASI controls the full stack from physics specification to hardware execution.
Production readiness. QOS is a research prototype (11 commits on GitHub). QUASI’s HAL Contract implementation (Arvak) is at v1.8.1 with 9 production backends.
Federated orchestration. QOS has no multi-site capability. QUASI’s ActivityPub-based orchestration is designed for cross-institutional quantum computing.
QOS and QUASI answer different questions. QOS asks: given the hardware and tools we have today, how do we extract maximum value? QUASI asks: what should the system look like if we design it from scratch?
QOS optimises the present. QUASI designs for the future.
QOS’s NISQ-era optimisations (circuit cutting, multi-programming, fidelity-aware scheduling) are genuinely valuable today. But they are workarounds for noisy hardware — they become less relevant as error-corrected machines arrive. QUASI’s architectural decisions (physics-level abstraction, vendor neutrality, compile-time guarantees) become more relevant as the ecosystem matures.
Source: Giortamis et al., “QOS: A Quantum Operating System,” USENIX OSDI 2025. arXiv:2406.19120.