Glasshouse follow-up / May 15, 2026

Same-instance GPU overhead now spans H100, Blackwell, and Vast.ai.

The first Glasshouse RunPod proof showed that protected GPU execution can complete with attestation, encrypted package release, training evidence, and zeroized cleanup. This follow-up tightens the overhead methodology: raw and protected training ran sequentially on the same provider allocation so the headline numbers no longer compare two different provider nodes. The latest rows add RunPod RTX PRO 6000 Blackwell and 15/30-minute Vast.ai A4000 checks.

Why this follow-up exists

Earlier protected-vs-raw runs compared different provider allocations. That answered whether Glasshouse could complete the lifecycle, but it mixed protection overhead with normal GPU marketplace variance. This test keeps the provider allocation fixed and changes only the execution path.

The result is narrower and stronger: a RunPod Secure H100 30-minute raw-first sequence measured 18.05% overhead, and a RunPod Secure RTX PRO 6000 Blackwell 30-minute raw-first sequence measured 14.63% overhead. Both completed verified attestation and zeroized cleanup.

Result table

Same-instance rows now cover three GPU/provider surfaces.

Negative rows are not a claim that protection makes training faster. They mean order and warm-state effects exceeded the measured protection cost in that short or noisy row. The 30-minute raw-first H100 row (18.05%) is the most conservative current number because raw establishes the baseline before the protected lifecycle runs. The Vast.ai sustained rows are still important: they show the same lifecycle survives a second provider at 15 and 30 minutes, even when the measured delta is dominated by warm-state effects.

| Provider / GPU | Duration | Order | Protected (epochs/s) | Raw (epochs/s) | Measured delta | Status |
| --- | --- | --- | --- | --- | --- | --- |
| RunPod H100 | 60s | protected first | 839.133 | 937.134 | 10.46% | valid noise check |
| RunPod H100 | 60s | raw first | 916.788 | 707.015 | -29.67% | valid noise check |
| RunPod H100 | 15m | protected first | 1,002.066 | 972.938 | -2.99% | valid sustained check |
| RunPod H100 | 30m | raw first | 580.205 | 707.995 | 18.05% | headline sustained check |
| RunPod PRO 6000 | 15m | raw first | 2,639.341 | 2,711.934 | 2.68% | Blackwell sustained check |
| RunPod PRO 6000 | 30m | raw first | 2,981.734 | 3,492.846 | 14.63% | Blackwell sustained check |
| Vast.ai A4000 | 60s | raw first | 980.756 | 976.741 | -0.41% | second-provider check |
| Vast.ai A4000 | 60s | protected first | 998.173 | 989.483 | -0.88% | second-provider check |
| Vast.ai A4000 | 15m | raw first | 928.339 | 864.427 | -7.39% | second-provider sustained check |
| Vast.ai A4000 | 30m | raw first | 980.781 | 972.199 | -0.88% | second-provider sustained check |

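
The table does not spell out how the measured delta is derived. Every row is consistent with the delta being the throughput lost relative to the raw segment, so the sketch below assumes that definition. The function name `overhead_pct` is illustrative and is not taken from the Glasshouse harness.

```python
def overhead_pct(raw_eps: float, protected_eps: float) -> float:
    """Throughput lost to the protected path, as a percentage of raw throughput.

    Assumed definition: (raw - protected) / raw * 100. A negative value means
    the protected segment ran faster than the raw segment in that row.
    """
    if raw_eps <= 0:
        raise ValueError("raw throughput must be positive")
    return (raw_eps - protected_eps) / raw_eps * 100.0


# Rows copied from the result table: (label, protected epochs/s, raw epochs/s).
rows = [
    ("RunPod H100, 30m, raw first", 580.205, 707.995),
    ("RunPod PRO 6000, 30m, raw first", 2981.734, 3492.846),
    ("RunPod H100, 60s, raw first", 916.788, 707.015),
]

for label, protected, raw in rows:
    print(f"{label}: {overhead_pct(raw, protected):.2f}%")
```

Under this definition the three rows print 18.05%, 14.63%, and -29.67%, matching the table. The sign convention also makes the negative rows easy to read: the protected segment simply posted higher throughput than the raw segment in that run.
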
Excluded result

One attempted row did not meet the evidence bar.

The protected-first 30-minute attempt is intentionally not counted. The protected segment completed and zeroized, but the post-enclave raw callback did not report completion after the provider pod exited. The harness now enforces non-RunPod provider boot timeouts too, so provider stalls fail earlier and cleanly instead of burning the outer timeout.

| Duration | Order | Observed behavior | Decision |
| --- | --- | --- | --- |
| 30m | protected first | Protected completed and zeroized; the post-enclave raw callback never posted completion after the pod exited. | Excluded from the public overhead number. |

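
The boot-timeout change can be pictured as a deadline that is much shorter than the outer run timeout. The following is a minimal sketch under that assumption, not the harness itself: `wait_for_boot`, `is_ready`, and `BootTimeout` are hypothetical names, and the clock and sleep functions are injectable only so the logic can be exercised without waiting.

```python
import time
from typing import Callable


class BootTimeout(RuntimeError):
    """Raised when a provider allocation does not become ready in time."""


def wait_for_boot(
    is_ready: Callable[[], bool],
    boot_timeout_s: float,
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until the allocation reports ready, or fail at the boot deadline.

    Returns the seconds spent waiting. Because the boot deadline is separate
    from the outer run timeout, a stalled provider surfaces as a boot failure
    instead of consuming the whole run budget.
    """
    start = clock()
    while True:
        if is_ready():
            return clock() - start
        if clock() - start >= boot_timeout_s:
            raise BootTimeout(f"allocation not ready after {boot_timeout_s}s")
        sleep(poll_interval_s)
```

A caller would wrap the provider's own readiness probe in `is_ready` and treat `BootTimeout` as a clean, early failure for that row.
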
Methodology

This removes provider allocation variance, not every source of variance.

Same-instance comparison is the right next step because it avoids comparing one rented GPU against another. It still does not remove all order effects. That is why the article keeps both directionality and caveats visible.

Allocation shape: raw and protected segments ran sequentially on the same provider allocation
Provider variance removed: yes; within each row, raw and protected shared one allocation, so no separate GPU allocations were compared
Remaining variance: order effects, runtime warm state, provider node behavior during the same allocation
Protected path: Glasshouse package, attestation, gated key release, evidence, zeroization
Raw path: same MLP training loop and initial weights, no Glasshouse lifecycle
Validation rule: raw completion, attestation verified, zeroized runtime, cleanup observed
Production-model scope: 7B smoke and short Qwen 0.5B same-instance overhead now shown; sustained 7B fine-tune is not claimed here

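
The validation rule above can be read as a four-way conjunction: a row counts only if every condition holds. Here is a minimal sketch of that reading, with hypothetical field names; the real harness may record this evidence differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowEvidence:
    """Evidence for one same-instance row (field names are hypothetical)."""
    raw_completed: bool
    attestation_verified: bool
    runtime_zeroized: bool
    cleanup_observed: bool


def validation_failures(evidence: RowEvidence) -> List[str]:
    """Return the unmet requirements; an empty list means the row counts."""
    checks = {
        "raw completion": evidence.raw_completed,
        "attestation verified": evidence.attestation_verified,
        "zeroized runtime": evidence.runtime_zeroized,
        "cleanup observed": evidence.cleanup_observed,
    }
    return [name for name, ok in checks.items() if not ok]


# A row shaped like the excluded attempt: the protected segment finished and
# zeroized, but the raw segment never reported completion.
excluded = RowEvidence(
    raw_completed=False,
    attestation_verified=True,
    runtime_zeroized=True,
    cleanup_observed=True,
)
print(validation_failures(excluded))  # ['raw completion']
```

Returning the list of failures, instead of a bare boolean, keeps the reason for an exclusion visible in the artifact.
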
Public artifact

The artifact keeps the non-secret measurement fields.

The public JSON excludes API keys, tunnel URLs, encrypted payloads, and provider credentials. It retains the measurement method, throughput values, attestation status, and cleanup result.

sanitized excerpt
{
  "provider": "runpod",
  "gpuModel": "NVIDIA RTX PRO 6000 Blackwell Server Edition",
  "sameInstance": true,
  "durationSec": 1800,
  "order": "raw-first",
  "protectedEpochsPerSec": 2981.734,
  "rawEpochsPerSec": 3492.846,
  "overheadPct": 14.63,
  "attestation": "verified",
  "runtimeState": "zeroized"
}
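
One way to produce an excerpt like this is an allow-list: only named measurement fields are copied, so anything unknown or secret is dropped by default. The sketch below assumes that approach. The allow-list mirrors the field names in the excerpt, while the secret-bearing keys in the example record (`apiKey`, `tunnelUrl`) are invented for illustration.

```python
import json

# Allow-list taken from the field names in the sanitized excerpt.
PUBLIC_FIELDS = (
    "provider",
    "gpuModel",
    "sameInstance",
    "durationSec",
    "order",
    "protectedEpochsPerSec",
    "rawEpochsPerSec",
    "overheadPct",
    "attestation",
    "runtimeState",
)


def sanitize(record: dict) -> dict:
    """Copy only allow-listed fields; every other key is dropped."""
    return {key: record[key] for key in PUBLIC_FIELDS if key in record}


# Hypothetical internal record. The two secret-bearing keys are illustrative.
internal = {
    "provider": "runpod",
    "durationSec": 1800,
    "order": "raw-first",
    "overheadPct": 14.63,
    "attestation": "verified",
    "runtimeState": "zeroized",
    "apiKey": "EXAMPLE-NOT-A-REAL-KEY",
    "tunnelUrl": "https://example.invalid/tunnel",
}

print(json.dumps(sanitize(internal), indent=2))
```

An allow-list fails safe when new internal fields are added later, which a deny-list of known secrets would not.
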
Production-model smoke

Qwen LoRA workloads now run through the same protected lifecycle.

The synthetic MLP remains the overhead benchmark because it is repeatable and cheap. The first real-model step is separate: Glasshouse ran a protected Qwen/Qwen2.5-0.5B-Instruct 32-step LoRA workload, then short same-allocation raw-vs-protected Qwen 0.5B rows on RunPod A4000 and Vast.ai A4000. It also ran a protected Qwen/Qwen2.5-7B-Instruct BF16 LoRA smoke step on RunPod. Both the 0.5B and 7B workloads emitted training progress, exported adapter digests, verified attestation, and zeroized the runtime. This is functional production-model evidence, not yet a sustained 7B fine-tune or production throughput claim.

Small model: Qwen/Qwen2.5-0.5B-Instruct, 32 LoRA steps, final loss 4.1228
Small model overhead: Qwen 0.5B, RunPod A4000, same allocation, raw first, 8 LoRA steps, 9.86% measured train-step overhead
Second-provider Qwen: Qwen 0.5B, Vast.ai A4000, 16 LoRA steps, both raw-first and protected-first order checks passed
7B smoke: Qwen/Qwen2.5-7B-Instruct, BF16, 1 LoRA step, final loss 8.3173
Provider / GPU: RunPod RTX 4090 for 7B smoke, RunPod A4000 and Vast.ai A4000 for short same-instance Qwen checks
Adapters: 270,336 trainable params on 0.5B; 1,261,568 trainable params on 7B
Evidence: attestation verified, progress events emitted, adapter SHA-256 exported, zeroized cleanup

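
The exported adapter SHA-256 lets a reader confirm that an adapter file they hold is the one the run produced. The page does not describe how Glasshouse computes it. The sketch below shows one conventional way to hash a file in chunks with Python's standard `hashlib`, and the helper name `sha256_file` is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in fixed-size chunks.

    Chunked reads keep memory use flat even for large adapter files.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the returned hex string with the digest in the evidence record is then a plain string equality check.
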
Tamper checks

The local attestation server failed closed on nine tamper scenarios.

The live GPU proofs show provider execution, training, and zeroization. The local tamper suite isolates the attestation gate itself: composite hash mismatch, manifest hash mismatch, container ID mismatch, stale timestamp, disabled anti-debug, spoofed runtime state, downgraded anti-debug profile, replayed nonce, and JWT claim mismatch were all rejected before key release. The only key release in the artifact is the valid measured control attestation.

Composite hash mismatch: rejected
Manifest hash mismatch: rejected
Container ID mismatch: rejected
Stale timestamp: rejected
Anti-debug disabled: rejected
Runtime state spoofed: rejected
Anti-debug profile downgrade: rejected
Nonce replay: rejected
JWT claim mismatch: rejected

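
"Failed closed" means the key is withheld unless every check passes. Below is a minimal sketch of such a gate covering the nine scenarios above. It is not the Glasshouse attestation server: the `Attestation` and `Policy` shapes, the integer ordering of anti-debug profiles, the `"measured"` runtime state, and the JWT claims compared are all assumptions made for illustration.

```python
import hmac
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class Attestation:
    composite_hash: str
    manifest_hash: str
    container_id: str
    timestamp: float
    anti_debug_enabled: bool
    anti_debug_profile: int  # assumed ordering: higher means stricter
    runtime_state: str
    nonce: str
    jwt_claims: dict


@dataclass
class Policy:
    composite_hash: str
    manifest_hash: str
    container_id: str
    min_anti_debug_profile: int
    runtime_state: str
    max_age_s: float
    seen_nonces: set = field(default_factory=set)


def rejection_reason(att: Attestation, policy: Policy, now: float) -> Optional[str]:
    """Return the first failed check, or None when every check passes."""
    jwt_ok = (
        att.jwt_claims.get("nonce") == att.nonce
        and att.jwt_claims.get("composite_hash") == att.composite_hash
    )
    checks = [
        ("composite hash mismatch", hmac.compare_digest(att.composite_hash, policy.composite_hash)),
        ("manifest hash mismatch", hmac.compare_digest(att.manifest_hash, policy.manifest_hash)),
        ("container ID mismatch", att.container_id == policy.container_id),
        ("stale timestamp", 0 <= now - att.timestamp <= policy.max_age_s),
        ("anti-debug disabled", att.anti_debug_enabled),
        ("runtime state spoofed", att.runtime_state == policy.runtime_state),
        ("anti-debug profile downgrade", att.anti_debug_profile >= policy.min_anti_debug_profile),
        ("nonce replay", att.nonce not in policy.seen_nonces),
        ("JWT claim mismatch", jwt_ok),
    ]
    for reason, passed in checks:
        if not passed:
            return reason
    return None


def release_key(att: Attestation, policy: Policy, now: float, key: bytes) -> Optional[bytes]:
    """Fail closed: the key is returned only when no check rejects."""
    if rejection_reason(att, policy, now) is not None:
        return None
    policy.seen_nonces.add(att.nonce)  # burn the nonce so a replay is rejected
    return key


policy = Policy("c0ffee", "facade", "ctr-1", 2, "measured", 300.0)
valid = Attestation("c0ffee", "facade", "ctr-1", 1000.0, True, 2, "measured", "n-1",
                    {"nonce": "n-1", "composite_hash": "c0ffee"})
print(release_key(replace(valid, manifest_hash="bad"), policy, 1010.0, b"k"))  # None
print(release_key(valid, policy, 1010.0, b"k"))  # b'k'
print(rejection_reason(valid, policy, 1010.0))  # nonce replay
```

The nonce is burned only after a successful release, so a rejected attestation cannot consume a nonce that a later valid attestation needs.
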
Next tests

The next jump is a real model workload.

The headline overhead numbers on this page are deliberately scoped to matched MLP training. The next production-relevance step is turning the 7B smoke path into a sustained fine-tune and then measuring protected-vs-raw overhead for the real-model workload.

H100 repeat: Completed in the May 16 follow-up: 2.58% overhead on a 30-minute protected-first H100 same-instance row.
Real model duration: Extend Qwen LoRA from short smoke rows into sustained train-step runs before publishing overhead.
7B follow-up: Turn the 7B smoke into a sustained fine-tune, then measure protected-vs-raw overhead.
Replicate headline rows: Repeat the H100 and PRO 6000 30-minute rows to get a distribution, not isolated samples.