The earlier branchd pool kept hot branch roots ready for immediate lease. That removed source copying from the request path, but a full 100-branch refill could still show up as an 8-second wait in the harder DDL benchmark. The new cold pool adds another layer: branchd prepares cold snapshots first, promotes them into hot branches, then starts Postgres and preconnects only when hot capacity needs to be restored.
The intent is simple: keep the expensive source snapshot work ahead of demand, keep the user-facing branch lease governed, and keep proof checks attached to create, write, cleanup, and source-isolation events.
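The two-tier flow described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not branchd's implementation: `TwoTierPool`, `Branch`, `refill`, and `lease` are hypothetical names, and the snapshot and promotion steps are stand-ins that just record what would happen.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional


@dataclass
class Branch:
    name: str
    state: str = "cold"  # moves "cold" -> "hot" -> "leased"


@dataclass
class TwoTierPool:
    """Hypothetical model of a hot pool backed by a cold reserve."""
    hot_target: int
    cold_target: int
    cold: Deque[Branch] = field(default_factory=deque)
    hot: Deque[Branch] = field(default_factory=deque)
    log: List[str] = field(default_factory=list)
    next_id: int = 0

    def _snapshot_source(self) -> Branch:
        # Stand-in for the expensive source snapshot work.
        self.next_id += 1
        branch = Branch(name=f"branch-{self.next_id}")
        self.log.append(f"snapshot {branch.name}")
        return branch

    def _promote(self, branch: Branch) -> Branch:
        # Stand-in for starting Postgres and preconnecting.
        branch.state = "hot"
        self.log.append(f"promote {branch.name}")
        return branch

    def refill(self) -> None:
        # Restore hot capacity from the cold reserve first ...
        while len(self.hot) < self.hot_target:
            if not self.cold:
                self.cold.append(self._snapshot_source())
            self.hot.append(self._promote(self.cold.popleft()))
        # ... then top the cold reserve back up ahead of demand.
        while len(self.cold) < self.cold_target:
            self.cold.append(self._snapshot_source())

    def lease(self) -> Optional[Branch]:
        # A lease only ever takes an already-hot branch.
        if not self.hot:
            return None
        branch = self.hot.popleft()
        branch.state = "leased"
        return branch


pool = TwoTierPool(hot_target=2, cold_target=2)
pool.refill()
leased = pool.lease()
pool.refill()
```

In this model, the refill after a lease promotes an existing cold branch before it takes any new snapshot, which is the ordering the cold pool is meant to guarantee.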
The cold reserve worked, then exposed the real host ceiling.
The direct proof run used 20 hot branches and 20 cold reserve branches against the verified 5B source. It completed 40/40 governed branches with 0 deadlocks and 0 source mutations. Create latency measured 81.86ms p50 / 82.39ms p95. First query measured 49.56ms p50 / 57.69ms p95, and branch-local writes measured 15.05ms p50 / 57.41ms p95.
The next two runs intentionally pushed harder: 100 concurrent DDL branches over a 100-branch hot pool with either 100 or 400 cold reserve branches behind it. Both completed 500/500 branch operations with 0 deadlocks and 0 source mutations. The p95 create and write times stayed in the hundreds of milliseconds (282ms to 623ms), but ready-pool wait ran from roughly 10.8s to 12.8s at p50 and reached about 12.9s to 13.0s at p95.
That is the honest limitation. Cold snapshots remove a class of source-copy work, but this budget host still has to juggle hundreds of branch Postgres processes, DDL writes, background refill, cold refill, and cleanup on 8 vCPUs.
| Run | Success | Ready wait p50 / p95 | Create p50 / p95 | First query p50 / p95 | Write p50 / p95 | Source mutations |
|---|---|---|---|---|---|---|
| 20 hot / 20 cold event-insert proof | 40/40 | 75.3ms / 4243.38ms | 81.86ms / 82.39ms | 49.56ms / 57.69ms | 15.05ms / 57.41ms | 0 |
| 100 hot / 100 cold DDL reserve | 500/500 | 12831.45ms / 12863.87ms | 240.11ms / 282.23ms | 192.07ms / 433.39ms | 194.32ms / 450.31ms | 0 |
| 100 hot / 400 cold delayed DDL reserve | 500/500 | 10764.19ms / 12962.28ms | 305.51ms / 437.05ms | 176.56ms / 585.29ms | 210.54ms / 622.89ms | 0 |
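For readers who want to reproduce p50 / p95 columns from raw timings, here is a minimal sketch using the nearest-rank method. The source does not say which percentile definition the benchmark uses, so the method and the `percentile` helper are assumptions, and the sample values are made up for illustration.

```python
import math
from typing import Sequence


def percentile(samples_ms: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies for a 40-branch run: 1ms, 2ms, ..., 40ms.
samples = [float(i) for i in range(1, 41)]
p50 = percentile(samples, 50)
p95 = percentile(samples, 95)
```

With only 40 samples, the nearest-rank p95 is the 38th-slowest-ordered value, so two or three slow branches are enough to separate p95 from p50 by a wide margin.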
This shifted the bottleneck to scheduling.
| Dimension | Evidence | Conclusion |
|---|---|---|
| Cold reserve path | Branchd can promote a cold Btrfs snapshot into the hot pool without re-copying the 5B source. The proof run kept cloneMs=0 and snapshotMs=0 during cold promotion. | The next pool can be prepared before a user request arrives. |
| Governed proof path | The 20 hot / 20 cold event-insert run completed 40/40 branches with source mutation checks, deadlock checks, branch-local write checks, and cleanup checks. | The cold-pool path still preserves Imladri proof semantics. |
| DDL stress path | The 100-way DDL runs completed 500/500 branches with 0 deadlocks and 0 source mutations, but ready wait stayed at roughly 10.8s to 12.8s p50 and 12.9s to 13.0s p95. | The bottleneck moved from source copying to host scheduling and process pressure. |
| Budget host ceiling | The current machine is an 8-vCPU, 32 GiB GCP N2D host with striped Local SSD. It is good enough to prove correctness, but not the final performance envelope. | A 16 to 32 vCPU Local SSD host is the correct next measurement when budget allows. |
This is not the final storage-engine number.
The cold pool proves the design direction, not the final infrastructure limit. The current host was chosen because it fit the budget: 8 vCPUs, 32 GiB RAM, and striped Local SSD. It is enough to test correctness and isolate the next bottleneck, but a serious vendor-grade comparison needs the same code on a higher-core storage host.
| Limitation | Detail |
|---|---|
| Not a broad infrastructure claim | This result proves the cold reserve design on one budget GCP host. It does not claim universal sub-six-second cloning across arbitrary production databases. |
| Active branch count still matters | Hundreds of resident branch Postgres processes compete with hot refill, cold refill, cleanup, and DDL writes on an 8-vCPU machine. |
| DDL is intentionally harsh | The DDL benchmark mutates branch-local schema under 100-way pressure. Normal event inserts are faster, but schema-heavy agent work needs separate policy and scheduling. |
| Higher-core host needed later | The honest next pass is the same branchd configuration on a higher-core storage host. We kept this note because the current budget blocks that upgrade today. |
The public artifacts include all three cold-pool runs.
The JSON artifacts include success counts, p50/p95 timings, pool readiness, cold-pool readiness, deadlock counts, source mutation checks, cleanup state, and branchd probe state after drain.
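A reader could sanity-check an artifact with something like the sketch below. The field names (`attempted`, `succeeded`, `deadlocks`, `sourceMutations`, `timingsMs`) are hypothetical, since the actual artifact schema is not shown here; the numbers in the example are the 20 hot / 20 cold row from the results table.

```python
import json
from typing import Any, Dict, List


def check_artifact(artifact: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the run looks clean."""
    problems: List[str] = []
    if artifact["succeeded"] != artifact["attempted"]:
        problems.append("not every branch operation succeeded")
    if artifact["deadlocks"] != 0:
        problems.append("deadlocks were recorded")
    if artifact["sourceMutations"] != 0:
        problems.append("the source was mutated")
    for metric, timing in artifact["timingsMs"].items():
        # A p50 above p95 would indicate a broken or mislabeled artifact.
        if timing["p50"] > timing["p95"]:
            problems.append(f"{metric}: p50 exceeds p95")
    return problems


# Hypothetical artifact shape, filled with the 20 hot / 20 cold results.
example = json.loads("""
{
  "attempted": 40,
  "succeeded": 40,
  "deadlocks": 0,
  "sourceMutations": 0,
  "timingsMs": {
    "readyWait": {"p50": 75.3, "p95": 4243.38},
    "create": {"p50": 81.86, "p95": 82.39},
    "firstQuery": {"p50": 49.56, "p95": 57.69},
    "write": {"p50": 15.05, "p95": 57.41}
  }
}
""")
problems = check_artifact(example)
```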
