Database sandbox / May 16, 2026

Imladri moved governed Postgres FK branches onto a prepared COW pool.

After the FK correctness fix, the next bottleneck was not source mutation risk. It was request-path branch materialization. This follow-up moved copy-on-write schema preparation into a warm pool, leased prepared branch schemas under 100-way contention, and kept the same database safety checks: branch-local FK behavior, zero deadlocks, and zero source mutations.

Why this is separate

The previous article closed the FK behavior gap. This one is about production ergonomics: if 100 agents need governed database branches at once, the system should not build every branch schema from scratch on the critical path. The latest run added source schema fingerprints, disabled pool refill during the benchmark so that no refill debt was left behind, and kept the hot path out of Postgres catalog checks.

The prepared pool does not weaken the proof model. It still leases a governed COW branch, records the source metadata, executes branch-local writes, cleans up, and verifies that the source relation stayed unchanged.
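The lifecycle above can be illustrated with a toy in-memory model. This is a minimal sketch, not Imladri's implementation: `OverlayBranch` and `relation_checksum` are hypothetical names, the "source relation" is a plain dict, and the proof step is reduced to comparing a checksum taken at lease time with one taken after cleanup.

```python
import hashlib
import json


def relation_checksum(rows):
    """Order-independent SHA-256 digest of a relation's rows."""
    canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


class OverlayBranch:
    """Toy copy-on-write branch: reads fall through to the source, writes stay in the overlay."""

    def __init__(self, source):
        self._source = source   # shared relation, never written by the branch
        self._overlay = {}      # branch-local rows keyed by primary key
        # Record source metadata at lease time (hypothetical stand-in for the proof packet).
        self.source_checksum = relation_checksum(source.values())

    def write(self, key, row):
        """Branch-local write: only the overlay changes."""
        self._overlay[key] = dict(row)

    def read(self, key):
        """Prefer the branch-local row, otherwise read through to the source."""
        if key in self._overlay:
            return self._overlay[key]
        return self._source.get(key)

    def cleanup_and_verify(self):
        """Drop branch-local state and report whether the source is unchanged."""
        self._overlay.clear()
        return relation_checksum(self._source.values()) == self.source_checksum


source = {1: {"id": 1, "name": "alpha"}, 2: {"id": 2, "name": "beta"}}
branch = OverlayBranch(source)
branch.write(1, {"id": 1, "name": "changed in branch"})
```

In this model the branch sees its own write while the source dict keeps the original row, and `cleanup_and_verify()` returns `True` only if the source checksum still matches the one recorded at lease time.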

Result

The 100-way FK create p95 moved to 53.34ms.

The same ten-table FK fixture was tested across three implementation shapes. Eager materialization was correct but expensive. Targeted lazy materialization avoided unnecessary table work. The prepared pool moved most remaining clone work before the request. The final row is the latest clean run: refill disabled for benchmark hygiene, schema fingerprints cached, zero deadlocks, and zero source mutations.

| Mode | Success | FK create p50 / p95 | Valid write p50 / p95 | Source mutations |
| --- | --- | --- | --- | --- |
| Eager FK materialization | 100 / 100 | 9769.45ms / 29179.01ms | 3921.18ms / 11530.44ms | 0 |
| Targeted lazy materialization | 100 / 100 | 1261.2ms / 3447.57ms | 5002.81ms / 7566.71ms | 0 |
| Prepared COW pool | 100 / 100 | 403.91ms / 710.04ms | 725.35ms / 922.46ms | 0 |
| Prepared pool + schema cache | 100 / 100 | 49.4ms / 53.34ms | 578.37ms / 1048.83ms | 0 |

Phase split

The expensive clone moved out of the hot path.

The final run hit the prepared pool for every concurrent FK branch. The remaining branch-create path binds the leased branch, confirms the source schema fingerprint, and records the proof packet.
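A lease that validates against a schema fingerprint instead of querying the catalog might look roughly like the sketch below. All names here (`PreparedPool`, `schema_fingerprint`, the branch schema names) are hypothetical, and the fingerprint is simply a hash over table, column, and type names; the real fingerprint inputs are not described in this article.

```python
import hashlib
import threading
from collections import deque


def schema_fingerprint(columns):
    """Hash a {table: [(column, type), ...]} description of the source schema."""
    parts = []
    for table in sorted(columns):
        for name, col_type in columns[table]:
            parts.append(f"{table}.{name}:{col_type}")
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


class PreparedPool:
    """Hands out prebuilt branch schema names; entries built for a stale fingerprint are discarded."""

    def __init__(self):
        self._lock = threading.Lock()
        self._warm = deque()  # (schema_name, fingerprint_at_build_time)

    def prewarm(self, schema_name, fingerprint):
        """Register a branch schema that was built off the request path."""
        with self._lock:
            self._warm.append((schema_name, fingerprint))

    def lease(self, current_fingerprint):
        """Return a warm schema name, or None on a pool miss."""
        with self._lock:
            while self._warm:
                name, built_for = self._warm.popleft()
                if built_for == current_fingerprint:
                    return name
                # Stale entry: the source schema changed after this branch was prepared.
            return None


old = schema_fingerprint({"orders": [("id", "bigint")]})
new = schema_fingerprint({"orders": [("id", "bigint"), ("note", "text")]})
pool = PreparedPool()
pool.prewarm("branch_stale", old)
pool.prewarm("branch_fresh", new)
leased = pool.lease(new)
```

The lease does no catalog work at all: it compares one cached hash per warm entry, which is the property the hot path relies on. A pool miss (`None`) would fall back to building a branch on the request path.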

| Phase | p50 / p95 | Max | Meaning |
| --- | --- | --- | --- |
| Prepared pool hit | 1 / 1 | 1 / 1 | All 100 FK branches leased prebuilt schemas. |
| Clone schema build | 1 / 1 | 1 / 1 | Clone build stayed off the request path. |
| Branch DDL attach | 101.56ms / 178.88ms | 547.51ms max | Measured as the underlying attach work while leased branch create stayed at 49.4ms p50. |
| Source metadata | 1.32ms / 1.65ms | 2.66ms max | Source schema fingerprints were served from the shared cache. |
| Storage branch total | 48.17ms / 51.78ms | 52.31ms max | End-to-end branch creation under 100-way FK contention. |

Correctness

The speedup did not remove the FK guardrails.

The prepared branch still enforced the same branch-local FK behavior as the correctness article. The benchmark completed 100 simultaneous FK branches, allowed valid branch writes, rejected orphan writes, applied update/delete cascades, and left the source unchanged. Cascade p95 moved from 3235.59ms to 493.36ms, and cleanup p95 moved from 3052.3ms to 249.34ms.

| Check | Result | Source state |
| --- | --- | --- |
| Concurrent FK branches | 100 / 100 succeeded | 0 deadlocks |
| Orphan write | rejected in branch | source untouched |
| Restrict update/delete | rejected in branch | source untouched |
| Cascade update/delete | applied branch-locally | source untouched |
| Set null/default | applied branch-locally | source untouched |
| Deferred / MATCH FULL / self-ref | passed | source untouched |

Bulk write follow-up

The 5B-source write bottleneck had two layers.

The first bulk-write probe did not reveal source mutation. It revealed a performance bug: source expression indexes existed, but the branch overlay did not have matching lookup indexes. As the overlay grew, branch-local uniqueness checks scanned more branch rows. After adding overlay lookup indexes for source unique and expression indexes, the next bottleneck was clearer: large INSERT ... SELECT workloads were still moving through the row-level COW view trigger. The current fix adds a set-based lazy-COW bulk insert path and reruns 1M and 10M branch-local writes against the same 5B-source fixture.
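The difference between a per-row trigger and a set-based path can be sketched in a few lines. This is an illustrative in-memory model under stated assumptions, not the actual Postgres implementation: `bulk_insert_set_based` and `key_fn` are hypothetical names, and `key_fn` stands in for a unique expression index such as a lowercased email.

```python
def bulk_insert_set_based(source_keys, overlay, new_rows, key_fn):
    """Validate uniqueness for the whole batch at once, then insert into the overlay.

    source_keys: set of unique-key values already present in the source
    overlay:     dict of branch-local rows keyed by the unique-key value
    new_rows:    list of dict rows to insert
    key_fn:      computes the unique expression for a row
    """
    batch_keys = [key_fn(row) for row in new_rows]

    # 1. Find duplicates inside the batch itself.
    seen, dupes = set(), set()
    for k in batch_keys:
        if k in seen:
            dupes.add(k)
        else:
            seen.add(k)

    # 2. Check the batch against source and overlay with set lookups,
    #    instead of scanning the growing overlay once per inserted row.
    conflicts = dupes | seen.intersection(source_keys) | seen.intersection(overlay)
    if conflicts:
        raise ValueError(f"unique violation: {sorted(conflicts)}")

    # 3. Insert directly into the overlay; the source is never written.
    for k, row in zip(batch_keys, new_rows):
        overlay[k] = dict(row)
    return len(new_rows)


def by_email(row):
    """Hypothetical unique expression: case-insensitive email."""
    return row["email"].lower()


source_keys = {"a@example.com"}
overlay = {}
inserted = bulk_insert_set_based(
    source_keys,
    overlay,
    [{"email": "B@example.com"}, {"email": "c@example.com"}],
    by_email,
)
```

The whole batch is rejected before any row lands in the overlay if a single key conflicts, which mirrors statement-level behavior for a failed `INSERT ... SELECT`.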

| Metric | Value |
| --- | --- |
| 1M write | 12,877.31ms |
| 1M throughput | 77,656 rows/sec |
| 10M write | 138,403.08ms |
| 10M throughput | 72,253 rows/sec |
| 10M overlay rows | 10,000,000 |
| 10M cleanup | 39.61ms |
| Proof | 7/7 |
| Source mutations | 0 |

Clone vs write

Fast branching and heavy branch writes are different claims.

The sub-100ms branch numbers in this article are about governed branch creation over approved tables. The 12.88s and 138.4s numbers are heavy branch-local writes after the branch exists. They do not disprove fast clone work, but they also do not make Imladri an Ardent- or Neon-class full-database clone engine by themselves.

The new backend seam for that next layer is external_snapshot_command: Glasshouse can now delegate branch create/destroy to a storage provider while keeping Imladri policy, transaction, cleanup, and proof around the branch lifecycle. A follow-up 100-sample repeat on a self-hosted Btrfs droplet reached 1106.76ms p50 / 1666.56ms p95 branch create on the existing shared runtime droplet. Moving the same 10M-row loopback proof to a dedicated 4 vCPU / 8GB droplet improved that to 699.54ms p50 / 1150.54ms p95, with 0 source mutations.

| Layer | What it does | Claim status |
| --- | --- | --- |
| Prepared COW schema | Governed Postgres branch over approved tables with FK, trigger, RLS, write-control, cleanup, and proof checks. | Measured here: 100-way branch creation and branch-local write behavior. |
| External snapshot command | New backend seam that delegates create/destroy to a storage or page-branch provisioner and captures the returned branch connection string. | Follow-up proofs: dedicated loopback measured 699.54ms p50 / 1150.54ms p95; the real attached-volume run measured 799.86ms p50 / 997ms p95. Both used 10M-row fixtures and had 0 source mutations. |
| Bulk branch writes | The 1M/10M measurements are branch-local INSERT ... SELECT workloads after a branch exists. | They are not the same metric as full database clone time. |

Physical branch concurrency

The provider coordination bug is fixed; prepared physical branches cut the tail.

The first 25-way physical branch run exposed a real provider race: concurrent creates could reserve the same state window, fail, and leave branch subvolumes behind. The provider now reserves ports and state under an atomic lock, writes state atomically, and destroys by deterministic branch path if state is missing. After that patch, both the shared runtime droplet and the dedicated DB-sandbox droplet completed 25/25 concurrent branches with 0 deadlocks and 0 source mutations. The dedicated host reduced the 25-way create p95 from 60488.22ms to 10748.29ms. It is still loopback Btrfs, not a final attached-volume benchmark, but it proves the high tail was mostly host/contention pressure rather than a mutation safety failure.
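The fix described above combines two ideas: reserve under a single lock, and never leave a half-written state file. The sketch below shows both in a small self-contained form. It is an assumption-laden illustration: `reserve_port`, `write_state_atomically`, the JSON state layout, and the starting port are all hypothetical, and the `threading.Lock` only serializes threads in one process. A provider invoked as separate processes would need a cross-process lock instead.

```python
import json
import os
import tempfile
import threading

_reserve_lock = threading.Lock()


def write_state_atomically(path, state):
    """Write JSON to a temp file in the same directory, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # readers see the old file or the new one, never a partial write
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def reserve_port(state_path, branch, first_port=55000):
    """Read state, pick the lowest free port, and persist it, all under one lock."""
    with _reserve_lock:
        state = {}
        if os.path.exists(state_path):
            with open(state_path) as handle:
                state = json.load(handle)
        used = set(state.values())
        port = first_port
        while port in used:
            port += 1
        state[branch] = port
        write_state_atomically(state_path, state)
        return port


# Demo: 25 concurrent reservations against one state file.
state_file = os.path.join(tempfile.mkdtemp(), "branches.json")
ports = []


def worker(index):
    ports.append(reserve_port(state_file, f"branch-{index}"))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(25)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the read, the choice of port, and the write all happen inside the same critical section, two concurrent creates cannot reserve the same slot, which is the race the first 25-way run exposed.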

| Host | Run | Success | Create p50 / p95 | Integrity |
| --- | --- | --- | --- | --- |
| Shared runtime loopback | 5-way | 5/5 | 4659.37ms / 5186.03ms | 0 deadlocks / 0 source mutations |
| Shared runtime loopback | 10-way | 10/10 | 8725.63ms / 10340.99ms | 0 deadlocks / 0 source mutations |
| Shared runtime loopback | 25-way | 25/25 | 43903.14ms / 60488.22ms | 0 deadlocks / 0 source mutations |
| Dedicated droplet loopback | 5-way | 5/5 | 1123.65ms / 1270.88ms | 0 deadlocks / 0 source mutations |
| Dedicated droplet loopback | 10-way | 10/10 | 2211.99ms / 2600.4ms | 0 deadlocks / 0 source mutations |
| Dedicated droplet loopback | 25-way | 25/25 | 7195.72ms / 10748.29ms | 0 deadlocks / 0 source mutations |

The next pass warmed physical branches before the request path and leased them under the same create/destroy contract. That removes checkpoint, snapshot, and Postgres start from the hot path. On the dedicated loopback host, 20 concurrent warmed physical branches stayed under six seconds p95 while keeping 0 deadlocks and 0 source mutations. At 25-way, the host still crossed six seconds p95, so 25-way is a correctness result rather than the speed envelope.

| Prepared pool run | Success | Create p50 / p95 | Checkpoint / start on request | Integrity |
| --- | --- | --- | --- | --- |
| 5-way pooled | 5/5 | 468.87ms / 621.14ms | 0ms / 0ms | 0 deadlocks / 0 source mutations |
| 10-way pooled | 10/10 | 1021.21ms / 1476.83ms | 0ms / 0ms | 0 deadlocks / 0 source mutations |
| 20-way pooled | 20/20 | 3037.04ms / 5283.85ms | 0ms / 0ms | 0 deadlocks / 0 source mutations |
| 25-way pooled | 25/25 | 3764.22ms / 6401.23ms | 0ms / 0ms | 0 deadlocks / 0 source mutations |

The attached-volume run then moved the same provider from a loopback Btrfs file to a real DigitalOcean block volume. That pass caught two host-realism bugs before publication: long benchmark branch names exceeded PostgreSQL's Unix socket path limit, and /tmp socket directories needed to be owned by the branch Postgres user. After both fixes, the attached volume completed 100 serial branches plus the 20-way and 25-way concurrent warmed runs with 0 source mutations. On this 4 vCPU / 8GB droplet, 20-way is the honest sub-six-second envelope; 25-way is a correctness result that needs a larger storage host or smarter request scheduling.
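The socket path failure is easy to reproduce on paper. PostgreSQL places its socket file, named `.s.PGSQL.<port>`, inside the configured socket directory, and the full path has to fit in the operating system's `sockaddr_un.sun_path` buffer. The sketch below assumes the common Linux size of 108 bytes including the terminating NUL; the long data-directory path and the port number are made up for illustration.

```python
SUN_PATH_BYTES = 108  # assumption: Linux sockaddr_un.sun_path size, including the trailing NUL


def socket_file_path(socket_dir, port):
    """PostgreSQL names its Unix socket file .s.PGSQL.<port> inside the socket directory."""
    return f"{socket_dir}/.s.PGSQL.{port}"


def fits_socket_limit(socket_dir, port, limit=SUN_PATH_BYTES):
    """True when the full socket path, plus its NUL terminator, fits in sun_path."""
    return len(socket_file_path(socket_dir, port).encode("utf-8")) < limit


# Hypothetical branch data directory with a long generated benchmark name.
long_dir = "/var/lib/imladri/branches/" + "bench-fk-100way-" + "x" * 80 + "/pgdata"
# Short per-port directory in the style of the fix described in this article.
short_dir = "/tmp/imladri-pg-55432"

long_ok = fits_socket_limit(long_dir, 55432)
short_ok = fits_socket_limit(short_dir, 55432)
```

Keying the socket directory on the port rather than the branch name keeps the path length bounded no matter how long generated branch names become.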

| Real volume run | Success | Create p50 / p95 | Snapshot or hot-path clone | Integrity |
| --- | --- | --- | --- | --- |
| Attached volume serial | 100/100 | 799.86ms / 997ms | 53.19ms / 121.44ms | 0 source mutations |
| Attached volume pooled 20-way | 20/20 | 3153.63ms / 4768.74ms | 0ms / 0ms | 0 deadlocks / 0 source mutations |
| Attached volume pooled 25-way | 25/25 | 4545.72ms / 7130.54ms | 0ms / 0ms | 0 deadlocks / 0 source mutations |

Bugs caught

The pool caught production-class invalidation bugs before publication.

This is why the prepared-pool note is useful: the benchmark did not only produce faster numbers. It found production-class failure modes in branch naming, schema validation, benchmark refill behavior, and numeric fixture compatibility.

| Bug | What happened | Fix |
| --- | --- | --- |
| Pool-name collision | The first warm-pool naming pass let the storage-name normalizer truncate random suffixes, making prepared branch schema names collide under burst load. | The random suffix now lands before the truncation boundary, so warmed branch schemas stay unique. |
| Hot-path catalog validation | A defensive per-lease information_schema check pushed simple prepared branches back into hundreds of milliseconds under 100-way load. | The hot path now trusts in-process warm entries and invalidates against source schema fingerprints instead. |
| Background refill debt | The service refill loop correctly replenished warm branches, but benchmark runs were left with prepared schemas after completion. | The benchmark can disable refill while the production path keeps delayed refill for long-running services. |
| Numeric benchmark IDs | The 5B coverage verifier added a numeric expression index, and the heavy-write benchmark generated non-numeric branch IDs. | The benchmark now generates IDs that stay inside the numeric coverage contract. |
| Overlay unique lookup | A 5B-source bulk-write probe became slow because branch-local expression uniqueness was scanning the growing overlay row by row. | The branch overlay now gets lookup indexes that mirror source unique/expression indexes, removing the growing-overlay scan before the set-based bulk path. |
| Trigger-per-row bulk path | After the lookup fix, large INSERT ... SELECT workloads were still slow because the COW view trigger fired once per inserted row. | Lazy COW branches now use a set-based bulk insert path: resolve source metadata, validate uniqueness in sets, then insert directly into the overlay. |
| Postgres socket path length | The first attached-volume benchmark failed because long generated branch names pushed the Unix socket path over the PostgreSQL limit. | Branch Postgres sockets now live under a short /tmp/imladri-pg-<port> path instead of inside the branch data directory. |
| Socket ownership on real hosts | Moving sockets to /tmp exposed a real host permission issue: the provider created the socket directory as root while branch Postgres ran as the imladri user. | The provider now gives the branch Postgres user ownership of the socket directory, or falls back to a writable mode when no run user is configured. |

Limitations

The honest next target is partner-workload write throughput.

Prewarm cost is real: The pool shifts COW materialization out of the hot path. Production still needs sizing and refill policy per customer workload.

Bulk writes are now bounded: The follow-up set-based path completed 1M branch-local rows in 12.88s and 10M rows in 138.4s with 0 source mutations. Partner workloads that need higher sustained write throughput should get COPY-style ingestion and workload-specific overlay indexes.

Sub-six-second full clone needs a storage backend: Prepared COW branches prove governed table-level branching. The physical branch pool reached 20 concurrent warmed physical branches at 4768.74ms p95 on a real attached Btrfs volume; 25-way stayed correct but crossed the speed envelope on the current 4 vCPU / 8GB droplet.

Correctness stayed intact: The prepared-pool run still passed FK behavior, branch-local integrity, 0 deadlocks, 0 source mutations, and 0 scratch-schema leaks.

Evidence

The benchmark artifact is public.

The JSON artifact records the prepared pool hits, phase timings, FK behavior booleans, 100-way concurrency result, deadlock count, source mutation count, and the 1M/10M-row bulk-write follow-up. The physical branch artifact records the 100-sample self-hosted Btrfs droplet repeat and the checkpoint/snapshot/Postgres-start phase split. The dedicated droplet artifacts record the same provider first on an isolated loopback Btrfs mount, then on a real attached DigitalOcean Btrfs block volume.