Database sandbox / May 9, 2026

Imladri branched governed Postgres sandboxes over 1 billion rows with a 24.07ms p50.

Agents need database workspaces that are fast, disposable, and governed. This run tested whether Imladri's database sandbox could branch a verified billion-row Postgres source, allow branch-local writes, prove the source stayed untouched, and clean up afterward.

Abstract

We created a synthetic Postgres source table with exactly one billion generated rows and rebuilt its primary key after proving the generator at 1M, 10M, and 100M rows. The benchmark then created a governed copy-on-write branch through Imladri's storage_branch API, executed an approved write transaction inside the branch, confirmed the source table did not change, destroyed the branch, and exported the proof artifact.

The result is the useful shape for database agents: not a giant full copy, but a governed branch that can be created, written to, proven, and cleaned up while preserving the production source boundary.

Scale ladder

We stepped through 1M, 10M, 100M, then 1B rows.

The billion-row result was not a single blind jump. We used smaller runs to catch fixture and proof issues first, then promoted the same benchmark path to the D: volume for the final 1B-row run.

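| Run | Rows | Source size | COW create | Tx | Cleanup | Source mutations | Proof |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1M | 1,000,000 | 0.13 GiB | 17.15ms | 8.78ms | 5.81ms | 0 | 6/6 |
| 10M | 10,000,000 | 1.31 GiB | 17.97ms | 14.68ms | 7.55ms | 0 | 6/6 |
| 100M | 100,000,000 | 14.86 GiB | 26.42ms | 12.48ms | 56.38ms | 0 | 6/6 |
| 1B | 1,000,000,000 | 136.77 GiB | 22.96ms | 12.67ms | 7.10ms | 0 | 6/6 |

Bug caught and fixed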

The scale ladder found a real fixture bug before the 1B run.

The 10M-row step exposed an ID collision at the first nine-digit boundary. That was exactly why the benchmark needed row coverage proof before publishing timing numbers.

Caught: The first high-scale fixture rebuild failed when the synthetic ID generator created a duplicate key at B10000000.
Fixed: IDs now preserve every digit past the eight-digit padding boundary, so 10M, 100M, and 1B rows keep a unique contiguous ID range.
Guarded: The benchmark now refuses to report timing until exact row count, valid ID count, invalid ID count, min/max ID range, and source fingerprint checks pass.
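
To make the fix concrete, here is a minimal sketch of an ID formatter that pads short row numbers to eight digits but never truncates longer ones, plus a duplicate probe around each boundary in the scale ladder. The `B` prefix and eight-digit width are inferred from the reported key `B10000000`; the function names `make_id` and `first_duplicate` are hypothetical, and this is not the benchmark's actual generator.

```python
def make_id(n: int, width: int = 8) -> str:
    """Format row number n as a prefixed, zero-padded ID (assumed format).

    Python's format spec pads numbers shorter than `width` digits but never
    truncates longer ones, so IDs past the padding boundary stay distinct.
    """
    if n < 1:
        raise ValueError("row numbers start at 1")
    return f"B{n:0{width}d}"


def first_duplicate(ids):
    """Return the first repeated ID in an iterable, or None if all are unique."""
    seen = set()
    for value in ids:
        if value in seen:
            return value
        seen.add(value)
    return None


# Probe a small window of row numbers around each power-of-ten boundary.
boundaries = [10_000_000, 100_000_000, 1_000_000_000]
probe = [n for b in boundaries for n in range(b - 2, b + 3)]
ids = [make_id(n) for n in probe]

print(make_id(1))            # B00000001
print(make_id(100_000_000))  # B100000000
print(first_duplicate(ids))  # None
```

A probe like this only samples the boundaries. The exact coverage checks described in this article are what establish uniqueness over the full ID range.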
01

Build a real billion-row source

The benchmark loaded public.imladri_bench_customers into a separate local Postgres 17 cluster on the D: volume, then rebuilt the primary key over all generated IDs.

02

Prove coverage before timing

The run performed exact count and ID-range coverage checks: exactly one billion rows, one billion valid generated IDs, zero invalid IDs, and min/max IDs of 1 and 1,000,000,000.

03

Create the governed branch

The storage_cow backend created a copy-on-write Postgres branch through the same Imladri storage_branch API used by the product sandbox flow.

04

Write without touching source

The branch transaction inserted and updated inside the sandbox, returned proof metadata, then verified source mutations stayed at zero.

05

Clean up and export evidence

The branch was destroyed, cleanup proof completed, and the JSON artifact was published for inspection.
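
The five steps above can be illustrated with a toy in-memory model of copy-on-write branching. This is only a sketch of the idea: `ToyBranch` and `fingerprint` are hypothetical names, the source is a Python dict rather than a Postgres table, and nothing here reflects how Imladri's storage_branch API is implemented.

```python
import hashlib
import json
from types import MappingProxyType


def fingerprint(rows) -> str:
    """Stable SHA-256 over a mapping's sorted items."""
    payload = json.dumps(sorted(rows.items()), separators=(",", ":"))
    return hashlib.sha256(payload.encode()).hexdigest()


class ToyBranch:
    """Toy copy-on-write branch: reads fall through to the source,
    writes land only in a branch-local overlay."""

    def __init__(self, source):
        # Read-only view: creating the branch copies no rows.
        self._source = MappingProxyType(source)
        self._overlay = {}
        self.open = True

    def get(self, key):
        if key in self._overlay:
            return self._overlay[key]
        return self._source[key]

    def put(self, key, value):
        if not self.open:
            raise RuntimeError("branch already destroyed")
        self._overlay[key] = value

    def changed_keys(self):
        return sorted(self._overlay)

    def destroy(self):
        self._overlay.clear()
        self.open = False


source = {1: "alice", 2: "bob"}
before = fingerprint(source)

branch = ToyBranch(source)
branch.put(2, "bob-updated")  # update inside the branch
branch.put(3, "carol")        # insert inside the branch

assert branch.get(2) == "bob-updated"
assert branch.changed_keys() == [2, 3]
assert fingerprint(source) == before  # source mutations: 0

branch.destroy()
print(branch.open, source)  # False {1: 'alice', 2: 'bob'}
```

Because the branch holds a view of the source plus an overlay, the cost of creating it does not depend on how many rows the source contains. That is the property the benchmark measures at the branch creation boundary.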

Coverage proof

How we know rows were not skipped.

The benchmark does not rely on Postgres estimates. Before reporting the branch timings, it performs exact coverage checks over the generated ID space. Because the rebuilt primary key enforces uniqueness, the combination of exact count, valid generated ID count, invalid ID count, and min/max range proves the fixture covers every generated ID from 1 through 1,000,000,000.
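
The reasoning is a counting argument: if a unique key holds exactly N rows and every ID lies in 1 through N, then every integer in that range must be present. The sketch below shows the shape of such a check on a small fixture. It uses Python's built-in `sqlite3` as a stand-in for Postgres, treats IDs as plain integers, and the table name `bench_customers` and the functions `coverage_report` and `require_coverage` are hypothetical, not taken from the benchmark.

```python
import sqlite3


def coverage_report(conn, table, expected_rows):
    """Run exact (not estimated) coverage queries over an integer primary key.

    `table` is a trusted, hard-coded name in this sketch, so it is
    interpolated directly into the SQL text.
    """
    count, min_id, max_id, valid = conn.execute(
        f"SELECT COUNT(*), MIN(id), MAX(id), "
        f"SUM(CASE WHEN id BETWEEN 1 AND ? THEN 1 ELSE 0 END) FROM {table}",
        (expected_rows,),
    ).fetchone()
    valid = valid or 0  # SUM over an empty table returns NULL
    report = {
        "exact_rows": count,
        "valid_ids": valid,
        "invalid_ids": count - valid,
        "min_id": min_id,
        "max_id": max_id,
    }
    report["covered"] = (
        count == expected_rows
        and valid == expected_rows
        and report["invalid_ids"] == 0
        and min_id == 1
        and max_id == expected_rows
    )
    return report


def require_coverage(report):
    """Refuse to continue (for example, to report timings) unless coverage holds."""
    if not report["covered"]:
        raise RuntimeError(f"coverage check failed: {report}")
    return report


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bench_customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO bench_customers VALUES (?, ?)",
    ((i, f"c{i}") for i in range(1, 1001)),
)
report = coverage_report(conn, "bench_customers", 1000)
print(report["covered"])  # True

# Removing a single row breaks the exact count, so the check fails.
conn.execute("DELETE FROM bench_customers WHERE id = 500")
gap = coverage_report(conn, "bench_customers", 1000)
print(gap["covered"])  # False
```

Given uniqueness, the count and validity checks already imply the min/max result. Keeping the range check anyway gives an independent cross-check that is cheap to compute from the primary key index.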

Exact rows: 1,000,000,000
Valid generated IDs: 1,000,000,000
Invalid IDs: 0
Min / max ID: 1 / 1,000,000,000
Coverage fingerprint: 3644389ad3cdf775508ce91123480b29eaa490221dbdfb49eddbbf2d6f867bbe

Benchmark result

Copy-on-write is the scale path.

The first billion-row artifact used one iteration, so its min, p50, p95, and max fields are identical by definition: one sample has only one value. We kept that full comparison artifact, then ran a focused twenty-sample repeat check on storage_cow only.

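The full template database branch timed out, which is the expected failure mode for a full-copy approach. Copy-on-write branch creation, by contrast, showed no clear dependence on source size: across the scale ladder, create time stayed between 17.15ms and 26.42ms while the source grew from 1M to 1B rows.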
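
As a quick arithmetic check on that claim, the sketch below compares row growth with create-time growth using the COW create figures from the scale ladder table. The variable and function names are illustrative, and each ladder figure is a single run, so the ratios are indicative rather than statistically robust.

```python
# (run, rows, COW create time in ms), copied from the scale ladder table.
LADDER = [
    ("1M", 1_000_000, 17.15),
    ("10M", 10_000_000, 17.97),
    ("100M", 100_000_000, 26.42),
    ("1B", 1_000_000_000, 22.96),
]


def growth(ladder):
    """Return (run, row multiple, create-time multiple) relative to the first run."""
    base_rows, base_ms = ladder[0][1], ladder[0][2]
    return [
        (run, rows // base_rows, round(ms / base_ms, 2))
        for run, rows, ms in ladder
    ]


for run, row_multiple, time_multiple in growth(LADDER):
    print(f"{run:>4}: rows x{row_multiple:<5} create x{time_multiple}")
```

The source grows by a factor of 1,000 between the first and last run, while the create time grows by roughly a third.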

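| Case | Backend | Create | Tx | Cleanup | Source mutations | Proof |
| --- | --- | --- | --- | --- | --- | --- |
| storage_cow | postgres_copy_on_write_schema | 22.96ms | 12.67ms | 7.10ms | 0 | 6/6 |
| copy_on_write | overlay schema | 657.21ms | 18.96ms | 7.36ms | 0 | 6/6 |
| materialized | bounded 5k-row copy | 2998.13ms | 67.42ms | 48.16ms | 0 | 6/6 |
| storage_template | full template database | timed out | n/a | n/a | n/a | n/a |

Repeat timing check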

Twenty storage-COW samples over the verified 1B source.

The repeat run reused the verified coverage artifact instead of rescanning the 136.77 GiB source table, then executed twenty governed storage-COW branches against the same source.

The first sample was the cold outlier: it set the maximum create time (210.63ms) and the maximum transaction time (36.54ms). After that warmup, create times stayed between 22.66ms and 29.56ms. Every sample preserved the source with 0 mutations and kept the proof score at 6/6.
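
The summary statistics for this run can be recomputed from the twenty per-run samples. The sketch below uses the nearest-rank percentile method, which reproduces the published min, p50, p95, and max values. The article does not state which percentile method the benchmark uses, so treat that choice as an assumption. The function names are illustrative.

```python
# Per-run samples in run order (ms), copied from the repeat timing table.
CREATE_MS = [210.63, 22.74, 25.66, 22.66, 25.43, 23.68, 23.51, 24.07, 23.53, 24.63,
             26.75, 26.88, 29.56, 24.28, 22.93, 23.57, 23.89, 27.05, 26.10, 22.68]
TX_MS = [36.54, 16.82, 11.86, 13.03, 13.05, 10.59, 12.13, 10.68, 11.19, 9.68,
         11.59, 9.61, 14.78, 12.90, 9.68, 10.00, 11.94, 10.01, 9.42, 10.09]
CLEANUP_MS = [17.90, 8.23, 9.42, 7.70, 8.50, 9.05, 21.02, 12.45, 11.33, 10.69,
              10.64, 7.44, 9.59, 9.22, 9.16, 9.94, 8.20, 8.57, 9.29, 7.92]


def nearest_rank(samples, pct: int):
    """Nearest-rank percentile for an integer pct in 1..100.

    The rank is ceil(pct * n / 100), computed in integer arithmetic to
    avoid floating-point rounding at exact boundaries.
    """
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(samples):
    """Return (min, p50, p95, max) for a list of timings."""
    return (min(samples), nearest_rank(samples, 50),
            nearest_rank(samples, 95), max(samples))


for name, samples in [("create", CREATE_MS), ("tx", TX_MS), ("cleanup", CLEANUP_MS)]:
    lo, p50, p95, hi = summarize(samples)
    print(f"{name:>7}: {lo}ms / {p50}ms / {p95}ms / {hi}ms")
```

With twenty samples, nearest-rank p95 is the nineteenth-smallest value, so a single cold outlier affects the max but not the p95.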

Samples: 20
Create min / p50 / p95 / max: 22.66ms / 24.07ms / 29.56ms / 210.63ms
Tx min / p50 / p95 / max: 9.42ms / 11.19ms / 16.82ms / 36.54ms
Cleanup min / p50 / p95 / max: 7.44ms / 9.22ms / 17.90ms / 21.02ms
Cold first run: 210.63ms create / 36.54ms tx / 17.90ms cleanup
Subsequent create range: 22.66ms to 29.56ms
Source mutations: 0
Proof checks: 6/6

| Run | Create | Tx | Cleanup | Source mutations | Proof |
| --- | --- | --- | --- | --- | --- |
| 1 | 210.63ms | 36.54ms | 17.90ms | 0 | 6/6 |
| 2 | 22.74ms | 16.82ms | 8.23ms | 0 | 6/6 |
| 3 | 25.66ms | 11.86ms | 9.42ms | 0 | 6/6 |
| 4 | 22.66ms | 13.03ms | 7.70ms | 0 | 6/6 |
| 5 | 25.43ms | 13.05ms | 8.50ms | 0 | 6/6 |
| 6 | 23.68ms | 10.59ms | 9.05ms | 0 | 6/6 |
| 7 | 23.51ms | 12.13ms | 21.02ms | 0 | 6/6 |
| 8 | 24.07ms | 10.68ms | 12.45ms | 0 | 6/6 |
| 9 | 23.53ms | 11.19ms | 11.33ms | 0 | 6/6 |
| 10 | 24.63ms | 9.68ms | 10.69ms | 0 | 6/6 |
| 11 | 26.75ms | 11.59ms | 10.64ms | 0 | 6/6 |
| 12 | 26.88ms | 9.61ms | 7.44ms | 0 | 6/6 |
| 13 | 29.56ms | 14.78ms | 9.59ms | 0 | 6/6 |
| 14 | 24.28ms | 12.90ms | 9.22ms | 0 | 6/6 |
| 15 | 22.93ms | 9.68ms | 9.16ms | 0 | 6/6 |
| 16 | 23.57ms | 10.00ms | 9.94ms | 0 | 6/6 |
| 17 | 23.89ms | 11.94ms | 8.20ms | 0 | 6/6 |
| 18 | 27.05ms | 10.01ms | 8.57ms | 0 | 6/6 |
| 19 | 26.10ms | 9.42ms | 9.29ms | 0 | 6/6 |
| 20 | 22.68ms | 10.09ms | 7.92ms | 0 | 6/6 |

What this means

Database agents need governed branches, not shared production writes.

Coding and data agents increasingly need to test migrations, generated SQL, cleanup jobs, and application code against realistic data. The dangerous pattern is letting an agent touch production state directly. Imladri's database sandbox gives the agent a branch, routes writes through the policy boundary, records what happened, and destroys the workspace after the run.

This does not replace purpose-built storage engines or snapshot systems. It gives Imladri a governed API layer that can sit above those engines: create branch, permit or deny actions, record proof, clean up, and show what changed.

Evidence

Download the artifact.

The public JSON contains the generated timestamp, fixture size, row coverage, benchmark samples, backend failure, and proof scores from the run.

Caveats

What this does and does not prove.

  • This is a local Postgres copy-on-write benchmark, not a managed distributed database benchmark.
  • The source table was synthetic and about 136.77 GiB, not a multi-terabyte customer production database.
  • Full template database branching timed out, which is expected. The result supports the COW/storage-branch direction, not full-copy cloning.
  • The long part was building and verifying the billion-row fixture. The measured branch creation happened after that verification completed.