Bank Transaction Benchmark¶
A realistic banking workload benchmark comparing five embedded key-value storage engines on atomic transactional operations modeled after TPC-B.
Workload¶
Each successful transfer performs 5 key-value operations in a single atomic transaction:
- Read source account balance
- Read destination account balance
- Update source balance (debit)
- Update destination balance (credit)
- Insert transaction log entry (big-endian sequence number key with transfer details)
This is modeled on the TPC-B debit-credit pattern (3 updates + 1 select + 1 insert), here reduced to 2 reads + 2 updates + 1 insert, and exercises both random-access updates and sequential-key inserts within the same transaction.
- 1,000,000 accounts with random names (dictionary words + synthetic binary/decimal keys)
- 10,000,000 transfer attempts per phase (6,856,951 successful in write-only phase)
- Triangular access distribution -- some accounts are "hot," mimicking real-world Pareto-like skew
- Deterministic -- identical RNG seed ensures every engine processes the exact same workload
- Validated -- balance conservation and transaction log entry count verified after completion
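The five-operation transfer described above can be sketched against a plain dictionary. This is a minimal illustration, not the benchmark's code: the `log/` key prefix, the value encoding, and the failure rule (insufficient funds or a self-transfer) are assumptions made for the example.

```python
import struct

def log_key(seq: int) -> bytes:
    # 8-byte big-endian encoding: byte-wise key order equals numeric order,
    # so successive log inserts always land at the end of the key range.
    return b"log/" + struct.pack(">Q", seq)  # "log/" prefix is hypothetical

def transfer(db: dict, seq: int, src: str, dst: str, amount: int) -> bool:
    """Apply one transfer attempt; return True if it succeeded."""
    src_balance = db[src]                     # 1. read source balance
    dst_balance = db[dst]                     # 2. read destination balance
    if src == dst or amount > src_balance:    # assumed failure rule
        return False                          # failed attempt writes nothing
    db[src] = src_balance - amount            # 3. update source (debit)
    db[dst] = dst_balance + amount            # 4. update destination (credit)
    db[log_key(seq)] = f"{src}>{dst}:{amount}".encode()  # 5. insert log entry
    return True

db = {"alice": 1_000_000, "bob": 1_000_000}
ok = transfer(db, 1, "alice", "bob", 250)
total = db["alice"] + db["bob"]  # balance conservation: transfers never change the sum
```

A real engine would wrap the five operations in one atomic transaction; the dictionary only shows the access pattern and why the total balance is a usable validation invariant.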
Fairness Controls¶
All engines use identical batching and sync parameters:
| Parameter | Value |
|---|---|
| Batch size | 100 transfers per commit |
| Sync frequency | Every 100 commits |
| Sync mode | none (no forced durability) |
| Initial balance | 1,000,000 per account |
| RNG seed | 12345 |
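For a sense of scale, the batching parameters above translate into the following commit and sync cadence. `commit_schedule` is a hypothetical helper written for this example, and it assumes a trailing partial batch still gets its own commit.

```python
def commit_schedule(attempts: int, batch_size: int, sync_every: int):
    """Return (commits, sync_points) for a run of `attempts` transfers."""
    commits = -(-attempts // batch_size)  # ceiling division
    # sync_every == 0 is treated as "never sync"
    sync_points = commits // sync_every if sync_every else 0
    return commits, sync_points

commits, sync_points = commit_schedule(10_000_000, 100, 100)
print(f"{commits:,} commits, {sync_points:,} sync points")
```

With sync mode set to none, the sync points presumably do not force data to stable storage, so they mark cadence rather than durability.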
Results: Apple M5 Max (ARM64, macOS)¶
Transaction Throughput¶
The core metric: sustained transfers per second over 10M transfer attempts. Each successful transfer performs 2 reads + 2 updates + 1 insert.
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB'}}}}%%
xychart-beta
title "Transaction Throughput — Write-Only (transfers/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Transfers per second" 0 --> 400000
bar [376691, 356703, 225937, 126341, 57138]
| Engine | Transfers/sec | KV Ops/sec | Relative |
|---|---|---|---|
| PsiTri | 376,691 | 1,883,455 | 1.00x |
| PsiTriRocks | 356,703 | 1,783,515 | 0.95x |
| TidesDB | 225,937 | 1,129,685 | 0.60x |
| RocksDB | 126,341 | 631,705 | 0.34x |
| MDBX | 57,138 | 285,690 | 0.15x |
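Each successful transfer performs 5 key-value operations (2 reads + 2 updates + 1 insert), and the KV Ops/sec column is computed as 5x the transfer rate. Because the transfer rate counts all 10M attempts (10M / 26.5 s ≈ 377K/s for PsiTri) and only about 6.86M of them succeeded, the column is best read as an upper bound on the KV operations actually executed.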
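The derived columns in the throughput table can be reproduced directly from the transfer rates. The sketch below only restates the table's arithmetic; the variable names are illustrative.

```python
OPS_PER_TRANSFER = 5  # 2 reads + 2 updates + 1 insert

transfers_per_sec = {
    "PsiTri": 376_691,
    "PsiTriRocks": 356_703,
    "TidesDB": 225_937,
    "RocksDB": 126_341,
    "MDBX": 57_138,
}

baseline = transfers_per_sec["PsiTri"]
for engine, rate in transfers_per_sec.items():
    kv_ops = rate * OPS_PER_TRANSFER   # KV Ops/sec column
    relative = rate / baseline         # Relative column
    print(f"{engine:12s} {kv_ops:>10,d} KV ops/sec  {relative:.2f}x")
```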
PsiTri's DWAL layer absorbs burst writes in an in-memory ART (adaptive radix tree) buffer, then drains them to the COW trie via background merge threads. This decouples write latency from COW cost: the writer thread never blocks on trie mutations. There are no compaction stalls and no memtable flushes; the ART buffer provides sub-microsecond commit latency while the merge pipeline sustains throughput.
Bulk Load¶
Inserting 1M accounts with initial balances in a single batch transaction.
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#16A34A'}}}}%%
xychart-beta
title "Bulk Load — 1M Accounts (ops/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Operations per second" 0 --> 3500000
bar [3060276, 1795315, 959912, 1961333, 3078055]
| Engine | Time | Ops/sec |
|---|---|---|
| PsiTri | 0.33s | 3.06M |
| MDBX | 0.33s | 3.08M |
| RocksDB | 0.51s | 1.96M |
| PsiTriRocks | 0.56s | 1.80M |
| TidesDB | 1.04s | 0.96M |
Transaction Time¶
Wall-clock time for the 10M transfer phase.
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#DC2626'}}}}%%
xychart-beta
title "Transaction Phase Wall Time — Write-Only (seconds)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Seconds" 0 --> 180
bar [26.547, 28.034, 44.260, 79.151, 175.012]
| Engine | Time | vs. PsiTri |
|---|---|---|
| PsiTri | 26.5s | -- |
| PsiTriRocks | 28.0s | +5.6% |
| TidesDB | 44.3s | +67% |
| RocksDB | 79.2s | +198% |
| MDBX | 175.0s | +559% |
PsiTri completes the same work in under one-sixth of the time MDBX requires (6.6x faster).
Validation Scan¶
Full scan reading all 1M accounts plus ~6.9M transaction log entries.
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#9333EA'}}}}%%
xychart-beta
title "Validation Scan — 7.9M Entries (ops/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Operations per second" 0 --> 115000000
bar [27619781, 37644460, 1843029, 14153353, 109856761]
| Engine | Time | Ops/sec |
|---|---|---|
| MDBX | 0.072s | 109.9M |
| PsiTriRocks | 0.209s | 37.6M |
| PsiTri | 0.284s | 27.6M |
| RocksDB | 0.555s | 14.2M |
| TidesDB | 4.263s | 1.84M |
MDBX dominates sequential scanning -- its B+tree stores keys in sorted order with contiguous leaf pages.
Storage Efficiency¶
Reachable Data Size¶
The theoretical minimum raw data size is 275 MB. This chart shows bytes occupied by live, reachable objects.
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB'}}}}%%
xychart-beta
title "Reachable Data Size (MB)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Megabytes" 0 --> 400
bar [303, 304, 263, 314, 366]
| Engine | Reachable Data | vs. Theoretical (275 MB) | File Size |
|---|---|---|---|
| TidesDB | 263 MB | 0.96x | 263 MB |
| PsiTri | 303 MB | 1.10x | 1,088 MB |
| PsiTriRocks | 304 MB | 1.11x | 1,210 MB |
| RocksDB | 314 MB | 1.14x | 320 MB |
| MDBX | 366 MB | 1.33x | 640 MB |
PsiTri's reachable data is only 1.10x the theoretical minimum, thanks to graduated leaf node sizing.
File Size¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#DC2626'}}}}%%
xychart-beta
title "On-Disk File Size (MB)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Megabytes" 0 --> 1300
bar [1088, 1210, 263, 320, 640]
PsiTri's file size is 3.6x its reachable data. The dead space consists of copy-on-write duplicates awaiting compaction and allocator free space within segments. The background compactor keeps growth bounded during sustained write workloads.
Concurrent Read Performance¶
The benchmark runs a second phase with a concurrent reader thread performing Pareto-distributed point lookups.
Write Throughput Under Read Load¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB'}}}}%%
xychart-beta
title "Write Throughput: Write-Only vs Write+Read (tx/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Transfers per second" 0 --> 400000
bar [376691, 356703, 225937, 126341, 57138]
bar [382416, 250564, 228892, 124864, 51858]
| Engine | Write-Only | Write+Read | Write Impact | Reader reads/sec |
|---|---|---|---|---|
| PsiTri | 376,691 | 382,416 | +1.5% | 1,719,249 |
| PsiTriRocks | 356,703 | 250,564 | -29.8% | 1,403,776 |
| TidesDB | 225,937 | 228,892 | +1.3% | 457,579 |
| RocksDB | 126,341 | 124,864 | -1.2% | 329,917 |
| MDBX | 57,138 | 51,858 | -9.2% | 1,750,952 |
PsiTri shows no write degradation from concurrent reads (+1.5%). Its memory-mapped MVCC architecture lets readers access the same physical pages as the writer without locking.
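The Write Impact column is the percent change in writer throughput once the reader thread is running. A short sketch of that calculation, using the figures from the table above (`write_impact` is an illustrative helper name):

```python
def write_impact(write_only: float, write_read: float) -> float:
    """Percent change in writer throughput when a reader runs concurrently."""
    return (write_read / write_only - 1.0) * 100.0

results = {  # engine: (write-only tx/sec, write+read tx/sec)
    "PsiTri": (376_691, 382_416),
    "PsiTriRocks": (356_703, 250_564),
    "TidesDB": (225_937, 228_892),
    "RocksDB": (126_341, 124_864),
    "MDBX": (57_138, 51_858),
}

impacts = {engine: round(write_impact(*pair), 1) for engine, pair in results.items()}
for engine, pct in impacts.items():
    print(f"{engine:12s} {pct:+.1f}%")
```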
Reader Throughput¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#9333EA'}}}}%%
xychart-beta
title "Concurrent Reader Throughput (reads/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Reads per second" 0 --> 1800000
bar [1719249, 1403776, 457579, 329917, 1750952]
Summary¶
| Engine | Architecture | Strength | Weakness |
|---|---|---|---|
| PsiTri | Radix/B-tree hybrid, mmap copy-on-write | Fastest transactions (377K/s), zero read contention | Larger file footprint |
| PsiTriMDBX | PsiTri via MDBX API shim | 12x faster writes than MDBX, drop-in replacement | Reads ~9% slower than native MDBX in buffered mode |
| PsiTriRocks | PsiTri via RocksDB API shim | Drop-in RocksDB replacement | 30% write penalty with concurrent reads |
| TidesDB | Skip-list + SSTables | Good tx speed (226K/s), compact | Slow scan, 100K txn op limit |
| RocksDB | LSM-tree | Compact storage, minimal read impact | 3.0x slower than PsiTri |
| MDBX | B+tree, MVCC copy-on-write | Fastest scan (110M/s) and reads (1.75M/s) | 6.6x slower transactions |
All five engines pass validation: balance conservation verified and transaction log entry counts match across both phases.
PsiTriMDBX: Drop-In MDBX Replacement¶
PsiTriMDBX provides an MDBX-compatible C/C++ API backed by PsiTri's DWAL layer. The same
mdbx_put / mdbx_get / mdbx_cursor_get calls work unchanged — only the linker target
changes. This section compares PsiTriMDBX against native libmdbx using the identical bank
benchmark binary (same source, different link target).
Workload: 100K accounts, 1M transactions, batch=100, sync=none, seed=12345.
Write Throughput¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#DC2626'}}}}%%
xychart-beta
title "MDBX API: Write-Only Throughput (tx/sec)"
x-axis ["PsiTriMDBX", "Native MDBX"]
y-axis "Transfers per second" 0 --> 500000
bar [452716, 37853]
| Engine | Write-Only | Write+Read (latest) | Ratio |
|---|---|---|---|
| PsiTriMDBX | 452,716 tx/s | 373,476 tx/s | 12x / 4.7x faster |
| Native MDBX | 37,853 tx/s | 79,081 tx/s | 1.0x |
PsiTriMDBX writes are 12x faster than native MDBX. The DWAL layer absorbs writes into an in-memory btree and batches them for background merge into the persistent trie, avoiding MDBX's page-level copy-on-write overhead.
Read Modes¶
PsiTriMDBX exposes three read modes that control which DWAL layers are searched on each
mdbx_get call (see MDBX migration guide):
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#16A34A,#7C3AED,#DC2626'}}}}%%
xychart-beta
title "MDBX API: Reader Throughput by Read Mode (reads/sec)"
x-axis ["Trie", "Buffered", "Latest", "Native MDBX"]
y-axis "Reads per second" 0 --> 1600000
bar [1534646, 1165690, 323626, 1276659]
| Read Mode | Layers | Reader (reads/s) | Write+Read (tx/s) | Write Impact |
|---|---|---|---|---|
| Trie | Tri only | 1,535K | 260K | -42% |
| Buffered | RO + Tri | 1,166K | 254K | -39% |
| Latest | RW + RO + Tri | 324K | 373K | -20% |
| Native MDBX | B+tree | 1,277K | 79K | +109% |
- Trie mode reads only the merged COW trie — 20% faster than native MDBX on point lookups, with zero lock contention. The tradeoff: very recently committed data may not be visible until the background merge completes.
- Buffered mode adds the RO btree snapshot — within 10% of native MDBX read speed, still with no lock contention with writers.
- Latest mode checks all three layers including the active RW btree (shared lock). Lowest write impact (-20%) but slowest reads due to writer contention.
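The three read modes differ only in which layers a lookup consults, newest first. The sketch below models each layer as a dictionary; the layer names follow the table above, while `READ_MODES`, `layered_get`, and the dictionary representation are assumptions for illustration and not the PsiTriMDBX API. Deletions (tombstones) and locking are left out.

```python
READ_MODES = {
    "trie": ("tri",),                  # merged COW trie only
    "buffered": ("ro", "tri"),         # frozen RO snapshot, then trie
    "latest": ("rw", "ro", "tri"),     # active RW buffer, RO snapshot, trie
}

_MISSING = object()

def layered_get(layers: dict, key, mode: str, default=None):
    """Return the newest value for `key` visible under the given read mode."""
    for name in READ_MODES[mode]:      # search newest layer first
        value = layers[name].get(key, _MISSING)
        if value is not _MISSING:
            return value
    return default

layers = {
    "rw": {"a": 3},                        # just committed, still in the write buffer
    "ro": {"a": 2, "b": 20},               # frozen snapshot awaiting merge
    "tri": {"a": 1, "b": 10, "c": 100},    # fully merged data
}

for mode in READ_MODES:
    print(mode, layered_get(layers, "a", mode))
```

The same key returns a progressively fresher value as more layers are searched, which is the visibility tradeoff the list above describes.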
Total Wall Time¶
| Engine | Total | vs. Native MDBX |
|---|---|---|
| PsiTriMDBX (latest) | 4.9s | 8.3x faster |
| PsiTriMDBX (buffered) | 6.8s | 6.0x faster |
| Native MDBX | 40.8s | 1.0x |
Both engines pass validation: balance conservation and log entry counts verified.
Results: AMD EPYC-Turin (x86-64, Linux)¶
Environment¶
| Component | Spec |
|---|---|
| CPU | AMD EPYC-Turin, 8 cores / 16 threads (SMT), 2.4 GHz |
| ISA extensions | SSE2, SSE4.1, SSSE3, AVX2, AVX-512F/BW/DQ/VL/VBMI/VNNI |
| RAM | 121 GB |
| Storage | 960 GB virtual disk (cloud VM — Linux 6.17, 4 KB page size) |
| Compiler | Clang 20, C++20, -O3 -flto -march=native |
Transaction Throughput¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB'}}}}%%
xychart-beta
title "Transaction Throughput — Write-Only (transfers/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Transfers per second" 0 --> 280000
bar [163614, 159938, 167093, 104102, 239041]
| Engine | Transfers/sec | KV Ops/sec | Relative to PsiTri |
|---|---|---|---|
| PsiTri | 163,614 | 818,070 | 1.00x |
| PsiTriRocks | 159,938 | 799,690 | 0.98x |
| TidesDB | 167,093 | 835,465 | 1.02x |
| RocksDB | 104,102 | 520,510 | 0.64x |
| MDBX | 239,041 | 1,195,205 | 1.46x |
On this x86 cloud VM, MDBX's B+tree layout benefits from 4 KB OS pages and x86 hardware prefetchers. The gap is largely a page-copy artifact: MDBX copies one page per write, and 4 KB pages on x86 cost 4x less than the 16 KB pages on M5 Max (see cross-platform comparison).
Bulk Load¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#16A34A'}}}}%%
xychart-beta
title "Bulk Load — 1M Accounts (ops/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Operations per second" 0 --> 3500000
bar [2876212, 960000, 630000, 1160000, 2230000]
| Engine | Time | Ops/sec |
|---|---|---|
| PsiTri | 0.35s | 2.88M |
| MDBX | 0.45s | 2.23M |
| RocksDB | 0.86s | 1.16M |
| PsiTriRocks | 1.05s | 0.96M |
| TidesDB | 1.58s | 0.63M |
PsiTri leads bulk load on x86: sequential arena writes benefit from AVX-512 copy_branches (9x speedup over scalar).
Transaction Time (Write-Only Phase)¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#DC2626'}}}}%%
xychart-beta
title "Transaction Phase Wall Time — Write-Only (seconds)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Seconds" 0 --> 110
bar [61.1, 62.5, 59.8, 96.1, 41.8]
| Engine | Time | vs. PsiTri |
|---|---|---|
| PsiTri | 61.1s | -- |
| PsiTriRocks | 62.5s | +2% |
| TidesDB | 59.8s | -2% |
| RocksDB | 96.1s | +57% |
| MDBX | 41.8s | -32% |
Validation Scan¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#9333EA'}}}}%%
xychart-beta
title "Validation Scan — 7.9M Entries (ops/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Operations per second" 0 --> 42000000
bar [19628559, 19300000, 1200000, 11200000, 38100000]
| Engine | Time | Ops/sec |
|---|---|---|
| MDBX | 0.38s | 38.1M |
| PsiTri | 0.73s | 19.6M |
| PsiTriRocks | 0.74s | 19.3M |
| RocksDB | 1.28s | 11.2M |
| TidesDB | 11.9s | 1.20M |
PsiTri and PsiTriRocks are nearly identical on scan, and both comfortably ahead of RocksDB and TidesDB.
Concurrent Read Performance¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB'}}}}%%
xychart-beta
title "Write Throughput: Write-Only vs Write+Read (tx/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Transfers per second" 0 --> 280000
bar [163614, 159938, 167093, 104102, 239041]
bar [181168, 121276, 163275, 87423, 229280]
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#9333EA'}}}}%%
xychart-beta
title "Concurrent Reader Throughput (reads/sec)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Reads per second" 0 --> 1200000
bar [1117771, 785868, 326596, 223149, 927734]
| Engine | Write-Only | Write+Read | Write Impact | Reader reads/sec |
|---|---|---|---|---|
| PsiTri | 163,614 | 181,168 | +10.7% | 1,117,771 |
| PsiTriRocks | 159,938 | 121,276 | -24.2% | 785,868 |
| TidesDB | 167,093 | 163,275 | -2.3% | 326,596 |
| RocksDB | 104,102 | 87,423 | -16.0% | 223,149 |
| MDBX | 239,041 | 229,280 | -4.1% | 927,734 |
PsiTri shows no write degradation from concurrent reads (+10.7%), consistent with its lock-free MVCC architecture. PsiTri delivers the highest reader throughput of any engine on this platform (1.12M reads/sec), ahead of MDBX (928K).
Storage¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB'}}}}%%
xychart-beta
title "Reachable Data Size (MB)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Megabytes" 0 --> 1400
bar [561, 558, 675, 567, 1257]
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#DC2626'}}}}%%
xychart-beta
title "On-Disk File Size (MB)"
x-axis ["PsiTri", "PsiTriRocks", "TidesDB", "RocksDB", "MDBX"]
y-axis "Megabytes" 0 --> 2500
bar [2080, 2298, 675, 579, 1344]
| Engine | Reachable | File Size |
|---|---|---|
| PsiTriRocks | 558 MB | 2,298 MB |
| PsiTri | 561 MB | 2,080 MB |
| RocksDB | 567 MB | 579 MB |
| TidesDB | 675 MB | 675 MB |
| MDBX | 1,257 MB | 1,344 MB |
PsiTri and PsiTriRocks have the most compact reachable data. The larger file size reflects COW free space awaiting compaction, consistent with the M5 Max results.
Commit Granularity¶
Batch size controls how many transfers are grouped into a single atomic commit. Larger batches amortize commit overhead and improve throughput, but at the cost of read-visibility latency: a concurrent reader cannot see any transfer in the batch until the whole batch commits.
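The visibility tradeoff can be illustrated with a toy snapshot store in which a commit publishes a whole batch by swapping one root reference. `SnapshotStore` is hypothetical and copies the entire map on commit, which is far coarser than any engine measured here; it only shows why a reader sees either none or all of a batch.

```python
class SnapshotStore:
    """Toy MVCC store: readers pin a root, commit publishes a new root."""

    def __init__(self):
        self._root = {}      # committed state, never mutated in place
        self._pending = {}   # writes belonging to the open batch

    def put(self, key, value):
        self._pending[key] = value

    def snapshot(self):
        # A reader keeps using this dict even after later commits.
        return self._root

    def commit(self):
        new_root = dict(self._root)    # copy-on-write at whole-map granularity
        new_root.update(self._pending)
        self._root = new_root          # one reference swap publishes the batch
        self._pending = {}

store = SnapshotStore()
store.put("alice", 90)
store.put("bob", 110)
before = store.snapshot()   # taken mid-batch: sees neither write
store.commit()
after = store.snapshot()    # sees both writes together
```

The longer the batch, the longer `before` remains the newest thing a reader can observe.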
Note: In PsiTri, uncommitted writes are buffered in the DWAL's in-memory ART map and backed by a WAL file for durability. Committing appends to the WAL buffer with no I/O; the background merge thread drains committed data to the COW trie asynchronously. In RocksDB, large batches mirror the memtable (up to ~64 MB for a 1M-transfer batch). TidesDB has a hard limit of 100K ops per transaction; at 5 ops per transfer that allows at most ~20K transfers per commit, which rules out batch=100K and batch=1M. It was measured only at batch=100 and is excluded from the scaling charts.
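The TidesDB limit in the note reduces to simple arithmetic. The sketch assumes the worst case in which every transfer in a batch succeeds and issues all five operations, and it treats the 100K limit as inclusive; both are assumptions.

```python
OPS_PER_TRANSFER = 5       # 2 reads + 2 updates + 1 insert
TXN_OP_LIMIT = 100_000     # TidesDB per-transaction limit quoted in the note

max_transfers_per_commit = TXN_OP_LIMIT // OPS_PER_TRANSFER

# Which of the swept batch sizes stay within the limit?
fits = {
    batch: batch * OPS_PER_TRANSFER <= TXN_OP_LIMIT
    for batch in (100, 10_000, 100_000, 1_000_000)
}
print(max_transfers_per_commit, fits)
```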
Write-Only Throughput vs Batch Size¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#7C3AED,#DC2626,#D97706'}}}}%%
xychart-beta
title "Write-Only Throughput vs Batch Size (tx/sec)"
x-axis ["100", "10K", "100K", "1M"]
y-axis "Transfers per second" 0 --> 650000
line [163614, 254421, 481361, 593556]
line [159186, 212378, 274096, 360102]
line [242402, 286117, 475811, 603124]
line [108884, 159623, 144383, 185796]
Series order: PsiTri (blue), PsiTriRocks (purple), MDBX (red), RocksDB (amber)
| Engine | batch=100 | batch=10K | batch=100K | batch=1M | Peak gain |
|---|---|---|---|---|---|
| PsiTri | 163,614 | 254,421 | 481,361 | 593,556 | +263% |
| MDBX | 242,402 | 286,117 | 475,811 | 603,124 | +149% |
| PsiTriRocks | 159,186 | 212,378 | 274,096 | 360,102 | +126% |
| RocksDB | 108,884 | 159,623 | 144,383 | 185,796 | +71% |
| TidesDB | 180,654 | — | — | — | N/A |
PsiTri and MDBX both reach ~600K tx/sec at batch=1M and scale similarly — PsiTri benefits from amortizing the root-pointer CAS, while MDBX amortizes its B+tree page COW. RocksDB shows a dip at batch=100K before recovering at 1M, likely reflecting memtable pressure interacting with compaction.
Write+Read Throughput vs Batch Size¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#7C3AED,#DC2626,#D97706'}}}}%%
xychart-beta
title "Write+Read Throughput vs Batch Size (tx/sec)"
x-axis ["100", "10K", "100K", "1M"]
y-axis "Transfers per second" 0 --> 550000
line [181168, 272840, 468694, 509351]
line [117094, 177691, 230013, 361591]
line [263524, 292664, 444587, 490456]
line [119185, 138015, 144091, 181406]
Series order: PsiTri (blue), PsiTriRocks (purple), MDBX (red), RocksDB (amber)
| Engine | batch=100 | batch=10K | batch=100K | batch=1M |
|---|---|---|---|---|
| PsiTri | 181,168 | 272,840 | 468,694 | 509,351 |
| MDBX | 263,524 | 292,664 | 444,587 | 490,456 |
| PsiTriRocks | 117,094 | 177,691 | 230,013 | 361,591 |
| RocksDB | 119,185 | 138,015 | 144,091 | 181,406 |
| TidesDB | 178,478 | — | — | — |
PsiTri overtakes MDBX in write+read mode at batch=100K and beyond. MDBX's reader/writer lock becomes relatively more costly as the write thread holds the lock for longer batches, while PsiTri's lock-free MVCC is unaffected.
Concurrent Reader Throughput vs Batch Size¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#7C3AED,#DC2626,#D97706'}}}}%%
xychart-beta
title "Concurrent Reader Throughput vs Batch Size (reads/sec)"
x-axis ["100", "10K", "100K", "1M"]
y-axis "Reads per second" 0 --> 1800000
bar [1117771, 1278498, 1358428, 1356727]
bar [872316, 1095787, 1388806, 1669767]
bar [1055486, 1166392, 1306705, 1100886]
bar [275903, 324247, 347816, 296904]
Series order: PsiTri (blue), PsiTriRocks (purple), MDBX (red), RocksDB (amber)
| Engine | batch=100 | batch=10K | batch=100K | batch=1M |
|---|---|---|---|---|
| PsiTri | 1,117,771 | 1,278,498 | 1,358,428 | 1,356,727 |
| PsiTriRocks | 872,316 | 1,095,787 | 1,388,806 | 1,669,767 |
| MDBX | 1,055,486 | 1,166,392 | 1,306,705 | 1,100,886 |
| RocksDB | 275,903 | 324,247 | 347,816 | 296,904 |
PsiTriRocks reader throughput improves monotonically with batch size, and PsiTri's rises through batch=100K before levelling off at ~1.36M reads/sec: larger batches mean the writer holds the current root longer, giving readers more time on a stable snapshot. MDBX reader throughput peaks at batch=100K and drops at batch=1M, where longer write transactions increase reader-visible stall events. RocksDB readers are consistently the slowest across all batch sizes.
Per-Round Throughput Curves¶
The write+read phase runs 10M transfers in 10 rounds of 1M each. The round-by-round breakdown reveals stability (or instability) in each engine's throughput profile.
| Round | PsiTri 1M | MDBX 100K | RocksDB 10K | PsiTriRocks 100K |
|---|---|---|---|---|
| 1 | 548K | 453K | 165K | 254K |
| 2 | 523K | 453K | 157K | 249K |
| 3 | 510K | 452K | 156K | 243K |
| 4 | 504K | 452K | 142K | 235K |
| 5 | 505K | 452K | 139K | 230K |
| 6 | 508K | 453K | 134K | 229K |
| 7 | 505K | 452K | 133K | 230K |
| 8 | 505K | 452K | 134K | 230K |
| 9 | 508K | 450K | 137K | 230K |
| 10 | 509K | 444K | 138K | 230K |
- PsiTri (batch=1M): Starts hot (548K) as the working set is warm, then settles to a stable 504–509K from round 4 onward. No compaction stalls.
- MDBX (batch=100K): Essentially flat at 452–453K through round 8, with small dips in rounds 9 (450K) and 10 (444K). Highly predictable.
- RocksDB (batch=10K): Falls from 165K to a 133–134K trough in rounds 6–8, a classic compaction cliff as L0→L1 compaction competes with writes, then partially recovers to 138K.
- PsiTriRocks (batch=100K): Declines steadily from 254K to ~230K by round 5 (RocksDB compaction absorbing the write path through the shim), then holds at 229–230K through round 10.
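One way to put a number on the stability described above is the fractional drop from each engine's best round to its worst. This metric is not part of the benchmark output; `peak_to_trough` is an illustrative helper applied to the per-round table.

```python
rounds_k = {  # write+read tx/sec per round, in thousands
    "PsiTri 1M":        [548, 523, 510, 504, 505, 508, 505, 505, 508, 509],
    "MDBX 100K":        [453, 453, 452, 452, 452, 453, 452, 452, 450, 444],
    "RocksDB 10K":      [165, 157, 156, 142, 139, 134, 133, 134, 137, 138],
    "PsiTriRocks 100K": [254, 249, 243, 235, 230, 229, 230, 230, 230, 230],
}

def peak_to_trough(series):
    """Fractional drop from the best round to the worst round."""
    return (max(series) - min(series)) / max(series)

drops = {name: peak_to_trough(series) for name, series in rounds_k.items()}
for name, drop in drops.items():
    print(f"{name:18s} {drop:.1%}")
```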
MDBX File Bloat Under Large Batches¶
MDBX copies full pages on each write. Larger batches create more dirty pages per commit, and MDBX's free-page recycler can leave significant dead space in the file.
| Batch size | File size | Live data | Waste |
|---|---|---|---|
| 100 | 1,344 MB | 1,257 MB | 6% |
| 10K | 3,968 MB | 1,257 MB | 68% |
| 100K | 4,928 MB | 1,257 MB | 74% |
| 1M | 2,048 MB | 1,257 MB | 39% |
File bloat peaks at batch=100K (4,928 MB — nearly 4x live data) then shrinks at batch=1M as fewer, larger commits leave the free-page recycler more time per commit to consolidate space. PsiTri's file size is not meaningfully affected by batch size because its COW operates at 64-byte node granularity rather than 4 KB page granularity.
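The Waste column is the share of the file not occupied by live data. A brief sketch of that calculation from the table's own figures:

```python
LIVE_MB = 1257  # live data is the same at every batch size

file_mb_by_batch = {"100": 1344, "10K": 3968, "100K": 4928, "1M": 2048}

# Waste = fraction of the file that holds no live data.
waste = {batch: 1 - LIVE_MB / size for batch, size in file_mb_by_batch.items()}
for batch, fraction in waste.items():
    print(f"batch={batch:>4s}: {fraction:.0%} waste")
```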
Results: AMD EPYC-Turin — Five Engines × Four Batch Sizes (April 2026)¶
This run adds PsiTri's DWAL mode (buffered writes via ART + WAL) alongside direct mode and sweeps four batch sizes on the same Vultr cloud VM used in the prior AMD section above. Workload parameters differ from the prior run (100K accounts, 2M transactions, no sync) so absolute numbers are not directly comparable to the single-batch results above; use these charts to compare engines and batch-size scaling relative to each other.
Environment¶
| Component | Spec |
|---|---|
| CPU | AMD EPYC-Turin, 8 cores / 16 threads (SMT) |
| RAM | 121 GB |
| OS | Ubuntu Linux 6.17.0-14-generic x86_64 (Vultr cloud VM) |
| Compiler | Clang 20, C++20, -O3 -flto -march=native |
| Kernel page size | 4 KB |
Workload¶
| Parameter | Value |
|---|---|
| Accounts | 100,000 |
| Transactions | 2,000,000 per phase |
| Batch sizes swept | 1 · 1,000 · 100,000 · 1,000,000 |
| Sync mode | none |
| Sync every | never |
| RNG seed | 12345 |
Write-Only Throughput vs Batch Size¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#16A34A,#7C3AED,#D97706,#DC2626'}}}}%%
xychart-beta
title "Write-Only Throughput vs Batch Size (tx/sec)"
x-axis ["1", "1K", "100K", "1M"]
y-axis "Transfers per second" 0 --> 1100000
line [101190, 416986, 1012173, 1064827]
line [783944, 993090, 775373, 820138]
line [727044, 803153, 624526, 856796]
line [251842, 301194, 292238, 461033]
line [525868, 482735, 341122, 344629]
Series: PsiTri direct (blue) · PsiTri DWAL (green) · PsiTriRocks (purple) · RocksDB (amber) · TidesDB (red)
| Engine | batch=1 | batch=1K | batch=100K | batch=1M | Peak |
|---|---|---|---|---|---|
| PsiTri | 101,190 | 416,986 | 1,012,173 | 1,064,827 | 1.06M |
| PsiTri (DWAL) | 783,944 | 993,090 | 775,373 | 820,138 | 993K |
| PsiTriRocks | 727,044 | 803,153 | 624,526 | 856,796 | 857K |
| TidesDB | 525,868 | 482,735 | 341,122 | 344,629 | 526K |
| RocksDB | 251,842 | 301,194 | 292,238 | 461,033 | 461K |
PsiTri in direct mode peaks at 1.06M tx/sec at large batches, where the COW cost is fully amortized across the batch. PsiTri in DWAL mode dominates at small batches (batch=1: 7.7x faster than direct mode), because the ART buffer absorbs per-commit COW overhead. At batch=100K and above the ordering reverses: batching already amortizes COW cost, and direct mode finishes ahead (1,065K vs. 820K at batch=1M).
Write+Read Throughput vs Batch Size¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#16A34A,#7C3AED,#D97706,#DC2626'}}}}%%
xychart-beta
title "Write+Read Throughput vs Batch Size (tx/sec)"
x-axis ["1", "1K", "100K", "1M"]
y-axis "Transfers per second" 0 --> 1100000
line [114569, 343664, 1051946, 1074253]
line [733942, 863056, 698728, 826451]
line [560835, 712217, 562915, 902071]
line [280863, 231596, 266907, 467095]
line [491786, 481292, 333894, 219403]
Series: PsiTri direct (blue) · PsiTri DWAL (green) · PsiTriRocks (purple) · RocksDB (amber) · TidesDB (red)
| Engine | batch=1 | batch=1K | batch=100K | batch=1M | Write impact at 1M |
|---|---|---|---|---|---|
| PsiTri | 114,569 | 343,664 | 1,051,946 | 1,074,253 | +0.9% |
| PsiTri (DWAL) | 733,942 | 863,056 | 698,728 | 826,451 | +0.8% |
| PsiTriRocks | 560,835 | 712,217 | 562,915 | 902,071 | +5.3% |
| RocksDB | 280,863 | 231,596 | 266,907 | 467,095 | +1.3% |
| TidesDB† | 491,786 | 481,292 | 333,894 | 219,403 | −36.3% |
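At batch=1M, PsiTri write throughput is unaffected by a concurrent reader (+0.9% in direct mode, +0.8% in DWAL mode), consistent with its lock-free MVCC. At smaller batches the picture is mixed: direct mode ranges from +13% (batch=1) to -18% (batch=1K), and DWAL mode loses 6–13%. TidesDB shows severe writer degradation at batch=1M (−36%), and fails correctness at batch ≥ 1K (see note below).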
Concurrent Reader Throughput vs Batch Size¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#2563EB,#16A34A,#7C3AED,#D97706,#DC2626'}}}}%%
xychart-beta
title "Concurrent Reader Throughput vs Batch Size (reads/sec)"
x-axis ["1", "1K", "100K", "1M"]
y-axis "Reads per second" 0 --> 5000000
line [2618658, 1611067, 4691063, 4850451]
line [4100122, 3935602, 4218272, 4840387]
line [4334633, 4127036, 4316397, 4878738]
line [928308, 778775, 639254, 409677]
line [1561115, 1633270, 1379504, 937190]
Series: PsiTri direct (blue) · PsiTri DWAL (green) · PsiTriRocks (purple) · RocksDB (amber) · TidesDB (red)
| Engine | batch=1 | batch=1K | batch=100K | batch=1M |
|---|---|---|---|---|
| PsiTriRocks | 4,334,633 | 4,127,036 | 4,316,397 | 4,878,738 |
| PsiTri | 2,618,658 | 1,611,067 | 4,691,063 | 4,850,451 |
| PsiTri (DWAL) | 4,100,122 | 3,935,602 | 4,218,272 | 4,840,387 |
| TidesDB | 1,561,115 | 1,633,270 | 1,379,504 | 937,190 |
| RocksDB | 928,308 | 778,775 | 639,254 | 409,677 |
All three PsiTri-family configurations deliver 4.2–4.9M reads/sec at large batches. PsiTri direct-mode reader throughput rises steeply overall (2.6M at batch=1 to 4.9M at batch=1M), with a dip to 1.6M at batch=1K; the writer holds the current root longer at larger batches, giving readers more time on a stable, hot snapshot. PsiTri DWAL reader throughput is broadly flat (3.9–4.2M) through batch=100K, rising to 4.8M at batch=1M. RocksDB reader throughput falls with batch size, likely as write amplification from larger memtables crowds out read I/O.
Correctness¶
| Engine | batch=1 | batch=1K | batch=100K | batch=1M |
|---|---|---|---|---|
| PsiTri | PASS | PASS | PASS | PASS |
| PsiTri (DWAL) | PASS | PASS | PASS | PASS |
| PsiTriRocks | PASS | PASS | PASS | PASS |
| RocksDB | PASS | PASS | PASS | PASS |
| TidesDB | PASS | FAIL | FAIL | FAIL |
†TidesDB balance conservation fails at batch ≥ 1K under concurrent read load (observed balance exceeds expected by 0.008–32% depending on batch size), indicating a transaction isolation bug when large write batches overlap with concurrent readers. Only the batch=1 result is valid for TidesDB.
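The balance-conservation check behind this table can be sketched as a comparison of the observed total against the total implied by the initial balances. `conservation_error` is a hypothetical helper, and the "lost debit" scenario is only one illustrative way such an excess could arise, not a diagnosis of the TidesDB failure.

```python
def conservation_error(balances, n_accounts: int, initial_balance: int) -> float:
    """Relative excess of the observed total balance over the expected total."""
    expected = n_accounts * initial_balance
    observed = sum(balances)
    return (observed - expected) / expected

INITIAL = 1_000_000

# A correct transfer of 250 moves money without creating any.
good = [INITIAL - 250, INITIAL + 250, INITIAL, INITIAL]
# Illustrative fault: the credit was applied but the debit was lost.
bad = [INITIAL, INITIAL + 250, INITIAL, INITIAL]

print(conservation_error(good, 4, INITIAL))
print(conservation_error(bad, 4, INITIAL))
```

Any non-zero result means money was created or destroyed, so the run is reported as FAIL.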
Cross-Platform Comparison¶
The most striking cross-platform shift is MDBX, not PsiTri.
Write Throughput: ARM M5 Max vs x86 EPYC-Turin¶
| Engine | M5 Max (ARM64) | EPYC-Turin (x86) | Change |
|---|---|---|---|
| PsiTri | 376,691 | 163,614 | -57% |
| PsiTriRocks | 356,703 | 159,938 | -55% |
| TidesDB | 225,937 | 167,093 | -26% |
| RocksDB | 126,341 | 104,102 | -18% |
| MDBX | 57,138 | 239,041 | +318% |
Every engine except MDBX is slower on the x86 VM than on M5 Max. This is a cloud VM versus a high-end workstation, so absolute numbers aren't directly comparable. What matters is the relative ordering and the magnitude of MDBX's swing.
Why does MDBX improve so much on x86?
MDBX (like LMDB) uses page-level copy-on-write: every write copies the entire page containing the modified key. On M5 Max the OS page size is 16 KB; on x86 Linux it is 4 KB. Each write copies 4x less data on x86, which maps almost exactly onto the 4.2x throughput increase. This is an OS page size effect, not an architectural one.
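The page-size argument can be checked against the measured rates. The sketch uses only figures quoted in this document and assumes, as the paragraph does, that each write copies exactly one OS page.

```python
PAGE_BYTES = {"M5 Max": 16 * 1024, "EPYC-Turin": 4 * 1024}
MDBX_TX_PER_SEC = {"M5 Max": 57_138, "EPYC-Turin": 239_041}

# Bytes copied per page-level COW write, M5 Max relative to x86.
copy_ratio = PAGE_BYTES["M5 Max"] / PAGE_BYTES["EPYC-Turin"]
# Measured MDBX write-only throughput, x86 relative to M5 Max.
speedup = MDBX_TX_PER_SEC["EPYC-Turin"] / MDBX_TX_PER_SEC["M5 Max"]

print(f"bytes copied per page write: {copy_ratio:.1f}x less on x86")
print(f"observed MDBX throughput:    {speedup:.1f}x higher on x86")
```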
PsiTri's COW operates at 64-byte node granularity and is indifferent to OS page size, so it does not benefit from the same effect. Its absolute throughput drops on the cloud VM due to higher memory access latency compared to M5 Max's Unified Memory Architecture. It stays well ahead of RocksDB on both platforms (3.0x on M5 Max, 1.6x on x86), while TidesDB draws level with it on x86 (1.02x).
Consistent patterns across both platforms:
- PsiTri leads or ties for bulk load on both platforms
- PsiTri and PsiTriRocks are among the most compact in reachable data on both platforms (first on x86, second to TidesDB on M5 Max)
- MDBX leads sequential scan on both platforms
- PsiTri and MDBX lead concurrent reader throughput on both platforms
- RocksDB trails PsiTri, PsiTriRocks, and TidesDB on transactions on both platforms
Reproducing¶
# Build all engines (from repo root)
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_ROCKSDB_BENCH=ON \
    -DBUILD_TIDESDB_BENCH=ON \
    -B build/release
cmake --build build/release -j8 --target \
    bank-bench-psitri \
    bank-bench-psitrirocks \
    bank-bench-rocksdb \
    bank-bench-mdbx \
    bank-bench-tidesdb
# Run each engine with identical parameters
for engine in psitri psitrirocks rocksdb mdbx tidesdb; do
    build/release/bin/bank-bench-${engine} \
        --num-accounts=1000000 \
        --num-transactions=10000000 \
        --batch-size=100 \
        --sync-every=100 \
        --db-path=/tmp/bb_${engine}
done
CLI Options¶
| Flag | Default | Description |
|---|---|---|
| --num-accounts | 1,000,000 | Number of bank accounts |
| --num-transactions | 10,000,000 | Number of transfer attempts |
| --batch-size | 1 | Transfers per commit |
| --sync-every | 0 | Sync to disk every N commits (0 = never) |
| --sync-mode | none | Durability: none, async, sync |
| --seed | 12345 | RNG seed for reproducibility |
| --db-path | /tmp/bank_bench_db | Database directory |
| --initial-balance | 1,000,000 | Starting balance per account |
| --reads-per-tx | 100 | Point reads per reader thread batch |
Environment¶
Apple M5 Max (ARM64)¶
- Hardware: Apple M5 Max, 128 GB Unified Memory
- OS: macOS (Darwin 25.3.0)
- Compiler: Clang 17 (LLVM), C++20, -O3 -flto=thin
- Engine versions: RocksDB 9.9.3, libmdbx 0.13.11, TidesDB 8.9.4
AMD EPYC-Turin (x86-64)¶
- Hardware: AMD EPYC-Turin, 8 cores / 16 threads, 2.4 GHz, 121 GB RAM (cloud VM — Vultr)
- ISA: AVX-512F/BW/DQ/VL/VBMI/VBMI2/VNNI, AVX2, SSSE3, SSE4.1
- OS: Ubuntu Linux 6.17.0-14-generic x86_64, 4 KB page size
- Compiler: Clang 20 (LLVM), C++20, -O3 -flto -march=native
- Engine versions: RocksDB (built from source), TidesDB (built from source)
- April 2026 batch-scaling run: five configurations (PsiTri direct, PsiTri DWAL, PsiTriRocks, RocksDB, TidesDB); 100K accounts, 2M tx, batch sizes 1/1K/100K/1M, sync=none
- April 2026 full comparison: 6 engines including SQLite; 1M accounts, 1M tx, batch=1, sync=none
Full Engine Comparison (April 2026, AMD EPYC-Turin)¶
1M accounts, 1M transactions, batch=1 (per-transfer commit), sync=none. Each transfer = 5 KV operations (2 reads + 2 updates + 1 insert).
Write-Only Throughput¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#7b1fa2'}}}}%%
xychart-beta
title "Bank Benchmark — Write-Only (KV ops/sec, 1M accounts)"
x-axis ["DWAL", "PsiTriRocks", "RocksDB", "PsiTri-SQLite", "System SQLite"]
y-axis "KV operations per second" 0 --> 2500000
bar [2138670, 1847230, 787940, 264175, 248690]
| Engine | API | Transfers/sec | KV Ops/sec | vs RocksDB |
|---|---|---|---|---|
| PsiTri (DWAL) | Native | 427,734 | 2,138,670 | 2.7x |
| PsiTriRocks | RocksDB | 369,446 | 1,847,230 | 2.3x |
| RocksDB | RocksDB | 157,588 | 787,940 | 1.0x |
| PsiTri-SQLite | SQLite | 52,835 | 264,175 | — |
| System SQLite | SQLite | 49,738 | 248,690 | — |
Concurrent Read + Write¶
%%{init: {'theme': 'base', 'themeVariables': {'xyChart': {'plotColorPalette': '#7b1fa2'}}}}%%
xychart-beta
title "Bank Benchmark — Write + Concurrent Reader (KV ops/sec)"
x-axis ["DWAL", "PsiTriRocks", "RocksDB", "PsiTri-SQLite", "System SQLite"]
y-axis "KV operations per second" 0 --> 2000000
bar [1746370, 1748115, 573760, 262740, 77485]
| Engine | API | Write+Read tx/sec | Write Impact | Reader reads/sec |
|---|---|---|---|---|
| PsiTri (DWAL) | Native | 349,274 | -18% | 1,447,979 |
| PsiTriRocks | RocksDB | 349,623 | -5% | 1,416,652 |
| RocksDB | RocksDB | 114,752 | -27% | 289,627 |
| PsiTri-SQLite | SQLite | 52,548 | -0.5% | 215,104 |
| System SQLite | SQLite | 15,497 | -69% | 15,917 |
Drop-In Replacement Speedups¶
Same API, same workload — only the storage engine changes:
| Comparison | Write Speedup | Read Speedup | Write Impact |
|---|---|---|---|
| PsiTriRocks vs RocksDB | 2.3x | 4.9x | -5% vs -27% |
| PsiTri-SQLite vs System SQLite | 1.06x | 13.5x | -0.5% vs -69% |
PsiTri-SQLite's -0.5% write impact under concurrent reads means readers are effectively invisible to the writer — compared to System SQLite's -69% degradation where the reader blocks the writer during WAL checkpointing.