Metrics: peak, allocated, allocations
One memray pass yields more than a single number. Every measured benchmark stores
three memory metrics, and compare, plot, and the readers accept any of them. peak
is the default, but allocated and allocations often catch what peak hides.
| Metric | Blob key | What it is | Reach for it when |
|---|---|---|---|
| peak | peak_bytes | high-water mark of live bytes: the most allocated at once | headline footprint; "how big did it get" |
| allocated | total_bytes | sum of every allocation over the run | churn / temporary spikes that peak smooths over |
| allocations | allocations | count of allocation calls | a near-deterministic, low-noise tripwire |
All three measure allocator demand as memray records it: what your code requested, in-process and byte-exact. They therefore include native (numpy / C-extension) allocations, not just Python objects.
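To make the three definitions concrete, here is a minimal sketch that derives them from a toy allocation trace. The event format and the summarize helper are hypothetical illustrations, not memray's or pytest-benchmem's API; the point is only how one stream of events produces three different readings.

```python
from typing import Iterable, Tuple

# Hypothetical trace format: ("alloc", n_bytes) or ("free", n_bytes).
Event = Tuple[str, int]


def summarize(events: Iterable[Event]) -> dict:
    """Derive peak, allocated, and allocations from a toy event trace."""
    live = 0          # bytes currently held
    peak = 0          # high-water mark of live bytes
    allocated = 0     # sum of every allocation, ignoring frees
    allocations = 0   # number of allocation calls
    for kind, size in events:
        if kind == "alloc":
            live += size
            allocated += size
            allocations += 1
            peak = max(peak, live)
        elif kind == "free":
            live -= size
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return {"peak": peak, "allocated": allocated, "allocations": allocations}


# A churny workload: allocate 1000 bytes and free it again, 200 times over.
churn = [("alloc", 1000), ("free", 1000)] * 200
print(summarize(churn))
# {'peak': 1000, 'allocated': 200000, 'allocations': 200}
```

Only one block is ever live, so peak stays at 1000 bytes while allocated grows with every iteration.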
Distribution across repeats
With --benchmark-memory-repeats=N (suite-wide) or @pytest.mark.benchmem(repeats=N)
(per test), every repeat is kept as a flat series in the blob. The headline peak is
the minimum (the cleanest floor); ask for any other stat over the series with --stat:
benchmem compare base.json head.json --metric peak --stat stddev # how noisy is peak?
benchmem compare v1.json v2.json --metric allocated --stat mean
--stat takes min / max / mean / median / stddev and applies to any metric.
Peak is the noisy one (GC timing, page cache); stddev tells you how much.
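The reductions that --stat names can be reproduced by hand over a flat series. The sketch below uses a made-up peak_bytes series and the standard statistics module. Whether the plugin computes stddev as the sample or the population form is not stated here, so the sample form is an assumption, as is reporting 0.0 for a single pass.

```python
import statistics

# Hypothetical peak_bytes series from five repeats of one benchmark.
peak_series = [1_221_536, 1_221_536, 1_230_000, 1_225_000, 1_221_536]


def _stddev(series):
    # Assumption: sample standard deviation; a single pass has no spread,
    # which this sketch reports as 0.0.
    return statistics.stdev(series) if len(series) > 1 else 0.0


STATS = {
    "min": min,
    "max": max,
    "mean": statistics.mean,
    "median": statistics.median,
    "stddev": _stddev,
}


def reduce_series(series, stat="min"):
    """Collapse a per-repeat series to one number; min mirrors the headline."""
    return STATS[stat](series)


for name in STATS:
    print(f"{name:<7} {reduce_series(peak_series, name):,.0f}")
```

A stddev that is small relative to the min suggests the series is stable enough to gate on; a large one suggests adding repeats or switching metric.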
The terminal table shows the spread too. With repeats > 1, every shown metric always
expands into min / mean / max columns (peak·min, peak·mean, peak·max), so the columns
don't shift between runs; a single pass stays one column. By default the table shows
peak only; add the rest with --benchmark-memory-columns=peak,allocated,allocs
and pick the spread stats with --benchmark-memory-stats=min,stddev.
Setup
import os
import sys
import tempfile
from pathlib import Path
os.environ["FORCE_COLOR"] = "1"
os.environ["PATH"] = f"{Path(sys.executable).parent}{os.pathsep}{os.environ['PATH']}"
_tmp = Path(tempfile.mkdtemp(prefix="pytest-benchmem-"))
Three readings of one run
Consider a workload that allocates a lot of temporary memory but holds little at any
one time, which is where peak and allocated diverge most:
suite = _tmp / "test_churn.py"
suite.write_text("""
def test_churn(benchmark_memory):
    def work():
        total = 0
        for _ in range(200):
            total += sum([i * i for i in range(20_000)])
        return total
    benchmark_memory(work)
""")
run = _tmp / "churn.json"
!pytest {suite} --benchmark-only --benchmark-json={run} --benchmark-columns=min,median -q -p no:cacheprovider
. [100%]
Wrote benchmark data in: <_io.BufferedWriter name='/tmp/pytest-benchmem-gzgmv4qt/churn.json'>
benchmark: 1 tests
Name (time in ms) Min Median │ peak (MiB)
──────────────────────────────────────────────────────────
test_churn 203.2353 204.2025 │ 1.16
memory (right of │): a separate, untimed pass, not the timed
rounds • also available via --benchmark-memory-columns:
allocated, allocs
1 passed in 3.87s
The same run read three ways — peak stays small (one list lives at a time) while
allocated is far larger (every list summed) and allocations counts the calls:
from pytest_benchmem import human_bytes, load_long_df
for metric in ("peak", "allocated", "allocations"):
    df, unit = load_long_df([run], metric=metric)
    v = df["value"].iloc[0]
    shown = human_bytes(v) if unit == "B" else f"{v:.0f}"
    print(f"{metric:<12} {shown}")
peak         1.16 MiB
allocated    294 MiB
allocations  9001
The raw blob, for reference:
import json
json.loads(run.read_text())["benchmarks"][0]["extra_info"]["benchmem"]
{'peak_bytes': [1221536], 'allocations': [9001], 'total_bytes': [308357376]}
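The blob is plain JSON, so it can be read without the plugin's own readers. As a rough sketch, the snippet below computes a churn ratio (allocated divided by peak) for each benchmark. The churn_ratio helper is hypothetical, the stand-in file copies only the values shown above, and reducing each series with min is an assumption borrowed from the headline peak.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the JSON written by the run above, trimmed to the path this
# sketch reads. The numbers are copied from the blob shown above.
doc = {
    "benchmarks": [
        {
            "name": "test_churn",
            "extra_info": {
                "benchmem": {
                    "peak_bytes": [1221536],
                    "allocations": [9001],
                    "total_bytes": [308357376],
                }
            },
        }
    ]
}
path = Path(tempfile.mkdtemp()) / "churn.json"
path.write_text(json.dumps(doc))


def churn_ratio(json_path):
    """Return {benchmark name: min(total_bytes) / min(peak_bytes)}."""
    data = json.loads(Path(json_path).read_text())
    ratios = {}
    for bench in data["benchmarks"]:
        blob = bench["extra_info"]["benchmem"]
        # Assumption: min is the reduction used for both series.
        ratios[bench["name"]] = min(blob["total_bytes"]) / min(blob["peak_bytes"])
    return ratios


for name, ratio in churn_ratio(path).items():
    print(f"{name}: allocated is {ratio:.0f}x peak")
```

For this run the ratio is roughly 252, which quantifies how much churn a peak-only reading leaves out.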
Picking one for a gate
For CI gating, allocations is often the best tripwire — it's near-deterministic, so
a change there is almost always a real change in behaviour, not measurement noise.
peak answers the capacity question; allocated catches churn regressions a peak
gate would miss. You can gate on several at once; see Compare & plot and the reference.
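To illustrate how gating on several metrics might look, here is a minimal sketch with a tight tolerance on allocations and looser ones on allocated and peak. The gate function, the tolerance values, and the head numbers are all hypothetical. This is not how benchmem compare is implemented, only the shape of the decision.

```python
# Hypothetical per-metric tolerances: allowed relative growth before failing.
# allocations is near-deterministic, so it gets the tightest budget.
TOLERANCE = {"allocations": 0.01, "allocated": 0.05, "peak": 0.10}


def gate(base, head, tolerance=TOLERANCE):
    """Return one message per metric that grew past its tolerance.

    Assumes every baseline value is non-zero.
    """
    failures = []
    for metric, allowed in tolerance.items():
        limit = base[metric] * (1 + allowed)
        if head[metric] > limit:
            growth = head[metric] / base[metric] - 1
            failures.append(f"{metric}: +{growth:.1%} (allowed +{allowed:.0%})")
    return failures


# Baseline values from the run above; head values are invented for the example.
base = {"peak": 1_221_536, "allocated": 308_357_376, "allocations": 9001}
head = {"peak": 1_250_000, "allocated": 340_000_000, "allocations": 9201}

for line in gate(base, head):
    print(line)
```

Here peak grows by about 2% and passes its 10% budget, while allocations and allocated both trip: a regression that a peak-only gate would let through.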