Agentic CPU Bottleneck Bench

A vendor-neutral benchmark harness that makes the “CPU bottleneck in agentic AI” measurable.

It simulates multi-agent loops with configurable mixes of:

planning / routing overhead (CPU-bound string + JSON work)
tool-call orchestration (RPC-like waits + serialization)
memory lookups (KV-style access patterns)
I/O-like behavior (disk + sqlite mocks)

It produces a compact table with p50/p95 step latency, a CPU utilization proxy, and an orchestration-vs-work breakdown.

This is not a model benchmark. It is a control-plane / agent-runtime benchmark.
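To make the idea concrete, here is a minimal sketch of what one simulated agent step might look like. It is not the code in agentic_bench/. The names simulated_step and run_agents, the payload shape, and the 5 ms tool wait are all assumptions chosen for illustration.

```python
import asyncio
import json
import time


async def simulated_step(kv: dict, tool_wait_s: float = 0.005) -> dict:
    """One hypothetical agent step: plan (CPU), call a tool (wait), read memory (KV)."""
    cpu_start = time.process_time()

    # Planning / routing: CPU-bound string + JSON work.
    plan = {"route": "tool_a", "args": ["x" * 64 for _ in range(50)]}
    payload = json.dumps(plan)
    parsed = json.loads(payload)

    cpu_s = time.process_time() - cpu_start

    # Tool call: an RPC-like wait, modeled here as a plain sleep (assumption).
    await asyncio.sleep(tool_wait_s)

    # Memory lookup: KV-style access against an in-process dict.
    hit = kv.get(parsed["route"])

    return {
        "cpu_ms": cpu_s * 1000,
        "io_wait_ms": tool_wait_s * 1000,
        "json_kb": len(payload) / 1024,
        "memory_hit": hit is not None,
    }


async def run_agents(n_agents: int, n_steps: int) -> list:
    """Run n_agents concurrent loops of n_steps each and flatten the per-step rows."""
    kv = {"tool_a": "cached result"}

    async def agent() -> list:
        return [await simulated_step(kv) for _ in range(n_steps)]

    per_agent = await asyncio.gather(*(agent() for _ in range(n_agents)))
    return [row for rows in per_agent for row in rows]


if __name__ == "__main__":
    rows = asyncio.run(run_agents(n_agents=4, n_steps=10))
    print(len(rows), "steps simulated")
```

No model is called anywhere in the loop: every cost in the step comes from the runtime around the model, which is the part this benchmark targets.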

Quickstart (Windows PowerShell)

python -m venv .venv
.\.venv\Scripts\python.exe -m pip install -r requirements.txt
.\.venv\Scripts\python.exe .\bench.py

Run a single scenario:

.\.venv\Scripts\python.exe .\run_bench.py --workload workloads\mixed.json --agents 32 --steps 200

Sample outputs

See docs/Output.md for real benchmark output tables and fan-out runs.

What you get

The benchmark prints a table with these columns:

p50_ms, p95_ms — step latency distribution
cpu_util — process CPU time / wall time (rough proxy for CPU saturation)
cpu_ms — CPU time spent per step
io_wait_ms — simulated tool/I/O wait per step
json_kb, parse_ms — serialization + parsing costs
optional: context switches if psutil is available
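As a rough illustration of how columns like these can be derived, the sketch below times a step function with the standard library only. The helper names percentile and measure, the nearest-rank percentile method, and the example step are assumptions, so the harness itself may compute these values differently.

```python
import json
import math
import time


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile; pct is in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def measure(step, n_steps: int) -> dict:
    """Run step() n_steps times and summarize latency and CPU usage."""
    latencies_ms = []
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process

    for _ in range(n_steps):
        t0 = time.perf_counter()
        step()
        latencies_ms.append((time.perf_counter() - t0) * 1000)

    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start

    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        # Process CPU time / wall time: near 1.0 means one core is busy.
        "cpu_util": cpu_s / wall_s if wall_s > 0 else 0.0,
        "cpu_ms": cpu_s * 1000 / n_steps,
    }


def example_step() -> None:
    """Hypothetical step: a little JSON work followed by a short wait."""
    json.loads(json.dumps({"args": list(range(200))}))
    time.sleep(0.001)


if __name__ == "__main__":
    print(measure(example_step, n_steps=50))
```

Because the example step spends most of its time sleeping, cpu_util comes out well below 1.0 even though each step still has measurable latency. On platforms with a coarse process-time clock, short runs can report a cpu_ms of zero, so longer runs give steadier numbers.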

Workloads

workloads/reasoning_heavy.json — few tool calls, more CPU planning + parsing
workloads/action_heavy.json — frequent tool calls and higher orchestration overhead
workloads/mixed.json — typical “agent” blend

You can add your own JSON workload; see workloads/schema.json.

Interpreting results

Common patterns:

High p95 with low cpu_util → coordination / waits dominate (tool calls, I/O, blocking)
High cpu_util with rising p50/p95 → CPU saturated (planning/parsing dominates)
Large parse_ms / json_kb → serialization overhead is a real tax; consider batching / binary codecs
Tail latency grows as agent count increases → lock contention + event loop overhead (agent fan-out)

See: docs/INTERPRETING_OUTPUT.md
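The first three patterns can be turned into a small triage helper. This is only a sketch: the function name diagnose and every threshold below are assumptions picked for illustration, not values from the harness or its docs. "High p95" is approximated here as a large p95/p50 ratio. The fan-out pattern is left out because it requires comparing runs at different agent counts.

```python
from typing import List

# Hypothetical thresholds; tune them for your own hardware and workloads.
LOW_CPU = 0.30      # cpu_util at or below this counts as "low"
HIGH_CPU = 0.80     # cpu_util at or above this counts as "saturated"
TAIL_RATIO = 3.0    # p95 / p50 at or above this counts as a heavy tail
PARSE_SHARE = 0.25  # parse_ms at or above this share of p50 counts as a tax


def diagnose(row: dict) -> List[str]:
    """Map one result row to the interpretation patterns it matches."""
    notes = []
    p50 = row["p50_ms"]
    tail_ratio = row["p95_ms"] / p50 if p50 > 0 else 0.0

    if tail_ratio >= TAIL_RATIO and row["cpu_util"] <= LOW_CPU:
        notes.append("waits dominate: check tool calls, I/O, blocking")

    if row["cpu_util"] >= HIGH_CPU:
        notes.append("CPU saturated: planning/parsing dominates")

    parse_ms = row.get("parse_ms", 0.0)
    if parse_ms > 0 and parse_ms >= PARSE_SHARE * p50:
        notes.append("serialization tax: consider batching or binary codecs")

    return notes


if __name__ == "__main__":
    sample = {"p50_ms": 10.0, "p95_ms": 50.0, "cpu_util": 0.1, "parse_ms": 0.5}
    for note in diagnose(sample):
        print(note)
```

A row can match more than one pattern, so the helper returns a list instead of a single label.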

Repo layout

bench.py — one-command benchmark table
run_bench.py — run a single workload with CLI flags
agentic_bench/ — simulator + metrics
workloads/ — workload profiles (JSON)
tools/ — optional helpers
.github/workflows/test.yml — CI sanity run

License

MIT — see LICENSE.
