Measured, not estimated.
Every token-cost claim on visitportal.dev is produced by Anthropic's count_tokens API. If the measurement disagrees with the pitch, we update the pitch. Never the other way.
Canonical run — tokens-matrix-v1
48 cells · 48 ok · seed 42 · mode count_tokens_only
Started 2026-04-19T08:05:10.635Z, finished 2026-04-19T08:05:29.466Z. Full raw JSON: packages/bench/results/tokens-matrix-v1.json.
Summary
Median input tokens per turn, by tool count, across the matrix:
| Tool count | MCP (median input tokens) | Portal | MCP : Portal |
|---|---|---|---|
| 10 | 1,956 | 172 | 11.4× |
| 50 | 7,343 | 172 | 42.7× |
| 100 | 13,929 | 172 | 81.0× |
| 400 | 54,677 | 172 | 317.9× |
▸ MCP scales linearly at ~137 tokens per preloaded tool in this simulation. Portal stays flat — the manifest is loaded on visit, not preloaded into every turn. Tokenizer parity across Sonnet 4.5 and Opus 4.5 confirmed (byte-identical counts for the same prompt + tool list).
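The ratios and the per-tool slope can be rechecked from the table alone. The sketch below is not part of the bench harness: it hard-codes the four medians from the table, and the names `rows`, `ratio`, and `fitLine` are illustrative. A least-squares fit over four points is only a rough sanity check, so it should land near the ~137 figure rather than exactly on it.

```typescript
// Medians copied from the summary table (input tokens per turn).
const PORTAL_TOKENS = 172;
const rows: Array<{ tools: number; mcp: number }> = [
  { tools: 10, mcp: 1956 },
  { tools: 50, mcp: 7343 },
  { tools: 100, mcp: 13929 },
  { tools: 400, mcp: 54677 },
];

// MCP : Portal ratio, formatted to one decimal place as in the table.
function ratio(mcp: number, portal: number): string {
  return (mcp / portal).toFixed(1);
}

// Ordinary least-squares line: mcp ≈ slope * tools + intercept.
function fitLine(
  points: Array<{ tools: number; mcp: number }>,
): { slope: number; intercept: number } {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.tools, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.mcp, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (const p of points) {
    sxy += (p.tools - meanX) * (p.mcp - meanY);
    sxx += (p.tools - meanX) ** 2;
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

for (const r of rows) {
  console.log(`${r.tools} tools: ${ratio(r.mcp, PORTAL_TOKENS)}x`);
}
const { slope, intercept } = fitLine(rows);
console.log(`slope ≈ ${slope.toFixed(1)} tokens/tool, intercept ≈ ${intercept.toFixed(0)} tokens`);
```

On these four points the fitted slope comes out at roughly 135 tokens per tool with a fixed overhead of a few hundred tokens, while the plain average at 400 tools (54,677 / 400) is about 137. Both are consistent with the linear-scaling claim.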
Reproduce it
export ANTHROPIC_API_KEY=sk-ant-...
pnpm install
BENCH_MODE=count_tokens_only pnpm --filter @visitportal/bench bench
# 48 cells against Anthropic's count_tokens API in ~20s, ~$0.10 total
The bench harness is in packages/bench/. Scenarios live in packages/bench/src/harness/bench.ts; the MCP tool-schema simulator is packages/bench/src/mcp-simulator.ts; the tasks we measure against are in packages/bench/src/tasks/definitions.ts.
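For readers who want to see what a single measurement might look like outside the harness, here is a minimal sketch of one call to the count_tokens endpoint. It is not the code in packages/bench/; the helper names (`buildCountTokensBody`, `countInputTokens`) and the `ToolDef` type are hypothetical, and the model ID is left as a parameter because the exact string depends on your account.

```typescript
// Minimal shape of a tool definition as sent to the Messages API.
interface ToolDef {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

interface CountTokensBody {
  model: string;
  messages: Array<{ role: "user" | "assistant"; content: string }>;
  tools: ToolDef[];
}

// Build the JSON body for POST /v1/messages/count_tokens.
function buildCountTokensBody(
  model: string,
  prompt: string,
  tools: ToolDef[],
): CountTokensBody {
  return { model, messages: [{ role: "user", content: prompt }], tools };
}

// Send one measurement. Needs network access and a valid API key.
async function countInputTokens(
  apiKey: string,
  body: CountTokensBody,
): Promise<number> {
  const res = await fetch("https://api.anthropic.com/v1/messages/count_tokens", {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`count_tokens failed with HTTP ${res.status}`);
  }
  const json = (await res.json()) as { input_tokens: number };
  return json.input_tokens;
}
```

Calling `countInputTokens` once with the full preloaded tool list and once with only the Portal prompt would give the two numbers compared in each row of the summary table.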
Methodology — what we can and can't claim
The simulator generates plausible MCP tool schemas across seven domains (filesystem, github, search, database, http, communication, knowledge), derived from seed tools scraped from the modelcontextprotocol/servers repo. Mean description length ~112 chars; every tool has 1–6 params.
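The seeded generation described above can be sketched as follows. This is an illustration only, not the contents of mcp-simulator.ts: the real simulator derives its tools from scraped seed tools, whereas this sketch uses placeholder names and descriptions. `mulberry32` is a well-known small PRNG chosen here for convenience, and `generateTools` and `SimTool` are hypothetical names.

```typescript
// The seven domains named in the methodology.
const DOMAINS = [
  "filesystem", "github", "search", "database",
  "http", "communication", "knowledge",
] as const;

interface SimTool {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string }>;
    required: string[];
  };
}

// Small deterministic PRNG: the same seed always yields the same sequence in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Generate `count` placeholder tools, each with 1 to 6 parameters.
function generateTools(count: number, seed: number): SimTool[] {
  const rand = mulberry32(seed);
  const tools: SimTool[] = [];
  for (let i = 0; i < count; i++) {
    const domain = DOMAINS[Math.floor(rand() * DOMAINS.length)];
    const paramCount = 1 + Math.floor(rand() * 6); // 1..6 inclusive
    const properties: Record<string, { type: string }> = {};
    const required: string[] = [];
    for (let p = 0; p < paramCount; p++) {
      const key = `param_${p}`;
      properties[key] = { type: rand() < 0.5 ? "string" : "number" };
      required.push(key);
    }
    tools.push({
      name: `${domain}_tool_${i}`,
      description: `Placeholder ${domain} operation number ${i}.`,
      input_schema: { type: "object", properties, required },
    });
  }
  return tools;
}
```

Because every random choice flows from the seed, serializing `generateTools(50, 42)` twice gives byte-identical JSON, which is the property that makes repeated token counts comparable.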
Can claim: for a plausibly-shaped multi-server MCP deployment of N tools, preloaded schema consumes X tokens per turn on Sonnet 4.5 / Opus 4.5, measured by count_tokens. Determinism: same seed → byte-identical tools → byte-identical token counts.
Cannot claim: that every real-world deployment has exactly this shape. Real MCP servers sometimes emit deeply nested JSON Schema ($ref, oneOf, allOf), which the simulator skips, so our MCP number is a conservative lower bound. Full disclosure in packages/bench/METHODOLOGY.md.