How I correlate two async data streams to compute per-tool-call token costs in Claude Code (FOSS)
Posted by shade175@reddit | Python | 2 comments
Built an open-source CLI tool that solves a specific problem: Claude Code doesn't tell you
which individual tool call consumed your tokens. CAT fills that gap.
Tech stack:
- FastAPI + Uvicorn async collector (receives hook events from Claude Code)
- SQLite + aiosqlite with schema migrations (WAL mode for concurrent reads)
- Delta engine that correlates hook events with statusline snapshots via session ID + timestamps
- Welford's online algorithm for O(1) rolling baseline statistics per task type
- Z-score anomaly detection over a 20-sample window
- Optional Haiku LLM classifier for root-cause analysis
- Rich TUI dashboard + Typer CLI
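The Welford + z-score combination above is a standard pattern, so here's a minimal self-contained sketch of how it could look (class and method names are my own, not CAT's actual internals):

```python
import math

class WelfordBaseline:
    """Online mean/variance in O(1) per update (Welford's algorithm)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; 0.0 until we have two samples.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def z_score(self, x: float) -> float:
        # How many standard deviations x sits from the baseline mean.
        return (x - self.mean) / self.std if self.std > 0 else 0.0
```

In the real tool the baseline would be kept per task type and the z-score checked against a window of recent samples; here the point is just that each update is O(1), no history buffer required for the mean/variance.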
Why not just parse the Claude Code logs?
Claude Code hooks don't include token counts — only the statusline hook provides them.
The delta engine matches the two data streams by timestamp to compute per-call costs.
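To make the matching concrete, here is a rough sketch of that correlation step, assuming both streams are sorted by timestamp and keyed by session ID (function and field names are illustrative, not CAT's actual code):

```python
from bisect import bisect_right

def per_call_token_deltas(tool_events, snapshots):
    """Attribute token costs to tool calls by bracketing each event
    between the statusline snapshots before and after it.

    tool_events: [(session_id, ts, tool_name)], sorted by ts
    snapshots:   [(session_id, ts, total_tokens)], sorted by ts
    Returns [(tool_name, token_delta)].
    """
    by_session = {}
    for sid, ts, total in snapshots:
        by_session.setdefault(sid, []).append((ts, total))

    results = []
    for sid, ts, tool in tool_events:
        snaps = by_session.get(sid, [])
        times = [t for t, _ in snaps]
        i = bisect_right(times, ts)  # first snapshot strictly after the event
        if i == 0 or i >= len(snaps):
            continue                 # no bracketing pair: cost is unattributable
        before, after = snaps[i - 1][1], snaps[i][1]
        results.append((tool, after - before))
    return results
```

The key limitation, as the comment below notes, is that two tool calls landing between the same pair of snapshots get their costs merged.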
Currently at v0.3.1, CI passing on Ubuntu/macOS/Windows across Python 3.11–3.13.
Active development, good-first-issues available.
GitHub: https://github.com/roeimichael/ContextAnalyzerTerminal
Python-ModTeam@reddit
Hello there,
We've removed your post since it aligns with our monthly Project Showcase thread. Please share your new project or tool there.
Best regards,
r/Python mod team
ultrathink-art@reddit
Concurrent tool calls within the same turn can create timing noise in the statusline correlation — two overlapping tool events can share the same snapshot window and look like one. Worth tagging those as a 'concurrent pair' in the baseline before Z-scores run, otherwise the anomaly detector fires on normal parallel tool use.
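A minimal sketch of that tagging idea: bucket events by which snapshot window they fall into, and flag any window holding more than one event before it feeds the baseline (all names here are hypothetical, not from CAT):

```python
from bisect import bisect_right
from collections import defaultdict

def tag_concurrent(events, snapshot_times):
    """Flag tool events that share a statusline snapshot window, so the
    anomaly baseline can exclude them (their token deltas are merged).

    events: [(ts, tool_name)]; snapshot_times: sorted list of floats.
    Returns [(ts, tool_name, is_concurrent)] sorted by ts.
    """
    buckets = defaultdict(list)
    for ts, tool in events:
        # Index of the snapshot window this event falls into.
        buckets[bisect_right(snapshot_times, ts)].append((ts, tool))

    tagged = []
    for evs in buckets.values():
        concurrent = len(evs) > 1  # two+ events share one delta window
        tagged.extend((ts, tool, concurrent) for ts, tool in evs)
    return sorted(tagged)
```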
The Haiku classifier for root-cause analysis is exactly the right tradeoff: spending fractions of a cent to explain why a single turn cost $0.50 beats staring at raw deltas.