My openclawd API bill was 3x what it should be. so I built claw-compactor.
- Source: https://x.com/nielsen777brian/status/2021301480079389144?s=46
- Published: 2026-02-10T19:13:21+00:00
- Saved: 2026-02-11

everyone's talking about token costs.
every Claude Code thread, every openclawd setup guide, every "I built X in 2 days" article — buried somewhere in the replies is the same question:
"how much did that cost you?"
I kept seeing that question. so I built something to answer it.
claw-compactor is a Python tool I wrote that deterministically compresses your workspace memory, session transcripts, and agent context. no LLM calls. no API costs. just rules.
it takes 10 minutes to set up. here's what it does and why I built it this way.
why I built it
I was running openclawd on a mid-size codebase. my workspace memory files kept growing — session logs, CLAUDE.md, observation notes. one day I checked: 180K tokens of context, and at least half of it was redundant formatting, duplicate content, and verbose session transcripts.
I didn't need a smarter model. I needed a compressor.
so I built one. 5 compression layers, each stacking on the last:
Layer 1 — Rule Engine (4-8% savings)
Deduplicates content, cleans up markdown formatting, merges redundant sections. the kind of stuff you'd do manually if you had the patience.
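to make the idea concrete, here's a toy sketch of what a rule pass like this might look like. this is my illustration, not claw-compactor's actual code:

```python
import re

def rule_compress(text: str) -> str:
    """Toy rule engine: strip trailing whitespace, collapse runs of
    blank lines, and drop exact-duplicate paragraphs."""
    # normalize whitespace line by line
    text = "\n".join(ln.rstrip() for ln in text.splitlines())
    # collapse 3+ consecutive newlines down to a single blank line
    text = re.sub(r"\n{3,}", "\n\n", text)
    # deduplicate identical paragraphs, keeping the first occurrence
    seen, kept = set(), []
    for para in text.split("\n\n"):
        key = para.strip()
        if key and key in seen:
            continue
        seen.add(key)
        kept.append(para)
    return "\n\n".join(kept)
```

deterministic, reversible-by-inspection, zero API calls. that's the whole point of this layer.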
Layer 2 — Dictionary Encoding (4-5% savings)
Auto-learns a codebook from your workspace. repeated phrases get replaced with short $XX tokens. fully reversible.
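the idea, in miniature. the phrase selection (word 3-grams) and thresholds here are simplified assumptions, not the tool's real heuristics:

```python
from collections import Counter

def learn_codebook(text: str, min_len=12, min_count=3, size=50):
    """Learn frequent phrases (here: word 3-grams) worth encoding."""
    words = text.split()
    grams = Counter(" ".join(words[i:i + 3]) for i in range(len(words) - 2))
    candidates = [g for g, n in grams.most_common()
                  if n >= min_count and len(g) >= min_len]
    return {g: f"${i:02d}" for i, g in enumerate(candidates[:size])}

def encode(text: str, codebook: dict) -> str:
    for phrase, code in codebook.items():
        text = text.replace(phrase, code)
    return text

def decode(text: str, codebook: dict) -> str:
    # fully reversible: apply the same mapping backwards
    for phrase, code in codebook.items():
        text = text.replace(code, phrase)
    return text
```

the reversibility is the key property: you can always expand a compressed workspace back to the original.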
Layer 3 — Observation Compression (~97% savings on session files)
this is the one I'm most proud of. your JSONL session transcripts — which are massive — get compressed into structured summaries. a 50,000 token session log becomes ~1,500 tokens of facts and decisions.
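roughly the shape of that transformation, with an invented message schema (the kind/text fields are my stand-in, not the real JSONL format):

```python
import json

# message kinds worth keeping in the summary (assumed labels)
KEEP = {"decision", "fact", "error"}

def summarize_transcript(jsonl_text: str) -> str:
    """Collapse a JSONL session transcript into a short bullet list
    of facts and decisions, dropping conversational filler."""
    bullets = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        msg = json.loads(line)
        if msg.get("kind") in KEEP:
            bullets.append(f"- [{msg['kind']}] {msg['text']}")
    return "\n".join(bullets)
```

the real compression ratio comes from how little of a transcript is actually decision-bearing: the chatter dominates the token count.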
Layer 4 — RLE Patterns (1-2% savings)
file paths, IP addresses, repeated enums get shorthand notation. small but it compounds.
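a minimal illustration of the path-shorthand idea. the @N alias format and the regex are mine, not the tool's actual notation:

```python
import re
from collections import Counter

def path_shorthand(text: str, min_count=3) -> str:
    """Replace repeated path prefixes with short @N aliases, emitting
    a legend so the substitution stays reversible."""
    # count the directory prefix of every path-like token
    prefixes = Counter(m.group(1)
                       for m in re.finditer(r"((?:/[\w.-]+){2,})/", text))
    legend = {}
    for i, (prefix, n) in enumerate(prefixes.most_common()):
        if n < min_count:
            break
        alias = f"@{i}"
        text = text.replace(prefix + "/", alias + "/")
        legend[alias] = prefix
    header = "\n".join(f"{a}={p}" for a, p in legend.items())
    return (header + "\n" + text) if legend else text
```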
Layer 5 — Compressed Context Protocol (20-60% savings)
abbreviation levels that trade verbosity for density. partial lossy — facts stay, filler goes.
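a toy version of what lossy abbreviation levels could look like. the word lists are purely illustrative, not the actual protocol:

```python
# illustrative filler lists; higher levels drop more aggressively
LEVELS = {
    1: {"basically", "actually", "really", "just", "simply"},
    2: {"the", "a", "an", "that", "very"},
}

def ccp_abbreviate(text: str, level: int = 1) -> str:
    """Lossy pass: strip filler words, more aggressively at higher
    levels. Facts (content words) survive; filler goes."""
    drop = set()
    for lv in range(1, level + 1):
        drop |= LEVELS.get(lv, set())
    kept = [w for w in text.split() if w.lower().strip(".,") not in drop]
    return " ".join(kept)
```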
10-minute setup
step 1: clone the repo
step 2: run the benchmark (non-destructive, it just shows what you'd save)
step 3: look at the numbers. if you like them, run the full compression.
that's it. Python 3.9+, no dependencies required. optional tiktoken for precise token counts.
what the numbers actually look like
First-time verbose workspace
50-70% savings. unoptimized CLAUDE.md, raw logs. this is where most people start.
Session transcripts (JSONL)
~97% savings. this is not a typo. a 50K token log becomes ~1.5K tokens.
Regular maintenance (weekly)
10-20% savings. diminishing returns, still worth running.
Already-optimized workspace
3-12% savings. you've already done the easy wins.
the session transcript number is the headline. if you're running openclawd or claude code agents that accumulate session logs, Layer 3 alone justifies the install.
the stacking trick I keep telling people
claw-compactor + prompt caching = ~95% effective cost reduction.
here's the math:
claw-compactor compresses your context by 50%
prompt caching (cacheRetention: "long") gives 90% off cached tokens
50% of the tokens x 10% of the price (after the 90% discount) = you're paying 5% of the original cost
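the same arithmetic in code, using the numbers from above:

```python
compression = 0.50   # claw-compactor leaves ~50% of the tokens
cached_price = 0.10  # 90% cache discount -> you pay 10% of list price

# fraction of the original cost you actually pay
effective = compression * cached_price
print(f"effective cost: {effective:.0%} of the original")
```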
that's not theoretical. that's what I see on my own workspaces when I combine deterministic compression with model-level caching.
why I'm sharing this now
the openclawd ecosystem is exploding. people are building iOS apps in 5 days with Claude Code, running full agent teams, creating SaaS replacements.
but nobody's optimizing what goes into the context window. they're building bigger and bigger workspaces, accumulating more session history, and wondering why their token costs keep climbing.
I built claw-compactor to solve my own problem. turns out a lot of people have the same one.
bookmark this
you'll need it when your agent workspace hits 200K tokens and you're wondering where the money went.
https://github.com/aeromomo/claw-compactor
