How I Track Claude, Codex, and Gemini Quotas from One Script
Updated May 30: added what I learned wiring these three together, plus the budget thresholds that now trigger automation. (If you’re trying…
Practical AI experimentation, local LLM setup, benchmarks, and Claude Code
Inference arbitrage means routing each AI task to the cheapest model that can handle it at acceptable quality, instead of sending everything…
Most LLM benchmarks measure raw intelligence. Real deployment decisions also depend on latency, format reliability, and data boundaries, including when a task…
I spent about two weeks of evenings getting Qwen3-Coder-30B running reliably on a Mac Studio (M1 Max, 32GB) through LM Studio and…
The full architecture for giving Claude Code persistent memory across sessions: four layers of markdown files, two commands, five cron jobs, and the eight design rules I derived from breaking it over 22 days.
OpenClaw on Apple Silicon with a 24B local model: 14 real errors fixed, sub-agent delivery working, $1.50/month total. Every config documented.