AI

Practical AI experimentation, local LLM setup, benchmarks, and Claude Code

AI | Homelab

Three Green Lies: Debugging a Self-Hosted LLM Observability Dashboard
By Ian L. Paterson | May 27, 2026 | July 12, 2026

Within a week of standing up the dashboard, three panels were showing bad data. None of them was a Grafana bug. I…

AI

How I Drive WordPress From Claude Code (REST, Playwright, wp-cli)
By Ian L. Paterson | May 20, 2026 | July 12, 2026

Driving WordPress from Claude Code (Anthropic’s terminal coding agent) fights the platform’s core assumption roughly every other operation. WordPress was built for…

AI

Sorting a Filesystem Hoard With Local LLMs: What 2,300 Files Told Me About My Obsidian Vault
By Ian L. Paterson | May 16, 2026 | July 12, 2026

What do 2,300 random files on a knowledge worker’s laptop actually look like? I let a local LLM tell me. TL;DR: I…

AI

Anti-detect browser benchmark 2026: 7 stealth tools, 31 Cloudflare targets, 651 verdicts
By Ian L. Paterson | May 13, 2026 | July 12, 2026

I built a scraper. Cloudflare killed it in 48 hours. I built a web scraper for Canadian small-cap stock data and Cloudflare…

AI

Three Months of Speed-Up Experiments on a 3090 Ti: Autoregressive → DFlash → MTP for Qwen3.6-27B
By Ian L. Paterson | May 10, 2026 | July 12, 2026

TL;DR: MTP wins on wall-clock time once output exceeds ~900 tokens. Below that, plain autoregressive is faster. The DFlash Decode Collapse: DFlash decode…

AI

Building llama.cpp from source on a Dell Precision T5820 with an RTX 3090 Ti (after seven power cycles)
By Ian L. Paterson | May 7, 2026 | July 12, 2026

I pulled a Quadro M4000 out of a used Dell Precision T5820, dropped in an RTX 3090 Ti, and turned the box…

AI

The LLM Kept Saying “Fixed.” For Three Months, It Wasn’t.
By Ian L. Paterson | April 12, 2026 | July 12, 2026

That afternoon a Slack bot told me a script had NEVER RUN. That was a lie. The script had pulled 81 weather…

AI

Stop Claude Code from Lobotomizing Itself Mid-Task
By Ian L. Paterson | March 25, 2026 | July 12, 2026

Claude Code has a feature called auto-compact that quietly destroys your session quality. The Problem: I was three hours into a multi-file…

AI

How I Track Claude, Codex, and Gemini Quotas from One Script
By Ian L. Paterson | March 19, 2026 | July 12, 2026

Updated May 30: added what I learned wiring these three together, plus the budget thresholds that now trigger automation. (If you’re trying…

AI

Inference Arbitrage: How I Route 200+ Daily LLM Calls Across Five Models
By Ian L. Paterson | March 14, 2026 | July 12, 2026

Inference arbitrage means routing each AI task to the cheapest model that can handle it at acceptable quality, instead of sending everything…

