AI

Practical AI experimentation, local LLM setup, benchmarks, and Claude Code

Why Your llama.cpp Benchmarks Are Wrong: GPU Architecture and Real Numbers
By Ian L. Paterson, June 6, 2026 (updated June 7, 2026)

I pulled an aging Quadro out of my homelab LLM box, dropped in an RTX 2060 SUPER, and the thing booted on…

I Built a Honeypot to Catch Prompt Injections in Claude Code (Here’s What It Caught)
By Ian L. Paterson, June 3, 2026 (updated June 7, 2026)

I built a honeypot canary that screens web content with a deliberately gullible LLM before my AI agent reads it. Here is what it caught.

Free LLM API Tiers in 2026: What Groq, Cerebras, Mistral, Gemini and Cohere Actually Give You
By Ian L. Paterson, May 31, 2026 (updated June 7, 2026)

On May 31, 2026, one of my LLM providers quietly deleted most of its free models, including the exact one my code…

Three Green Lies: Debugging a Self-Hosted LLM Observability Dashboard
By Ian L. Paterson, May 27, 2026 (updated June 7, 2026)

Within a week of standing up the dashboard, three panels were showing bad data. None of them was a Grafana bug. I…

How I Drive WordPress From Claude Code (REST, Playwright, wp-cli)
By Ian L. Paterson, May 20, 2026 (updated June 7, 2026)

Driving WordPress from Claude Code (Anthropic’s terminal coding agent) fights the platform’s core assumption roughly every other operation. WordPress was built for…

Sorting a Filesystem Hoard With Local LLMs: What 2,300 Files Told Me About My Obsidian Vault
By Ian L. Paterson, May 16, 2026 (updated June 7, 2026)

What do 2,300 random files on a knowledge worker’s laptop actually look like? I let a local LLM tell me. TL;DR I…

Three Months of Speed-Up Experiments on a 3090 Ti: Autoregressive → DFlash → MTP for Qwen3.6-27B
By Ian L. Paterson, May 10, 2026 (updated June 7, 2026)

The setup: The starting line was 43 tokens per second decode on vanilla llama.cpp. The finishing line, three months later, is 39…

Building llama.cpp from source on a Dell Precision T5820 with an RTX 3090 Ti (after seven power cycles)
By Ian L. Paterson, May 7, 2026 (updated June 7, 2026)

I pulled a Quadro M4000 out of a used Dell Precision T5820, dropped in an RTX 3090 Ti, and turned the box…

The LLM Kept Saying “Fixed.” For Three Months, It Wasn’t.
By Ian L. Paterson, April 12, 2026 (updated April 12, 2026)

That afternoon a Slack bot told me a script had NEVER RUN. That was a lie. The script had pulled 81 weather…

Stop Claude Code from Lobotomizing Itself Mid-Task
By Ian L. Paterson, March 25, 2026 (updated June 7, 2026)

Claude Code has a feature called auto-compact that quietly destroys your session quality. The Problem: I was three hours into a multi-file…

