I Had an AI Agent Build, Deploy, and Instrument a Course Platform While I Watched
I asked Loki, my OpenClaw AI agent, to deploy OpenClaw Academy to Fly.io from scratch — sign up, configure, deploy, add analytics. Here's exactly what happened.
AI infrastructure, agent patterns, and things I learned building with OpenClaw.
Four providers in the fallback chain. Nine cascade failures in one day. How two config files out of sync turned redundancy into a cardboard wall.
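The failure mode here — a fallback chain defined in two places that silently drift apart — is cheap to guard against. A minimal sketch, assuming JSON configs with a `fallback_chain` key (file names and keys are illustrative, not the post's actual setup):

```python
# Hypothetical startup guard: refuse to serve if the fallback chain
# differs between the two config files that are supposed to agree.
import json

def load_chain(path: str) -> list[str]:
    """Read the provider fallback chain from one config file."""
    with open(path) as f:
        return json.load(f)["fallback_chain"]

def check_chains(primary_path: str, secondary_path: str) -> None:
    """Fail fast when the two configs disagree, instead of discovering
    the divergence mid-cascade in production."""
    a, b = load_chain(primary_path), load_chain(secondary_path)
    if a != b:
        raise RuntimeError(
            f"fallback chains diverge: {primary_path}={a} vs {secondary_path}={b}"
        )
```

Running a check like this at daemon startup turns "nine cascade failures in one day" into one loud error at boot.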
Seven infrastructure gotchas from running a persistent AI daemon on macOS — from silent sleep mode to corrupted eval data.
112 consecutive failing runs on analyze tasks. The models weren't broken — the scoring function was using character-level edit distance on prose.
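Why character-level edit distance tanks on prose: any paraphrase, however faithful, rewrites most of the characters. A minimal sketch of that failure mode (the scorer and sentences here are illustrative, not the post's actual code):

```python
# Hypothetical demonstration: a character-level similarity metric
# scores a faithful paraphrase as a near-total mismatch.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance over characters."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def char_similarity(expected: str, actual: str) -> float:
    """Normalized similarity in [0, 1] — fine for exact-match
    outputs, misleading for free-form prose."""
    longest = max(len(expected), len(actual), 1)
    return 1.0 - levenshtein(expected, actual) / longest

reference = "The service fails because the cache is never invalidated."
paraphrase = "Stale cache entries are never cleared, so the service breaks."
print(round(char_similarity(reference, paraphrase), 2))  # low, despite same meaning
```

Against a fixed pass threshold, a metric like this fails every run whose wording differs from the reference — which is exactly what free-form "analyze" outputs do.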
How IBM's smallest Granite model — picked as the control floor — ended up as one of the strongest performers in a 38-run evaluation.
A round-trip TTS evaluation comparing sherpa-onnx VITS, macOS say, and OpenAI's TTS APIs. The free offline model scored highest.
958 scored runs across 38 model/task pairs, seven task types, a two-judge ensemble, and zero promoted models. Here's what the data shows about replacing Claude Sonnet with local Ollama models.