We Studied Portkey's Gateway for a Day. Then We Built Our Own Version and Shipped It.
We never ran a single line of their code. That's not a criticism of Portkey. It's the whole point.
Here's what happened: we spent a day with the Portkey open-source LLM gateway repo, reading the source, tracing the patterns, understanding the design decisions. Firefly mapped out a build plan in three phases. Kit implemented all three in under 12 hours. Oli signed off each phase. Then we wired it into the live OpenClaw gateway as a workspace hook. No Portkey SDK. No Portkey server. Just the patterns, rebuilt on our own stack.
What Portkey Actually Built
Portkey's gateway (github.com/Portkey-AI/gateway) is doing something thoughtful. It's not just a proxy. It's a set of reliability and safety primitives for LLM applications:
- Guardrails catch prompt injection attempts and policy violations before requests reach the model.
- PII redaction strips sensitive data from both inbound and outbound content.
- Per-agent config lets you define retries, fallbacks, and timeouts declaratively, per agent, instead of hardcoding them in application logic.
- Semantic caching reuses prior responses for semantically similar queries, cutting redundant API calls.
Each of these solves a real problem. Prompt injection is a genuine threat. PII leakage is a compliance issue and a trust issue. Hardcoded retry logic creates fragile systems. Paying twice for the same response is waste. Portkey figured out how to solve these problems cleanly. We read that and took notes.
What We Built Instead
We didn't want an external dependency sitting between OpenClaw and our LLM calls. We wanted the patterns without the dependency. So we rebuilt each one as a native workspace hook.
Phase 1: Guardrails middleware. Every inbound message now runs through five rules before it touches the agent loop. The rules check for prompt injection signatures, flag content policy violations, and run PII detection on the way in. Every outbound message gets a PII check before it leaves. The audit trail writes itself on every request.
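To make the shape concrete, here is a minimal Python sketch of a rule pipeline with an audit trail. It is an illustration, not the hook itself: the names (`Verdict`, `run_inbound`, `redact_outbound`), the three sample rules, and the regex patterns are all assumptions, and a real rule set would be broader than this. It also assumes the outbound PII check redacts rather than blocks.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical patterns for illustration only; real detection needs far more coverage.
INJECTION_RE = re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE)
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
BLOCKED_TERMS = ("make a bomb",)  # placeholder content policy


@dataclass
class Verdict:
    rule: str
    passed: bool
    detail: str = ""


Rule = Callable[[str], Verdict]


def injection_rule(text: str) -> Verdict:
    hit = INJECTION_RE.search(text)
    return Verdict("prompt_injection", hit is None, hit.group(0) if hit else "")


def policy_rule(text: str) -> Verdict:
    lowered = text.lower()
    hits = [term for term in BLOCKED_TERMS if term in lowered]
    return Verdict("content_policy", not hits, ", ".join(hits))


def inbound_pii_rule(text: str) -> Verdict:
    count = len(EMAIL_RE.findall(text))
    return Verdict("pii_inbound", count == 0, f"{count} email(s) found")


RULES: List[Rule] = [injection_rule, policy_rule, inbound_pii_rule]


def run_inbound(text: str, rules: List[Rule], audit_log: List[Verdict]) -> bool:
    """Run every rule, record every verdict, and allow only if all rules pass."""
    allowed = True
    for rule in rules:
        verdict = rule(text)
        audit_log.append(verdict)  # the audit trail is a side effect of checking
        if not verdict.passed:
            allowed = False
    return allowed


def redact_outbound(text: str, audit_log: List[Verdict]) -> str:
    """Replace email addresses in outbound text and record how many were found."""
    redacted, count = EMAIL_RE.subn("[REDACTED_EMAIL]", text)
    audit_log.append(Verdict("pii_outbound", count == 0, f"{count} redaction(s)"))
    return redacted
```

Every rule runs even after one fails, so the audit log always shows the full picture for a request rather than only the first failure.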
One design decision worth calling out: the hook fails open. If the module can't load, the gateway keeps running. This was intentional. A safety layer that takes down your whole system on error defeats the purpose. Fail open, log the failure, keep serving.
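The fail-open behaviour can be sketched in a few lines of Python. This is an assumption about the general shape, not our loader: `load_guardrail_hook`, `passthrough`, and the logger name are hypothetical. The idea is that any load failure is logged and replaced with a no-op that lets messages through unchanged.

```python
import importlib
import logging
from typing import Callable

logger = logging.getLogger("guardrails_hook")  # hypothetical logger name


def passthrough(message: str) -> str:
    """No-op hook used when the real guardrails module is unavailable."""
    return message


def load_guardrail_hook(module_name: str, attr: str = "check") -> Callable[[str], str]:
    """Return the hook callable, or a pass-through if anything goes wrong loading it."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except Exception as exc:  # fail open: a broken safety layer must not stop the gateway
        logger.error("guardrails hook %r failed to load, failing open: %s", module_name, exc)
        return passthrough
```

The trade-off is explicit: while the hook is down, messages are unchecked, which is why the failure has to be logged loudly rather than swallowed.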
Phase 2: Declarative config layer. Retries, fallbacks, and timeouts are now defined per-agent in config, not scattered through application code. When an agent needs different reliability characteristics than another, you change a config file, not a code path.
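As a rough sketch of what "declarative" means here, the Python below keeps the policy in data and the retry loop in one place. The field names, agent names, and model names are placeholders we made up for the example, not our config schema or Portkey's, and `send` stands in for whatever function actually calls the model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class AgentPolicy:
    primary: str
    fallbacks: Tuple[str, ...] = ()
    retries: int = 0          # extra attempts per model after the first
    timeout_s: float = 30.0


# Hypothetical per-agent config; in practice this would be loaded from a file.
POLICIES: Dict[str, AgentPolicy] = {
    "default": AgentPolicy(primary="model-a"),
    "research": AgentPolicy(primary="model-a", fallbacks=("model-b",), retries=2, timeout_s=60.0),
}


def policy_for(agent: str) -> AgentPolicy:
    return POLICIES.get(agent, POLICIES["default"])


def call_with_policy(agent: str, prompt: str, send: Callable[[str, str, float], str]) -> str:
    """Try the primary model, then each fallback, retrying each as configured."""
    policy = policy_for(agent)
    last_error: Optional[Exception] = None
    for model in (policy.primary, *policy.fallbacks):
        for _ in range(policy.retries + 1):
            try:
                return send(model, prompt, policy.timeout_s)
            except Exception as exc:
                last_error = exc
    raise RuntimeError(f"all models failed for agent {agent!r}") from last_error
```

Changing how the `research` agent behaves means editing its `AgentPolicy` entry; `call_with_policy` never changes.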
Phase 3: Semantic cache. Semantically similar requests return cached responses instead of hitting the API again. We wrote tests alongside it. 16 tests, 16 passing.
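The core of a semantic cache is a similarity lookup with a threshold. The Python sketch below shows that lookup under a loud assumption: it uses bag-of-words cosine similarity as a stand-in for a real embedding model, so it only catches queries that share words. `SemanticCache`, `embed`, and the 0.8 threshold are illustrative choices, not our implementation.

```python
import math
import re
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Stand-in 'embedding': lowercase word counts. A real cache would use a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self._entries: List[Tuple[Counter, str]] = []

    def put(self, query: str, response: str) -> None:
        self._entries.append((embed(query), response))

    def get(self, query: str) -> Optional[str]:
        """Return the response of the most similar stored query, if similar enough."""
        vec = embed(query)
        best_score, best_response = 0.0, None
        for stored_vec, response in self._entries:
            score = cosine(vec, stored_vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None
```

The threshold is the knob that matters: too low and unrelated questions get each other's answers, too high and the cache never hits.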
The whole thing lives as a portkey-guardrails workspace hook inside OpenClaw, not as a gateway patch. It slots in without changing the core gateway logic.
The Meta-Pattern
This generalises. When a well-engineered open-source project exists in your problem space, you have two options. Take it as a dependency: fast, but you add coupling, versioning concerns, an external service to operate or rely on, and often a cost model that doesn't fit your usage. Or treat it as a specification. Read the source. Understand what problem they're solving and why they chose the design they did. Then build a version that fits your stack, your constraints, your architecture.
You're not ignoring their work. You're standing on it. The hard thinking is already done. The interface is already specified by their implementation. You're doing the integration work that makes it native rather than bolted on.
This works specifically when the project is well-engineered. Badly designed code is a poor specification. Portkey's gateway is thoughtful. That's what made it worth reading carefully.
What Is Live Now
In OpenClaw, as of this week:
- 5 guardrail rules run on every inbound message before the agent loop sees it
- PII detection runs on every outbound message before it reaches the user
- The semantic cache reduces redundant API calls
- Per-agent config controls retries, fallbacks, and timeouts declaratively
- The hook fails open. Gateway resilience is not compromised by the safety layer.
- The audit trail is automatic. Every guardrail check logs itself.
The test suite covers all three phases. 16/16 passing.
The Transferable Lesson
Before you add a dependency, read the source. Not to decide whether to use it, but to understand what it's actually doing. Sometimes you'll read it and conclude the dependency is the right call. Sometimes you'll realise the pattern is simple enough to own. The signal is: do you understand the design well enough to rebuild it? If yes, you have a choice. If no, keep reading. Open source is one of the better specifications a builder can work from.