Cerca: a cloud-native harness

Calvin French-Owen

CTO at Segment, Codex at OpenAI

Every team wants to adopt agents internally: to work with a fleet of always-on assistants that can access every system and tool.

Building the harness for those assistants is non-trivial, so most teams use an existing one. Maybe it's the Claude Agent SDK or the Codex App Server. Some companies even pick up OpenClaw and use that.

This works, but you have to pick a trade-off. You can keep a VM around and stay on top of systemd, upgrades, and patches. Or you can run it stateless and re-hydrate from scratch on every boot. Both are work.

That all feels a bit backwards to us. Every other part of our infrastructure is cloud-native... so why aren't our agents?

The Harness

You and your harness

As coding agents have advanced, we've learned a few things about what makes for a good harness. Agents work best when they have all of the following:

  • durable execution and context management. agents need to complete their work over minutes or hours. there has to be some form of failure recovery and context management as the scope of their work grows.

  • a sandbox to run CLI tools. agents work best when they have the full power of a CLI at their disposal.

  • custom tools and MCPs. to get good results, the agent needs access to the same information you do.

  • full control over context and approvals. the harness handles a lot of things out of the box for you. but you still want control over how the harness loads and discovers context as it executes.

That's why we built Cerca. Durable execution, sandboxing, context management, and approvals are all baked into the harness, so you don't have to stitch them together yourself.
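To make "durable execution" concrete, here is a minimal sketch of step-level checkpointing: progress is saved after every step, so a run that crashes resumes from the last completed step instead of starting over. This is an illustration under stated assumptions, not Cerca's implementation. `Step`, `CheckpointStore`, `InMemoryStore`, and `runDurably` are hypothetical names, and a real system would persist checkpoints outside the process.

```typescript
// Hypothetical sketch of durable execution via checkpoints.
// None of these names come from the Cerca SDK.
type Checkpoint = { nextStep: number; state: string };
type Step = { name: string; run: (input: string) => string };

interface CheckpointStore {
  load(runId: string): Checkpoint | undefined;
  save(runId: string, checkpoint: Checkpoint): void;
}

// In-memory stand-in for what would be durable storage in practice.
class InMemoryStore implements CheckpointStore {
  private readonly data = new Map<string, Checkpoint>();

  load(runId: string): Checkpoint | undefined {
    return this.data.get(runId);
  }

  save(runId: string, checkpoint: Checkpoint): void {
    this.data.set(runId, checkpoint);
  }
}

// Runs the steps in order, saving a checkpoint after each one.
// If a step throws, calling runDurably again with the same runId
// skips every step that already completed.
function runDurably(
  runId: string,
  steps: Step[],
  initial: string,
  store: CheckpointStore,
): string {
  const checkpoint = store.load(runId) ?? { nextStep: 0, state: initial };
  let state = checkpoint.state;
  steps.slice(checkpoint.nextStep).forEach((step, offset) => {
    state = step.run(state);
    store.save(runId, { nextStep: checkpoint.nextStep + offset + 1, state });
  });
  return state;
}
```

If the second of two steps throws on the first attempt, a second call with the same `runId` re-runs only that step and picks up the state the first step saved.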

We wanted it to run without upgrades, process restarts, or corrupted VM state. None of that should be your problem.

And we wanted it to be easy to scale. Running thousands of agents across your company should be as easy as making an API call — not configuring a Kubernetes service from scratch.

Here's what it looks like.

Figure 1.1: configure an agent
// configure with just an API key
import { readFileSync } from "node:fs";
import { Cerca } from "cerca";

// assumed local helper (not part of the Cerca SDK): load a prompt file as text
const readPrompt = (path: string): string => readFileSync(path, "utf8");

const cerca = new Cerca({ apiKey: process.env.CERCA_API_KEY });

// spin up as many agents as you'd like
const agent = await cerca.agents.create({
  userId: "user_finance_ops",
  fleetId: "fleet_internal_ops",
  configuration: {
    instructions: readPrompt("/system-prompt.md"),
    tools: ["web.*", "sandbox.*"],
  },
});

// seed standing context the agent can rely on
await cerca.context.write(agent.id, {
  key: "user/runbook",
  content: "Escalate invoices over $25k to the controller.",
});

configure a client, create an agent, seed its context.

Figure 1.2: run and steer a thread
// kick off a thread to do work
const thread = await cerca.threads.create(agent.id, {
  userMessage: "what sorts of tools do you have access to?",
});

// subscribe to events as the agent works
for await (const event of cerca.events.subscribeThread(agent.id, thread.id)) {
  console.log(event);
}

// resolve any approval requests the agent pauses on
const { approvals } = await cerca.approvalRequests.list(agent.id);
for (const approval of approvals) {
  await cerca.approvalRequests.resolve(
    agent.id,
    approval.threadId,
    approval.id,
    {
      decision: "approve",
      grant: "thread",
    },
  );
}

start a thread, stream events, resolve approvals.
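The `grant: "thread"` field above suggests that an approval can cover more than a single request. One plausible reading, sketched below, is that approving with a thread-scoped grant lets later calls to the same tool in the same thread proceed without another prompt. This is an assumption about the semantics, not documented Cerca behavior; `ApprovalGate`, its methods, and the `"once"` grant value are hypothetical.

```typescript
// Hypothetical sketch of a thread-scoped approval grant.
// The meaning given to "thread" here is an assumption, and "once" is invented
// for contrast; neither is taken from the Cerca API.
type Grant = "once" | "thread";
type Decision = "approve" | "deny";

class ApprovalGate {
  // thread id -> tools approved for the remainder of that thread
  private readonly threadGrants = new Map<string, Set<string>>();

  // True when the tool call may proceed without asking a human again.
  isPreapproved(threadId: string, tool: string): boolean {
    return this.threadGrants.get(threadId)?.has(tool) ?? false;
  }

  // Records a human decision and returns whether the pending call may run.
  // Only approvals with a "thread" grant are remembered for later calls.
  resolve(threadId: string, tool: string, decision: Decision, grant: Grant): boolean {
    if (decision === "approve" && grant === "thread") {
      const tools = this.threadGrants.get(threadId) ?? new Set<string>();
      tools.add(tool);
      this.threadGrants.set(threadId, tools);
    }
    return decision === "approve";
  }
}
```

Under this reading, a one-off approval leaves the gate unchanged, while a thread-scoped approval applies only to that thread and that tool.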

Product Principles

Our principles

We built Cerca on the belief that the best harness is one made to fit your needs as a business. That's why we designed the system from the ground up to...

  • be model-agnostic. switch models every time a major lab takes the lead, or mix and match depending on your use case. swap in open source models as they continue to improve.

  • be built for scalability and reliability. no more devops, and no more breaking on every version change. the harness itself is built to scale to millions of agents executing at once.

  • be secure from the start. credentials are encrypted and never exposed to the model. policies for network + approvals come out of the box. enforce them across agents, or allow users to override.

  • take care of the annoying parts. managing context, credential access, durable execution, sandboxes, and approvals is all handled by the API.
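As an illustration of the "never exposed to the model" principle, here is a minimal sketch of one way a harness could keep secrets out of model context: the model's tool call names a credential by reference, the harness resolves it at execution time, and the secret is redacted from any output returned to the model. This is a sketch under assumptions, not Cerca's implementation. `CredentialBroker`, `ToolCall`, and `credentialRef` are hypothetical names, and encryption at rest is left out.

```typescript
// Hypothetical sketch: the model only ever sees credential names, never values.
// None of these names come from the Cerca SDK.
type ToolArgs = Record<string, string>;
type ToolCall = { tool: string; credentialRef: string; args: ToolArgs };
type ToolResult = { output: string };

class CredentialBroker {
  private readonly secrets: Map<string, string>;

  constructor(secrets: Map<string, string>) {
    this.secrets = secrets;
  }

  // Resolves the referenced secret, runs the tool with it, and scrubs the
  // secret from the output before anything is handed back to the model.
  execute(call: ToolCall, runTool: (args: ToolArgs, secret: string) => string): ToolResult {
    const secret = this.secrets.get(call.credentialRef);
    if (secret === undefined || secret.length === 0) {
      throw new Error(`unknown credential: ${call.credentialRef}`);
    }
    const raw = runTool(call.args, secret);
    return { output: raw.split(secret).join("[REDACTED]") };
  }
}
```

Even a tool that echoes its own authorization header back would return only `[REDACTED]` in place of the secret.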

Coda

We make it easy.

Our team has scaled systems to one million RPS at Segment and built coding agents at OpenAI since the first reasoning checkpoint. We know what it takes to run production systems that enterprises depend on — and Cerca is built to meet that bar.

Our goal isn't to lock you in to one particular model or hold your data hostage. We want you to harness the full power of agents.

$ curl -X POST https://api.cerca.dev/agents