Background agents and the factory model

Six months ago, "background agent" wasn't a product category. Now every major AI company ships one. Cursor runs tasks in the cloud while you switch tabs. GitHub Copilot assigns itself issues from your backlog. Devin clones your repo into a sandbox and opens a pull request before standup. Factory wraps the entire dev lifecycle in agents it calls Droids. OpenAI open-sourced Symphony, a spec for pointing Codex agents at your Linear board and letting them work through tickets overnight.

In each case, the agent works somewhere you aren't watching. No open chat window, no prompt to type. The agent picks up a unit of work, executes in an isolated environment, and surfaces a result for you to review. "Background" means the human isn't in the loop while the work happens.

The part worth watching is how the work starts.

Figure: Foreground agents wait for you. Background agents don't.

The foreground-background line

Most AI tools today are foreground. You type a prompt, the model responds, you refine. The agent's clock starts when you press enter and stops when you close the tab. That works for writing, exploration, and pair programming. It doesn't work for the large category of software tasks that are routine: triage, test fixes, dependency bumps, boilerplate migrations, doc updates.

Background agents flip the relationship. Instead of starting when you ask, they start when something happens: a ticket lands in the backlog, a test fails in CI, a PR gets flagged by a reviewer. The agent runs in its own environment (a cloud sandbox, a container, a managed VM) and delivers output asynchronously. You review a finished pull request, not a chat transcript.
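As a rough illustration of event-triggered starts, the sketch below routes incoming events to agent handlers. Everything in it (the `Event` and `Dispatcher` names, the event kinds, the `fix_failing_test` handler) is hypothetical and not taken from any product mentioned here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    kind: str       # hypothetical kinds: "ticket_created", "ci_failed", "pr_flagged"
    payload: dict


@dataclass
class Dispatcher:
    """Starts agent work when an event arrives, not when a human types a prompt."""
    handlers: Dict[str, Callable[[Event], str]] = field(default_factory=dict)
    outbox: List[str] = field(default_factory=list)  # results awaiting human review

    def on(self, kind: str, handler: Callable[[Event], str]) -> None:
        self.handlers[kind] = handler

    def emit(self, event: Event) -> bool:
        handler = self.handlers.get(event.kind)
        if handler is None:
            return False  # no agent is subscribed to this kind of event
        self.outbox.append(handler(event))
        return True


def fix_failing_test(event: Event) -> str:
    # Stand-in for real agent work done in a sandbox.
    return f"PR: fix {event.payload['test']}"
```

The `outbox` list plays the role of the asynchronous delivery channel: the human reads finished results from it later instead of watching the work happen.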

It sounds like a small change in mechanics, but it reshapes what the agent can cover. A foreground agent tops out at however many hours you have. A background agent keeps pace with whatever your systems produce.

Figure: Two modes of background work. One waits, the other watches.

Reactive and proactive

Look closer and background agents split into two postures.

Reactive background agents wait for an explicit assignment. A ticket appears in Linear, a GitHub issue gets labeled "agent", a Slack message says "handle this." The agent picks it up and works to completion. It works autonomously once started, but someone still has to point it at the work.

Most of what ships today is reactive. Symphony is a clean example: it polls your Linear board, picks up issues marked for agents, spins up a Codex instance per task, and delivers pull requests. The human decides what goes on the board. The agent decides how to build it. That division is comfortable because it maps to how teams already assign work to junior engineers.
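The reactive loop described above can be sketched in a few lines. This is not Symphony's code or the Linear API: `Issue`, `select_for_agents`, `run_agent`, and `poll_once` are hypothetical names, the board is passed in as a plain function, and the agent is a placeholder that returns a string.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set


@dataclass(frozen=True)
class Issue:
    id: str
    title: str
    labels: FrozenSet[str]


def select_for_agents(issues: List[Issue], claimed: Set[str], label: str = "agent") -> List[Issue]:
    # Humans decide what goes on the board; the agent only takes labeled, unclaimed issues.
    return [i for i in issues if label in i.labels and i.id not in claimed]


def run_agent(issue: Issue) -> str:
    # Placeholder for a real agent run that would end in a pull request.
    return f"PR for {issue.id}: {issue.title}"


def poll_once(fetch_issues: Callable[[], List[Issue]], claimed: Set[str], max_workers: int = 4) -> List[str]:
    todo = select_for_agents(fetch_issues(), claimed)
    claimed.update(i.id for i in todo)  # avoid picking the same issue up twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One worker per task; map() returns results in input order.
        return list(pool.map(run_agent, todo))
```

A real deployment would call `poll_once` on a timer and persist the `claimed` set; both are left out to keep the sketch short.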

Proactive background agents watch for signals and initiate work without being asked. A new error pattern appears in your monitoring. A dependency publishes a security advisory that touches a package in your lock file. The agent notices, evaluates whether action is warranted, and either acts or proposes a plan. Nobody filed a ticket. Nobody typed a prompt.
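The "notice, evaluate, act or propose" step might look something like the following for the security-advisory case. The `Advisory` type, the severity threshold, and the dotted-integer version comparison are all assumptions made for illustration; real lock files and advisory feeds are messier.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Advisory:
    package: str
    severity: str   # assumed scale: "low", "medium", "high", "critical"
    fixed_in: str   # first version that contains the fix


def parse_version(version: str) -> Tuple[int, ...]:
    # Simplification: handles only dotted integers such as "2.4.1".
    return tuple(int(part) for part in version.split("."))


def decide(advisory: Advisory, locked: Dict[str, str]) -> str:
    """Return "ignore", "propose", or "act" for one advisory."""
    installed = locked.get(advisory.package)
    if installed is None:
        return "ignore"   # the advisory does not touch the lock file
    if parse_version(installed) >= parse_version(advisory.fixed_in):
        return "ignore"   # already on a fixed version
    if advisory.severity in ("high", "critical"):
        return "act"      # e.g. open a PR that bumps the dependency
    return "propose"      # e.g. post a plan and wait for a human
```

Note that nothing in `decide` is triggered by a ticket or a prompt. The input is a signal from outside the team's own workflow.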

Think of it as the difference between a contractor who builds what you spec and a colleague who notices the CI pipeline has been flaky all week and opens a fix before anyone asks. We wrote about this distinction in "reactive vs proactive", and the background-agent wave is making it concrete. Nearly every background agent shipping today is a contractor.

Figure: The factory model, with agents across the full lifecycle and context flowing between stages.

The factory

Some teams are pushing the idea further: not one agent on one task, but a full factory of agents covering every stage of software delivery, with shared context flowing between stages.

Factory is the company that has taken this furthest. Their platform breaks the SDLC into discrete stages (triage, code generation, validation, release, documentation, monitoring) and runs specialized Droid agents across all of them. A signal enters at triage, gets routed to code generation, passes through automated validation, and arrives at release as a tested, reviewed deliverable. Factory reports processing 57,000 lines of generated code per day with a 98.7% validation pass rate. They closed a $150M Series C at a $1.5B valuation in April 2026, led by Khosla Ventures.

The factory model treats background agents as infrastructure rather than individual tools. Each stage feeds the next. The triage agent learns from which PRs pass validation. The monitoring agent surfaces patterns that feed back to triage. The system gets better the longer it runs.
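To make "each stage feeds the next" concrete, here is a minimal sketch of a staged pipeline with a shared context object. It is not Factory's implementation: the `Context` type, the stage functions, and their trivial logic are hypothetical, and the sketch covers only forward flow, not the feedback loops from validation and monitoring back to triage.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Context:
    signal: str                                         # what entered the factory
    notes: Dict[str, str] = field(default_factory=dict)  # shared across stages
    history: List[str] = field(default_factory=list)     # audit trail per stage


def triage(ctx: Context) -> bool:
    ctx.notes["priority"] = "high" if "outage" in ctx.signal else "normal"
    return True


def generate(ctx: Context) -> bool:
    ctx.notes["patch"] = f"patch for: {ctx.signal}"
    return True


def validate(ctx: Context) -> bool:
    return "patch" in ctx.notes   # reads what an earlier stage wrote


def release(ctx: Context) -> bool:
    ctx.notes["released"] = "yes"
    return True


STAGES: List[Tuple[str, Callable[[Context], bool]]] = [
    ("triage", triage), ("generate", generate),
    ("validate", validate), ("release", release),
]


def run_pipeline(ctx: Context, stages: List[Tuple[str, Callable[[Context], bool]]]) -> Context:
    for name, stage in stages:
        ok = stage(ctx)
        ctx.history.append(f"{name}:{'ok' if ok else 'failed'}")
        if not ok:
            break   # a failed stage stops the line
    return ctx
```

Calling `run_pipeline(Context("outage in checkout"), STAGES)` walks all four stages, and the `history` list records what each one reported.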

OpenAI's Symphony approaches the same problem from the opposite end. Instead of building the factory itself, OpenAI published a specification: a reference architecture for wiring background agents to a project board. The reference implementation is written in Elixir, chosen because OTP supervision trees on the BEAM VM handle concurrent agent processes with built-in fault tolerance. Each task gets its own supervised process. If an agent crashes, Symphony restarts it. The spec doesn't prescribe which model runs underneath. It's plumbing for the orchestration layer, not a product.
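The restart-on-crash idea can be illustrated with a small Python analogy. This is not Symphony's code and not an OTP supervisor: `supervise` is a hypothetical helper that reruns a task in the same thread, whereas a real supervision tree manages separate processes and restart strategies.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def supervise(task: Callable[[], T], max_restarts: int = 3) -> Tuple[T, int]:
    """Run task, restarting it after a crash, up to max_restarts times.

    Returns the task's result and the number of attempts it took.
    Re-raises the last exception once the restart budget is spent.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            if attempts > max_restarts:
                raise   # give up: escalate to whoever called the supervisor
```

The restart budget matters in practice: without one, an agent that crashes deterministically would loop forever instead of surfacing the failure.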

The open questions

Most of this series has focused on products that give a single agent proactive capabilities: a clock, a listener, an inbox. The background-agent wave raises a different set of problems. When you have dozens of agents running unattended, the orchestration layer matters as much as any individual agent's capabilities. How do you route context between stages? What does review load look like when agents open twenty PRs a day? Is the reactive model a stepping stone to proactive, or a stable equilibrium on its own?

We've been approaching this from the proactive side at AgentWorkforce, watching for changes across connected services rather than polling a ticket board. More on the architecture in upcoming posts.

Posted June 16, 2026 · AgentWorkforce
