The first time you add a webhook to an agent, it takes an afternoon. You wire up an Express route, log the body, and watch payloads arrive. It feels easy.
The second time you do it — for production, with the same provider, after the first one quietly dropped events for two weeks — it takes a sprint. The third time, when you've been burned enough that you're actually doing it correctly, it takes longer than that.
Here is what correctly looks like.
*The plumbing per provider — × every integration.*
## What "production-ready" actually means
For one provider:
1. **A public endpoint.** Hosted somewhere, with a stable URL, behind your own TLS cert. Already non-trivial if your agent runs on a serverless runtime that doesn't expose long-lived URLs.
2. **Signature verification.** Each provider does this differently. Linear is HMAC-SHA256 with a header named `X-Linear-Signature`. GitHub is HMAC-SHA256 with `X-Hub-Signature-256`. Slack is HMAC-SHA256 plus a timestamp to prevent replay. Notion has no signing — you're supposed to allow-list IPs. Each one has its own quirks (Linear sends the timestamp in a separate header that you also have to include in the signed payload; GitHub gives you `sha1` and `sha256`, and the older one is still the default in some webhooks).
3. **Sub-2-second response.** Most providers retry aggressively if you don't respond in time. Linear retries for two hours. Stripe retries with exponential backoff for three days. If your endpoint blocks on anything (including the LLM call), you will get duplicate deliveries, and lots of them.
4. **A queue.** Because of #3, you cannot process the webhook inline. You have to ack-and-enqueue. That means Redis, SQS, or equivalent. Plus a worker. Plus dead-letter handling.
5. **Deduplication.** Providers send duplicates under load. You must store a seen-set keyed by their event ID. Without this, agents will act on the same event repeatedly. Twenty-four hours of dedupe is usually enough; less and you'll see duplicates from retried deliveries; more and you'll watch your Redis bill go up.
6. **Filtering.** The webhook fires for many event types. You probably care about three. The rest you have to either filter out at the registration step (each provider exposes this differently) or at the worker (which means paying to receive and parse them).
7. **Payload completion.** Most webhook payloads are partial. Linear's issue webhook gives you the changed fields, not the full issue. You will need to call the API to get the complete object. And you will need to handle the fact that by the time your call lands, the issue may have changed again.
8. **State for the agent.** The webhook tells you what changed. Your agent needs to know where it left off the last time it touched this entity. That's a database. With migrations. And concurrent-write handling. And a backup strategy.
9. **Webhook registration.** You can't do this through the dashboard if you have multiple environments. So now you have a registration script. It needs to run on deploy. It needs to be idempotent. It needs to handle URL changes (your dev tunnel changed; your prod load balancer moved).
10. **Observability.** Webhook failures fail silently from the user's perspective. You need dashboards. You need alerts on dead-letter queue depth. You need to be able to replay a specific webhook delivery for debugging, which means storing raw payloads with sufficient retention.
All of that, for one provider.
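Items 2 through 5 compress into one small handler shape: verify, dedupe, ack fast, enqueue. Here's a minimal sketch for a GitHub-style provider in Node. Every name here is illustrative; a production version would back `seen` with a TTL'd Redis set and `enqueue` with SQS or similar, and each provider would need its own verification variant.

```typescript
import crypto from "node:crypto";

// Verify a GitHub-style signature: HMAC-SHA256 of the raw body,
// hex-encoded, prefixed "sha256=", sent in X-Hub-Signature-256.
export function verifyGitHubSignature(
  secret: string,
  rawBody: string,
  header: string,
): boolean {
  const expected =
    "sha256=" + crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
  // timingSafeEqual throws on length mismatch, so guard first
  if (header.length !== expected.length) return false;
  return crypto.timingSafeEqual(Buffer.from(header), Buffer.from(expected));
}

const seen = new Set<string>(); // stand-in for a TTL'd Redis seen-set

// Returns the HTTP status to send. Never processes inline:
// ack within the deadline, let a worker drain the queue.
export function handleWebhook(
  secret: string,
  headers: Record<string, string>,
  rawBody: string,
  enqueue: (event: unknown) => void,
): number {
  const sig = headers["x-hub-signature-256"] ?? "";
  if (!verifyGitHubSignature(secret, rawBody, sig)) return 401;

  const deliveryId = headers["x-github-delivery"] ?? "";
  if (seen.has(deliveryId)) return 202; // duplicate delivery: ack, don't re-enqueue
  seen.add(deliveryId);

  enqueue(JSON.parse(rawBody)); // worker picks it up; we respond immediately
  return 202;
}
```

The important shape is that the handler does no real work: everything slow happens behind `enqueue`, which is what keeps you under the sub-2-second deadline.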
> The first time you add a webhook, it takes an afternoon. The third time — correctly — it takes longer than a sprint.

## Now do it four times
Most proactive agents need to watch at minimum: a ticketing system (Linear or Jira), a code host (GitHub), a chat tool (Slack), and a CRM or doc store (Notion, Salesforce, HubSpot).
Each of those has its own version of all ten items above. Different signature scheme, different retry behavior, different payload schema, different registration flow, different answer to "is this payload partial?"
| Provider | Signature header | Algorithm | Retry window | Partial? |
|---|---|---|---|---|
| Linear | X-Linear-Signature | HMAC-SHA256 | 2 hours | Yes |
| GitHub | X-Hub-Signature-256 | HMAC-SHA256 | 3 attempts | No |
| Jira | X-Hub-Signature | HMAC-SHA256 (admin) or none | No retry | Yes |
| Slack | X-Slack-Signature | HMAC-SHA256 + timestamp | 3 retries | No |
| Notion | None (IP allowlist) | — | No retry | Yes |
So now you're looking at four endpoints, four verification implementations, four parsers, four registration flows, and four deploy scripts. Each one a new way for things to silently break.
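One way to contain that divergence is to hide every provider's verification scheme behind a single interface. A sketch, with header names taken from the table above; the Slack timestamp check is elided and all of this should be treated as illustrative rather than a drop-in implementation.

```typescript
import crypto from "node:crypto";

type Verifier = (
  secret: string,
  rawBody: string,
  headers: Record<string, string>,
) => boolean;

const hmacHex = (secret: string, data: string) =>
  crypto.createHmac("sha256", secret).update(data).digest("hex");

const safeEq = (a: string, b: string) =>
  a.length === b.length &&
  crypto.timingSafeEqual(Buffer.from(a), Buffer.from(b));

// One entry per provider; the rest of the pipeline never branches on
// provider-specific headers again.
export const verifiers: Record<string, Verifier> = {
  linear: (secret, body, h) =>
    safeEq(h["x-linear-signature"] ?? "", hmacHex(secret, body)),
  github: (secret, body, h) =>
    safeEq(h["x-hub-signature-256"] ?? "", "sha256=" + hmacHex(secret, body)),
  notion: () => true, // no signing: rely on an IP allowlist upstream
};

export function verify(
  provider: string,
  secret: string,
  body: string,
  headers: Record<string, string>,
): boolean {
  const v = verifiers[provider];
  return v ? v(secret, body, headers) : false; // unknown provider: reject
}
```

The registry doesn't make the four implementations go away; it just gives them one seam, so the fifth provider is a new entry rather than a new copy-paste.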
In practice, the path from "we want our agent to react to changes across four providers" to "we have production-ready change detection against four providers" is six to eight weeks of one engineer's time. The engineer is not building the agent during those weeks. They are building plumbing.
## The providers that don't even give you a webhook
All of the above assumes the provider actually offers webhooks. Many don't, or their webhooks are so limited they might as well not exist.
Google Workspace has push notifications through the Drive API's push channel, but the setup is heavier than a simple webhook: you need a verified domain, channel registration, periodic renewal (channels expire), and the notifications only tell you that a file changed, not what changed. You still need a follow-up API call to get the diff. Calendar changes work through a similar channel system. The infrastructure exists, but it's closer to building a subscription service than wiring up an endpoint.
Salesforce has Streaming API and Change Data Capture, but both require persistent connections (CometD or gRPC) rather than simple HTTP callbacks. You're not wiring up an endpoint — you're maintaining a long-lived subscriber process that reconnects on failure and manages its own replay cursor.
HubSpot has webhooks, but only for CRM objects and only if you're building a public app registered through their developer portal. Private integrations, the kind most agent builders start with, don't get webhooks at all. You poll.
For these providers, the fallback is always the same: scheduled syncs.
*When there's no webhook — the sync fallback.*
## The scheduled sync tax
A scheduled sync means running a job on a cron (every minute, every five minutes, every hour) that calls the provider's API, compares the current state to the last known state, and surfaces what changed. You trade one set of problems for another.
**Cursor management.** Every sync needs a checkpoint: a timestamp, a page token, a cursor ID. You fetch records newer than your checkpoint, process them, then advance the checkpoint. If the process crashes between processing and advancing, you'll re-process records on the next run. If it advances before processing completes, you'll miss records. This is the same exactly-once delivery problem that webhooks have, except now you're the one implementing it from scratch.
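The process-then-advance ordering is the whole trick, and it fits in a few lines. A sketch with the fetch and processing injected; `Item`, `fetchSince`, and the timestamp-based cursor are all hypothetical simplifications.

```typescript
interface Item {
  id: string;
  updatedAt: number; // epoch ms; real providers may use opaque cursors
}

// Process every record in the page *before* advancing the checkpoint.
// A crash mid-page re-processes the page on the next run (at-least-once,
// so downstream work must be idempotent); advancing first would silently
// drop records instead.
export function syncOnce(
  checkpoint: number,
  fetchSince: (since: number) => Item[],
  process: (item: Item) => void,
): number {
  const page = fetchSince(checkpoint);
  for (const item of page) process(item);
  return page.length
    ? Math.max(...page.map((i) => i.updatedAt)) // new checkpoint
    : checkpoint; // nothing changed: keep the old cursor
}
```

Note the choice baked in here: you get at-least-once, never exactly-once, which is why the dedup layer from the webhook path shows up again on the polling path.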
**Rate limits.** Polling APIs hit rate limits fast. The Notion API allows three requests per second. The HubSpot search API has a 4-per-second-per-app limit. If your sync touches multiple object types or multiple workspaces, you need a rate limiter with per-provider, per-account bucketing. Miss this and your agent goes blind for minutes or hours while the rate limit window resets.
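The per-provider, per-account bucketing described here is a token bucket keyed by both values. A minimal sketch; the clock is injected so behavior is deterministic, and in a multi-process deployment the buckets would live in Redis rather than process memory.

```typescript
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private ratePerSec: number, // refill rate, e.g. 3 for Notion
    private burst: number,      // max tokens held at once
    now: number,
  ) {
    this.tokens = burst;
    this.last = now;
  }

  tryTake(now: number): boolean {
    // Refill proportionally to elapsed time, capped at burst.
    this.tokens = Math.min(
      this.burst,
      this.tokens + ((now - this.last) / 1000) * this.ratePerSec,
    );
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const buckets = new Map<string, TokenBucket>();

// One bucket per provider+account pair, created lazily.
export function allow(
  provider: string,
  account: string,
  ratePerSec: number,
  now: number = Date.now(),
): boolean {
  const key = `${provider}:${account}`;
  let b = buckets.get(key);
  if (!b) {
    b = new TokenBucket(ratePerSec, ratePerSec, now);
    buckets.set(key, b);
  }
  return b.tryTake(now);
}
```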
**Change detection.** Not every API gives you a clean "modified since" filter. Some return a lastModifiedDate you can filter on. Some return results sorted by creation date only, so you have to fetch everything and diff locally. Google Drive gives you a changes endpoint with a startPageToken, which is a proper changelog — but most providers don't have this concept at all.
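When a provider does offer a changelog cursor, consuming it is simple. A sketch against the Drive v3 changes endpoint with the HTTP call injected so it's testable; the URL and field names follow the Drive API, but error handling, auth refresh, and pagination within a poll are omitted.

```typescript
// Poll Drive's changes endpoint once. `fetchJson` is an injected,
// hypothetical helper that performs an authenticated GET and parses JSON.
export async function pollDriveChanges(
  pageToken: string,
  fetchJson: (url: string) => Promise<any>,
): Promise<{ changes: unknown[]; nextToken: string }> {
  const body = await fetchJson(
    `https://www.googleapis.com/drive/v3/changes?pageToken=${encodeURIComponent(pageToken)}`,
  );
  return {
    // Each change names a fileId; you still call the API for the file itself.
    changes: body.changes ?? [],
    // newStartPageToken appears on the last page; nextPageToken mid-pagination.
    nextToken: body.newStartPageToken ?? body.nextPageToken ?? pageToken,
  };
}
```

Even with a real changelog, notice that the payload-completion problem from the webhook list comes back: the changes feed tells you *that* a file changed, and a follow-up call tells you *what*.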
**Latency tradeoff.** The faster you poll, the closer you get to real-time — and the faster you burn through rate limits. Most teams settle on five-minute intervals as a compromise. That means your proactive agent can be up to five minutes late on every reaction. For a deploy notification, five minutes is fine. For a security alert, five minutes is a long time.
**Infrastructure.** A scheduled sync needs: a cron runner (or a serverless timer), the sync logic per provider, a database for checkpoints and last-known state, a diffing layer to detect changes, and the same queue-and-worker setup you'd need for webhooks, because you still can't process synchronously if the LLM call takes ten seconds.
## Why this matters for proactive agents specifically
A reactive agent doesn't need any of this. It has a simple lifecycle: receive prompt, think, respond. The user is the trigger and the context provider.
A proactive agent has a fundamentally different requirement: it needs to notice things on its own. A ticket was moved. A deploy failed. A document was updated. A customer churned. The agent needs to observe these changes, decide they matter, and act — without anyone prompting it.
That means a proactive agent's first dependency is a reliable stream of change events from the external systems it cares about. Not a one-time API call. Not a manual trigger. A continuous, durable, normalized stream of "this thing changed, here's what it was before, here's what it is now."
Building that stream is the webhook tax. And because every provider implements change notification differently — or doesn't implement it at all — the tax scales linearly with every system your agent needs to watch.
## The hidden ongoing cost
The initial build is just where the maintenance starts.
- Providers change their schemas. They give you ninety days notice if you're lucky. You will, at some point, ship a quiet break.
- Your dev tunnel URL changes. You re-register. You forget to re-register one of the four. You discover it three days later.
- A provider has an incident and replays twelve hours of webhooks at once. Your queue depth spikes before your autoscaling catches up.
- A teammate adds a fifth provider. Two of the ten items are subtly wrong because they were copy-pasted.
- Your data model picks up a new schema. Old serialized payloads in the dead-letter queue are now unparseable.
None of this is bad engineering. It's just what the work actually looks like, and it doesn't stop.
## What does this actually cost to run?
It's worth putting numbers on the ongoing cost, since most discussions focus on the build and ignore what comes after.
A polling-based agent watching four providers every five minutes makes 4,608 API calls per day. At typical rate limits, that's fine for one agent. At ten agents across three workspaces, you're at 138,240 calls per day, and you're bumping into per-account rate limits on Notion (3 req/sec) and HubSpot (4 req/sec). Most of those calls return "nothing changed." You're paying for compute and API overhead to learn, repeatedly, that the world is still the same.
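For transparency, here is arithmetic that reproduces those figures. The per-provider breakdown isn't stated above, so the "four object types per provider, one call each" factor is an assumption chosen to make the numbers work out, not a fact from the text.

```typescript
// Assumption: each provider sync touches 4 object types, one call each.
const providers = 4;
const objectTypesPerProvider = 4; // assumed, not stated in the article
const pollsPerDay = (24 * 60) / 5; // every five minutes → 288 polls

const callsPerDay = providers * objectTypesPerProvider * pollsPerDay; // 4,608

// Ten agents across three workspaces → 30 agent-workspace combinations.
const agents = 10;
const workspaces = 3;
const fleetCallsPerDay = callsPerDay * agents * workspaces; // 138,240
```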
An event-driven agent only wakes when something moves. Our weekly-digest agent, which watches four sources and clusters mentions into a GitHub issue, costs effectively nothing to run: ~8 Brave Search queries per week (free tier), one Gemini Flash call for clustering (fractions of a cent), and two GitHub API calls. Total monthly cost: under a dollar. The cron trigger is a single HTTP request from the scheduler.
The cost difference widens as you add providers. Each polled provider multiplies your API call volume linearly. Each watched provider adds a webhook endpoint that sits idle until something happens. The infrastructure cost (queue, worker, dedup store) is roughly the same either way. The API and compute cost diverges fast.
| Cost driver | Polling (4 providers, 5-min interval) | Event-driven (4 providers) |
|---|---|---|
| API calls/day | ~4,608 | Proportional to actual changes |
| Compute | Cron runner + workers, always active | Workers wake on events only |
| LLM inference | Per poll cycle (even when nothing changed) | Per event (only when something happened) |
| Storage | Checkpoint DB + last-known state per entity | Dedup store + event log |
| Rate limit pressure | Constant, scales with agent count | Bursty, scales with change volume |
The honest caveat: event-driven architectures trade steady-state cost for burst cost. A bulk import that fires 500 webhooks in a minute will spike your compute and LLM spend in ways that a five-minute poll never would. You need backpressure and spend guardrails for those moments. But the baseline cost of "watch four providers and act when something changes" is dramatically lower when you're not polling.
## The listener as a primitive
This is why we think of change detection as a primitive — something that belongs underneath the agent, not inside it. The same way you wouldn't ask every agent builder to implement their own TCP stack, you shouldn't ask them to implement their own webhook verification, payload normalization, cursor management, and deduplication per provider.
A proactive agent needs a listener: a single interface that says "something changed in a system you care about" with enough context for the agent to act on it. Whether that change came from a webhook, a streaming API, a polling sync, or a Pub/Sub subscription is an implementation detail the agent shouldn't have to know.
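One possible shape for that interface, plus a toy in-memory implementation to make it concrete. This is a sketch of the idea, not an existing API: the agent only ever calls `subscribe`, and webhooks, polling syncs, and streaming subscribers all feed `emit` behind the scenes.

```typescript
export interface ChangeEvent<T = unknown> {
  provider: string;   // "linear", "github", "slack", ...
  entityType: string; // "issue", "pull_request", "message", ...
  entityId: string;
  before: T | null;   // null when the entity is new
  after: T;
  observedAt: number; // epoch ms
}

type Filter = { provider?: string; entityType?: string };
type Handler = (e: ChangeEvent) => void;

export class InMemoryListener {
  private subs: Array<{ filter: Filter; handler: Handler }> = [];

  // Returns an unsubscribe function.
  subscribe(filter: Filter, handler: Handler): () => void {
    const sub = { filter, handler };
    this.subs.push(sub);
    return () => {
      this.subs = this.subs.filter((s) => s !== sub);
    };
  }

  // Called by the ingestion side (webhook worker, sync job, stream consumer).
  emit(e: ChangeEvent): void {
    for (const { filter, handler } of this.subs) {
      if (filter.provider && filter.provider !== e.provider) continue;
      if (filter.entityType && filter.entityType !== e.entityType) continue;
      handler(e);
    }
  }
}
```

The point of the `before`/`after` pair is that the agent gets the normalized "here's what it was, here's what it is now" stream directly, without knowing whether the diff came from a full payload or a follow-up API call.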
The three primitives exist because each one represents a class of infrastructure that is hard to build, undifferentiated, and required for an agent to be proactive.
Anyway, that's what it looks like when you build the listener from scratch, one provider at a time. It's a ton of work, and most of it has nothing to do with the agent itself.
Posted May 10, 2026 · AgentWorkforce
Issues, PRs, and arguments welcome on GitHub. Or email [email protected].