Gemini Spark: what Google shipped at I/O

Two days ago I wrote about Remy, the leaked Google agent that looked like the most ambitious proactive assistant anyone had announced. Then I/O happened, and Google filled in most of the blanks. Remy is now Gemini Spark. The architecture is real, the scope is even broader than the leaks suggested, and one specific technical choice surprised me enough that I want to talk about it first.

Figure: The runtime stack powering Gemini Spark.

The stack underneath

The rebranding tells you something. "Remy" was an internal codename, reportedly a Ratatouille reference that felt too informal for a consumer product. "Gemini Spark" aligns with Google's existing spark icon across the Gemini product line. Short, optimistic, and built to last as a brand for years. The name changed between internal beta versions 17.22 and 17.23, with no capability changes underneath.

Under the hood, Spark runs on dedicated Google Cloud virtual machines, powered by Gemini 3.5 and orchestrated by something Google calls the "Antigravity" harness. The model is a dedicated variant optimized for extended tool use and multi-step reasoning. Internal beta code references it as "Spark Robin." This is a purpose-built runtime designed to keep tasks alive for hours or days, not a general-purpose chatbot with agent features bolted on.

Gemini 3.5 Flash, the backbone model, also shipped at I/O. Google claims it's four times faster than other frontier models at roughly half the cost. Whether or not those benchmarks hold under real workloads, the speed matters for a 24/7 agent: every millisecond of inference latency multiplies across a day of continuous background operations.
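To make the latency point concrete, here is a back-of-the-envelope sketch. The call volume and baseline latency are made-up numbers for illustration, not figures from Google; only the "four times faster" ratio comes from the claim above, and it is taken at face value.

```python
def daily_inference_seconds(calls_per_day: int, latency_ms: float) -> float:
    """Total seconds per day an agent spends waiting on model inference."""
    return calls_per_day * latency_ms / 1000.0


# Hypothetical workload (assumptions, not published numbers).
CALLS_PER_DAY = 20_000        # background model calls per user per day
BASELINE_LATENCY_MS = 800.0   # per-call latency of a slower model
SPEEDUP = 4.0                 # Google's "four times faster" claim, unverified

baseline = daily_inference_seconds(CALLS_PER_DAY, BASELINE_LATENCY_MS)
faster = daily_inference_seconds(CALLS_PER_DAY, BASELINE_LATENCY_MS / SPEEDUP)

print(f"baseline:  {baseline / 3600:.2f} hours of inference per day")
print(f"4x faster: {faster / 3600:.2f} hours of inference per day")
print(f"saved:     {(baseline - faster) / 3600:.2f} hours per day")
```

Under these assumed numbers, the slower model spends about 4.4 hours a day on inference and the faster one about 1.1. The absolute values are invented; the point is that the gap scales linearly with call volume.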

Availability is tighter than the leaked docs implied. Trusted testers got access this week. Beta opens next week for Google AI Ultra subscribers in the US only. No global rollout timeline yet.

Figure: Google services connect natively; everything else goes through MCP.

The MCP surprise

In the Remy article, I flagged one specific question: how would Google handle external providers like GitHub, WhatsApp, and Spotify? Would they build bespoke adapters for each one, or find some standardized path?

Google chose MCP.

Model Context Protocol is Anthropic's open standard for connecting AI models to external tools and data sources. Google adopting it for Spark is significant. The biggest consumer AI platform on earth just validated a protocol designed by a direct competitor. Beta code from the Gemini app already includes "MCP Tool Testing" category entries.

For Google's own services (Gmail, Calendar, Drive, Photos, Maps, Chrome), Spark connects through internal APIs. Those connections are native: shared auth, shared infrastructure, no webhook overhead. This is the ecosystem advantage I described in the Remy piece, and it's exactly as strong as expected.

For everything else, MCP provides the bridge. This is where it gets interesting for anyone building proactive agents. MCP means third-party developers can write a single connector that works across Spark, Claude, and any other MCP-compatible system. The webhook tax we wrote about (eight weeks per provider, signature verification, queue infrastructure, payload normalization) doesn't disappear, but it concentrates in the MCP server implementation instead of spreading across every client.
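For a sense of what "a single connector" means in practice, here is a minimal sketch of the tool-facing half of an MCP server. MCP messages are JSON-RPC 2.0, and the `tools/list` and `tools/call` methods are part of the protocol. Everything else here is an assumption for illustration: the `count_open_issues` tool, its schema, and the stubbed handler are hypothetical, and a real server also needs the initialization handshake and a transport, which the official SDKs handle for you.

```python
import json
from typing import Any


def count_open_issues(arguments: dict[str, Any]) -> str:
    """Hypothetical tool handler; a real one would call the provider's API."""
    return f"(stub) open issues for {arguments['repo']}: not looked up"


# Hypothetical tool registry. Name, description, and schema are made up.
TOOLS: dict[str, dict[str, Any]] = {
    "count_open_issues": {
        "description": "Report the number of open issues in a repository.",
        "inputSchema": {
            "type": "object",
            "properties": {"repo": {"type": "string"}},
            "required": ["repo"],
        },
        "handler": count_open_issues,
    }
}


def _reply(request_id: Any, *, result: Any = None, error: Any = None) -> str:
    """Build a JSON-RPC 2.0 response carrying either a result or an error."""
    message: dict[str, Any] = {"jsonrpc": "2.0", "id": request_id}
    if error is not None:
        message["error"] = error
    else:
        message["result"] = result
    return json.dumps(message)


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC request to the matching MCP tool method."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}

    if method == "tools/list":
        # Advertise each tool's public metadata, never the handler itself.
        tools = [
            {
                "name": name,
                "description": spec["description"],
                "inputSchema": spec["inputSchema"],
            }
            for name, spec in TOOLS.items()
        ]
        return _reply(request_id, result={"tools": tools})

    if method == "tools/call":
        spec = TOOLS.get(params.get("name"))
        if spec is None:
            return _reply(request_id, error={"code": -32602, "message": "Unknown tool"})
        text = spec["handler"](params.get("arguments") or {})
        content = [{"type": "text", "text": text}]
        return _reply(request_id, result={"content": content, "isError": False})

    return _reply(request_id, error={"code": -32601, "message": "Method not found"})


print(handle_request(json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})))
```

Any MCP-compatible client can discover and call the tool through the same two methods, which is why the provider-specific work (auth, signature verification, payload normalization) only has to be done once, inside the handler.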

Figure: Spark appears across five distinct surfaces.

Surfaces everywhere

The leaked Remy docs described an agent inside the Gemini app. Spark goes wider. Google announced five distinct surfaces where the agent can show up:

The Gemini app remains the primary interface, but Spark is also reachable through email and chat. You can assign it work without opening a dedicated app. Android Halo is a new UI layer that shows agent progress at the OS level, letting you see what Spark is working on without switching contexts. And the Chrome agentic browser, coming this summer, lets Spark navigate the web directly: filling forms, reading pages, interpreting screenshots and PDFs.

Alongside Spark itself, Google announced two related features. Daily Brief is a synthesized digest pulling from inbox, calendar, and tasks. Information Agents in Search monitor the web 24/7 for topics you specify and deliver results proactively (available to AI Pro and Ultra subscribers this summer). These aren't inside Spark exactly, but they share the same architectural DNA of background processing and triggered delivery.

Google Flow handles the planning layer: complex multi-step tasks with creative collaboration. It is the orchestration surface for work that's too involved for a single prompt.

Run all of this through the three-primitives framework (Clock, Listener, Inbox) and Spark now scores higher than any other proactive agent on the market:

	Pulse	Orbit	Spark
Clock	Overnight batch	Timezone-aware schedule	24/7 continuous on dedicated VMs
Listener	Gmail + Calendar (snapshot)	Gmail, Slack, GitHub, Figma, Calendar, Drive	Google services native + third-party via MCP
Inbox	Cards in ChatGPT	Claude web + mobile + Orbit apps	Gemini app, email, chat, Android Halo, Chrome
Can act	No	Unclear	Yes (purchases, messages, docs, web navigation)
Model	GPT-4o	Claude	Gemini 3.5 (Spark Robin variant)
Audience	ChatGPT Pro	Claude users	AI Ultra subscribers (US first)

What we got right, what we missed

The pre-I/O analysis predicted three things to watch. Here's how they landed.

The approval model: still vague. Google announced five core capabilities for Spark (inbox management, meeting briefings, information digests, web automation, custom skills), and each implies a different risk level. Google's own documentation warns that Spark can "autonomously handle purchases and share information without asking." But the granular policies I was hoping to see (auto-approve calendar changes, always confirm purchases over $50) weren't part of the keynote. The tiered system is presumably still there under the surface. We just haven't seen it demonstrated.
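For illustration, the kind of granular policy I was hoping to see could be as small as the sketch below. This is not Google's design: the action kinds, the `decide` function, and the default-to-ask rule are my own assumptions, built only from the two example rules above (auto-approve calendar changes, always confirm purchases over $50).

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    ASK_USER = "ask_user"


@dataclass(frozen=True)
class Action:
    """A hypothetical description of something the agent wants to do."""
    kind: str                # e.g. "calendar_change" or "purchase"
    amount_usd: float = 0.0  # only meaningful for purchases


# Threshold taken from the example rule in the text; everything else is assumed.
PURCHASE_CONFIRM_THRESHOLD_USD = 50.0


def decide(action: Action) -> Decision:
    """Map an action to an approval tier, asking the user when in doubt."""
    if action.kind == "calendar_change":
        return Decision.AUTO_APPROVE
    if action.kind == "purchase":
        if action.amount_usd > PURCHASE_CONFIRM_THRESHOLD_USD:
            return Decision.ASK_USER
        # Assumption: purchases at or under the threshold go through.
        return Decision.AUTO_APPROVE
    # Unknown action kinds fall back to the safest tier.
    return Decision.ASK_USER


print(decide(Action("calendar_change")))
print(decide(Action("purchase", amount_usd=120.0)))
print(decide(Action("send_message")))
```

The interesting design question is the last line of `decide`: whether an agent that meets an action it has no rule for should ask or proceed. The sketch asks, which is the conservative choice.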

External provider depth: MCP answers the "how" but not yet the "how deep." Spark inherits Gmail, Calendar, Drive, Photos, Maps, Chrome, and Android by default. Third-party MCP integrations launch this summer. Whether Spark can watch a GitHub repo for new issues and draft a response, or just read your notifications, depends on how ambitious those MCP servers turn out to be. The protocol is capable of rich bidirectional interaction. The question is whether anyone builds servers at that depth in the first months.

State and memory: Spark stores data from remote browser sessions, including login information, in a dedicated settings panel. That suggests durable state beyond session boundaries. The custom skills feature, which lets you write reusable instructions for weekly tasks via markdown-style commands, also implies persistent configuration. But the keynote said nothing about learning from corrections over time. Does Spark remember that you overrode a purchase recommendation last Tuesday? Google hasn't said.

The biggest thing that the pre-I/O leaks got wrong was scope. I wrote about Remy as a single agent inside the Gemini app. Spark turned out to be a family of agent capabilities spread across five surfaces, with Daily Brief, Information Agents, and Flow all drawing from the same runtime infrastructure. Google isn't shipping one proactive agent. They're shipping a proactive layer across their entire product stack.


Posted May 20, 2026 · AgentWorkforce
