Inside Google I/O 2026: Gemini Spark and the Rise of Autonomous AI Agents

Google made a huge announcement at I/O 2026, and it’s clear this isn’t just another model release. The focus has shifted from models that answer questions to agents that get work done on your behalf.

The numbers are staggering. Two years ago, Google handled about 9.7 trillion tokens each month. Last year at I/O, that number jumped to 480 trillion. Now, it’s 3.2 quadrillion tokens per month, a seven-fold increase in just one year. This isn’t just hype; it’s real usage from people solving real problems.
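To put those figures side by side, here is a quick back-of-the-envelope check in Python. It uses only the three monthly volumes quoted above; the variable names are just for illustration, and the year labels assume each figure was reported at that year's I/O.

```python
# Monthly token volumes quoted in the keynote (tokens per month).
tokens_2024 = 9.7e12   # "two years ago"
tokens_2025 = 480e12   # last year's I/O
tokens_2026 = 3.2e15   # this year's I/O (quoted as "over 3.2 quadrillion")

# Year-over-year and two-year growth factors.
growth_2024_to_2025 = tokens_2025 / tokens_2024
growth_2025_to_2026 = tokens_2026 / tokens_2025
growth_two_years = tokens_2026 / tokens_2024

print(f"2024 -> 2025: {growth_2024_to_2025:.1f}x")
print(f"2025 -> 2026: {growth_2025_to_2026:.1f}x")
print(f"2024 -> 2026: {growth_two_years:.0f}x")
```

The latest jump works out to about 6.7x, which rounds to the "seven-fold" figure because the 3.2 quadrillion number is a floor. The earlier jump was far steeper at nearly 50x, so growth is slowing in relative terms even as the absolute volume explodes.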

The Gemini app’s monthly active users grew from 400 million last year to over 900 million now, more than doubling in a year. Daily requests also increased sevenfold. AI Overviews in Search now have 2.5 billion monthly users, and the new AI Mode in Search reached 1 billion users in its first year.

This is no longer just a tool for developers. It’s gone mainstream.

Gemini 3.5 Flash: AI agents with speed and intelligence

Starting with the models, Google introduced Gemini 3.5, with Gemini 3.5 Flash as the first release. What stands out is that Flash is no longer just a budget option; it’s performing far better than expected.

On the Terminal Bench 2.1 coding benchmark, Flash scores 76.2% versus Gemini 3.1 Pro’s 70.3%. It reaches an Elo score of 1,656 on GDPval-AA compared to 3.1 Pro’s 1,314. On MCP Atlas, it scores 83.6% versus 78.2%. And on the CharXiv reasoning benchmark, it’s at 84.2%.

What’s most impressive is that it competes with, and sometimes outperforms, GPT 5.5 and Claude Opus 4.7, flagship models from OpenAI and Anthropic. Flash achieves this at four times the output speed. Artificial Analysis reports it processes nearly 280 tokens per second, compared to about 60-70 for GPT 5.5 and Opus 4.7.

You’re getting frontier-level intelligence at speeds that used to be reserved for much smaller models.
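What does that speed gap feel like in practice? The sketch below is a rough illustration, not a benchmark. The 2,000-token response length is a hypothetical value, the 65 tokens per second is simply the midpoint of the quoted 60-70 range, and the calculation ignores time-to-first-token and network latency.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response at a steady output rate (latency ignored)."""
    return output_tokens / tokens_per_second

response_tokens = 2_000  # hypothetical response length, for illustration only
flash_tps = 280          # output speed quoted for Gemini 3.5 Flash
rival_tps = 65           # assumed midpoint of the quoted 60-70 range

flash_time = generation_seconds(response_tokens, flash_tps)
rival_time = generation_seconds(response_tokens, rival_tps)

print(f"Flash: {flash_time:.1f} s")
print(f"Rival: {rival_time:.1f} s")
print(f"Speedup: {rival_time / flash_time:.1f}x")
```

Under those assumptions, a long answer arrives in about 7 seconds instead of about 31. For an agent chaining dozens of model calls, that difference compounds quickly.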

The economics are wild

The pricing? That’s where this gets really interesting.

During the keynote, Sundar Pichai mentioned that Flash delivers “cutting-edge performance in agentic AI” at less than half the price of comparable frontier models, sometimes nearly a third.

Pichai shared that, “Two years ago, we were processing 9.7 trillion tokens a month across our surfaces, a huge number. Last year at I/O, that grew to roughly 480 trillion tokens. Fast forward to today, that number jumped 7x to over 3.2 quadrillion per month.”

The image depicts a line chart titled "Monthly Tokens Processed Across our surfaces," showing substantial growth in tokens processed from 9.7T in May '24 to 3.2Q+ in May '26, with a note indicating "7x Y/Y growth."
Image Source: I/O 2026: Welcome to the agentic Gemini era

Pichai further added, “It tells an important story about our products and how others are building as well, especially developers and enterprises: Over the past 12 months, over 375 Google Cloud customers each processed more than one trillion tokens, representing incredible demand for AI from across industries. Over 8.5 million developers are now building new apps and experiences with our models monthly. Our model APIs are now processing roughly 19 billion tokens per minute.”

He explained that if leading companies processing a trillion tokens daily moved 80% of their workloads to 3.5 Flash, they could save over a billion dollars each year. That’s a significant amount for CFOs managing growing AI budgets.
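It’s worth working out what that claim implies per token. The calculation below is a sketch under stated assumptions: that the trillion tokens a day is per company, that usage is steady across a 365-day year, and that "over a billion dollars" is treated as exactly $1 billion, so the result is a lower bound. No actual list prices are used, since the figures above don’t include them.

```python
# Figures from the keynote claim.
tokens_per_day = 1e12        # a company processing a trillion tokens daily
share_moved = 0.80           # 80% of workloads moved to 3.5 Flash
annual_savings_usd = 1e9     # "over a billion dollars", taken as a floor

# Assumption: steady usage over a 365-day year.
days_per_year = 365

tokens_moved_per_year = tokens_per_day * share_moved * days_per_year
millions_of_tokens_moved = tokens_moved_per_year / 1e6

# Average price gap per million tokens needed to reach the quoted savings.
implied_saving_per_million = annual_savings_usd / millions_of_tokens_moved

print(f"Tokens moved per year: {tokens_moved_per_year:.3g}")
print(f"Implied saving: ${implied_saving_per_million:.2f} per million tokens")
```

In other words, the claim holds if Flash is, on average, at least roughly $3.40 cheaper per million tokens than the models it replaces. That is a useful number to compare against your own blended rate.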

Gemini 3.5 Pro is launching next month. Google’s already using it internally, and they’re calling the improvements substantial.

Gemini Omni: The “world model” that understands physics

Then we get to Gemini Omni. This is where things move from impressive to genuinely next-level.

Google’s calling this a “world model.” Demis Hassabis from DeepMind described it as “a pivotal step toward artificial general intelligence.” Strong words. But after the keynote demos closed, Hassabis told reporters something even more direct: “We’re at the foothills of the singularity.”

The first model in this family is Gemini Omni Flash, and it’s fundamentally different from typical text-to-video generators. Unlike most video tools that just stitch things together frame by frame, Omni is truly multimodal in both input and output. You can feed it text, audio, images, and video all at once, and it generates content that actually makes sense, physically and scientifically.

They showed a protein folding example that legitimately impressed me: a smooth stop-motion sequence of amino acid chains twisting into alpha helices and beta sheets, with properly synced voiceover narration. No Frankenstein editing. Just coherent scientific visualization.

Omni stands out because it is trained on all four data types simultaneously, enabling it to understand their relationships.

The editing features are also impressive. You can make changes step by step using natural language, with each instruction building on the previous one. Characters remain consistent, the physics stay accurate, and the scene keeps track of earlier actions.

They showed someone turning a sculpture into bubbles, making a mirror ripple like liquid when touched, and creating a rapid-fire alphabet video where each letter is represented by an unusual object, all with proper lower thirds and smooth music. The level of control is honestly impressive.

Gemini Omni Flash is rolling out today to Google AI Plus, Pro, and Ultra subscribers in the Gemini app and Google Flow. It’s also coming to YouTube Shorts and the YouTube Create app at no cost later this week. Developers get API access in the coming weeks. Google plans to expand it to support image and audio outputs down the line as well.

AI Agents, Deepfake safeguards, and SynthID

One thing they’re being careful about: deepfakes and misuse. All videos created with Omni include Google’s SynthID watermark, which is imperceptible but verifiable through the Gemini app, Gemini in Chrome, and Google Search.

They are also cautious with voice cloning. At first, you can only create videos using your own voice through their Avatars feature for editing existing videos. They say they’re still testing how to expand this capability responsibly.

Speaking of SynthID, Google announced major partnerships there. SynthID has now watermarked over 100 billion images and videos, along with 60,000 years of audio assets. They’re expanding content credentials verification to Search and Chrome. And they got OpenAI, Kakao, and ElevenLabs to adopt SynthID as well. NVIDIA signed on last year.

This is becoming a real cross-industry standard for AI transparency.

TPU 8: The infrastructure behind everything

Next is infrastructure, an area where Google is showing its strength.

They announced their eighth-generation TPUs, and for the first time, they’re taking a dual-chip approach with specialized architectures. There’s TPU 8T optimized for training and TPU 8 optimized for inference.

TPU 8T offers nearly three times the computing power of the previous generation. What’s remarkable is how they handle training now. Using JAX and Pathways, training is no longer limited to one large data center. They can distribute training across multiple locations, scaling to over a million TPUs worldwide.

This gives them the ability to create what they’re calling the largest training cluster in the world, which means training larger, more capable models in weeks rather than months.

TPU 8 focuses on speed. Google has greatly reduced latency at every stage, drawing on 27 years of experience with Search, where latency is crucial. Both chips are also more energy-efficient, offering up to twice the performance per watt.

Google’s infrastructure investment is enormous. In 2022, they spent $31 billion a year on capital expenditures. This year, they expect to spend between $180 and $190 billion, which is about six times more than just a few years ago.

AI Agents are everywhere now

Google I/O wasn’t just about Google, though. It was about a much bigger shift happening across AI. Every major lab is racing toward the same thing: AI agents that don’t just respond, they actually help you get work done.

Google offers Gemini and Spark, while OpenAI has its own agent tools. Anthropic provides Claude, Artifacts, Connectors, and practical workflows available today. Industry experts are calling 2026 the year when agentic workflows move from demos to everyday use.

Andy Markus, AT&T’s chief data officer, told TechCrunch: “Fine-tuned SLMs will be the big trend and become a staple used by mature AI enterprises in 2026, as the cost and performance advantages will drive usage over out-of-the-box LLMs.”

That’s the context for what Google’s building.

Antigravity 2.0: More than just coding

Antigravity 2.0 is evolving from a simple coding environment into a complete platform for developing and managing autonomous AI agents.

A new standalone desktop app now serves as a central hub for interacting with agents and managing various tasks. Google has also developed a more optimized version of Flash for Antigravity, which is not just four times faster but actually twelve times faster than other leading models.

The amount of tokens Google is processing internally through their AI developer tools is pretty telling, too. In March, they were processing half a trillion tokens a day. Now they’re doing more than three trillion tokens a day, and they’ve been doubling every few weeks. That internal usage is creating this powerful feedback loop that’s helping them improve the models.
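How fast is "doubling every few weeks"? The sketch below checks the quoted figures against each other. The eight-week gap between the March figure and the keynote is an assumption made for illustration, since the exact dates aren’t given, and "more than three trillion" is treated as exactly three trillion.

```python
import math

# Internal daily token volumes quoted above.
march_tokens_per_day = 0.5e12    # "half a trillion tokens a day"
current_tokens_per_day = 3e12    # "more than three trillion", taken as a floor

growth = current_tokens_per_day / march_tokens_per_day
doublings = math.log2(growth)    # how many times the volume doubled

# Assumption: roughly eight weeks between the two measurements.
weeks_elapsed = 8
doubling_period_weeks = weeks_elapsed / doublings

print(f"Growth: {growth:.0f}x, or {doublings:.2f} doublings")
print(f"Implied doubling period: {doubling_period_weeks:.1f} weeks")
```

A 6x increase is a little over two and a half doublings. If the gap was about eight weeks, that works out to a doubling roughly every three weeks, which fits the "every few weeks" description.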

Google AI Studio is also getting major upgrades. It now includes native Kotlin support for Android app development, Google Workspace integrations, one-click deployment to Cloud Run, and support for Firebase services. You can build and launch full-stack apps directly within AI Studio. And if you want to keep building, you can seamlessly export your complete project to Antigravity.

They are also adding managed agents to the Gemini API, making infrastructure setup much easier. With one API call, you get a fully provisioned agent and a remote sandbox. For those who want more control, the new Antigravity SDK allows you to customize the agent and deploy it on your own systems.

Android gets smarter

For Android developers specifically, there’s a lot of new stuff.

The stable Android CLI allows AI agents to connect directly with Android Studio for tasks such as downloading the Android SDK and running apps on devices. Google has also open-sourced Android Skills to help language models follow best practices for complex workflows, like migrating to Jetpack Compose.

There’s also Android Bench, an LLM leaderboard for Android development tasks that now includes open-weight models like G. They also previewed a migration agent in Android Studio that can convert your app code into a native Kotlin Android app, regardless of your source being React Native, a web framework, or even iOS. The agent reviews your code and handles the complex work, reducing migrations from weeks to just hours.

Web development gets agent-first

On the web development side, Google’s proposing WebMCP, an open web standard that allows developers to expose structured tools such as JavaScript functions and HTML forms so browser-based AI agents can execute complex tasks faster, more reliably, and with greater precision.

The experimental WebMCP origin trial starts in Chrome 149, with Gemini support coming soon.

They’re also launching Modern Web Guidance, which helps you build more performant, accessible, and secure web experiences by providing your coding agents with expert-vetted skills. It supports over 100 use cases and integrates directly with Baseline. You can install it with a single click in Antigravity or via CLI.

Chrome DevTools for Agents is another major update. It brings Chrome DevTools features to AI agents, allowing you to scale your workflow by verifying, debugging, and optimizing code in real time. Your agent can automate quality checks, simulate real user experiences, and transfer sessions automatically, all without manual supervision.

There’s also this new HTML Canvas API available in the origin trial. It lets developers build immersive 3D experiences that remain fully searchable, accessible, and interactable by integrating real DOM elements directly into a canvas with WebGL and WebGPU.

Gemini Spark: The 24/7 personal AI agent

But the consumer-facing stuff is probably what most people will care about.

Gemini Spark is Google’s new personal AI agent that runs 24/7 on dedicated virtual machines in Google Cloud. It’s powered by Gemini 3.5 and the Antigravity harness, which allows it to perform long-horizon tasks in the background.

It’ll integrate with Google’s own tools first, and then with over 30 third-party tools through MCP, including Adobe, Dropbox, and Uber. You can work with Spark through the Gemini app, email, or chat.

On Android, there’s a new UI space called Android Halo coming later this year, where you can view live updates and task progress. Later this summer, Spark will operate directly within Chrome, acting as your agentic browser across the web.

Gemini Spark is rolling out to trusted testers this week, and the beta will be available to Google AI Ultra subscribers in the US next week.

It can gather relevant emails and documents to create updates for your boss, manage your calendar, and handle follow-ups. In short, it keeps working even when you are not.

Information agents and search get advanced

They’re also introducing information agents in Search, which are personalized AI agents you can set up to work in the background 24/7 to find what you need at the right moment and help you take action. These are rolling out this summer, starting with Google AI Pro and Ultra subscribers.

Search is also getting agentic coding capabilities powered by Gemini 3.5 Flash and Antigravity. Search will build custom experiences for your individual questions with dynamic layouts and interactive visuals. These generative UI capabilities will be available to everyone in Search this summer, free of charge.

For longer tasks, Search can create persistent custom dashboards or trackers that you can revisit and update, similar to mini apps designed for your needs.

Ask YouTube

There’s a new feature called Ask YouTube that entirely reimagines the experience. You can ask complex questions, and it’ll show you videos that best match your interests. But more importantly, it jumps right to the part of the video most relevant to you.

This is starting to test now and will roll out broadly in the US this summer.

Docs Live

Docs Live is another interesting feature. Instead of typing a detailed prompt, you can simply speak your thoughts and let Gemini handle the rest. You will be able to create and edit documents directly using your voice.

Docs Live is rolling out for subscribers this summer, and powerful voice capabilities will come to Gmail and Keep at the same time.

More features coming to Google products

Ask Maps lets you have more natural conversations with Maps for complex questions.

Daily Brief provides a personalized summary that combines information from your inbox, calendar, and tasks to highlight what is most important and suggest next steps.

Google Flow is getting a new agent that can plan and reason through complex tasks with your inputs. You can also “vibe code” any creative tool right in Flow, like tools for designing video effects, hand-drawn animations, or layering text.

Google Pix is their new AI image creation and editing tool built on the latest Nano Banana model. It treats every element as an individual object rather than a flat static image, so you can create, swap, or perfect specific details to bring your exact vision to life.

Pix is available to trusted testers now and will roll out later this summer to Google AI Pro and Ultra subscribers in Workspace.

Intelligent Eyewear: The future is wearable

Intelligent eyewear is also on the horizon, bringing a futuristic touch.

Audio glasses will launch this fall in partnership with Gentle Monster and Warby Parker. With these, you can ask Gemini about anything you see, get turn-by-turn directions, manage calls and texts hands-free, take photos and videos, receive real-time translations, and access your apps using only your voice.

Display glasses that show information right in your field of view are coming later.

Gemini for Science: Accelerating research

Google also announced Gemini for Science, which brings together AI tools to help accelerate scientific research. It includes new experiments on Labs and science skills to connect agentic platforms like Antigravity to over 30 major life science databases and tools.

AI Agents: Key takeaways for CTOs and tech leaders

If you are a CTO or technology leader following these developments, here are the key points:

1. Cost optimization is significant. Gemini 3.5 Flash’s pricing could fundamentally change your AI expenses. If you are spending millions on API calls to advanced models, it is time to evaluate Flash.

2. Agent orchestration is the next battlefield. Google, OpenAI, and Anthropic are all racing toward the same vision: AI that works autonomously across your stack. The question isn’t if you’ll adopt agents, it’s which platform you’ll bet on.

3. Infrastructure is more important than ever. Google expects to spend between $180 billion and $190 billion on capital expenditures this year. This is not just about scale; it is also about speed, latency, and energy efficiency. Companies with the best infrastructure will train better models, more quickly.

4. Multimodal capabilities are now the standard. Text-only AI is becoming limiting. Omni’s ability to create coherent video with accurate physics is not just a novelty; it shows how AI will understand the world in the future.

5. Trust and transparency will differentiate winners. SynthID adoption across OpenAI, ElevenLabs, and NVIDIA isn’t accidental. As AI-generated content becomes indistinguishable from human content, verifiable provenance becomes a competitive advantage.

For more insights on how AI is transforming enterprise architecture and decision-making, check out CTO Magazine’s analysis on AI-native systems and the shift from apps to AI agents.

The real question is not whether Google will catch up to OpenAI and Anthropic, but whether any of these companies can build agents that people truly trust with their work.

In brief

That sums up Google I/O 2026.

Google is clearly pushing hard into the agentic era, where AI can create, plan, and actually take action across your digital life. Google DeepMind CEO Demis Hassabis wasn’t being hyperbolic when he said we’re at the foothills of something bigger. Earlier this year, he predicted that AGI could arrive by 2030, telling reporters that “2030 is when I expect it to arrive, either plus or minus a year.”

The race isn’t just about who has the best model anymore. It’s about who builds the best agents, ones that integrate seamlessly into how we already work, communicate, and solve problems.

Now we just have to see how well all of this works outside the keynote demos.

Rajashree Goswami is a professional writer with extensive experience in the B2B SaaS industry. Over the years, she has honed her expertise in technical writing and research, blending precision with insightful analysis. With over a decade of hands-on experience, she brings knowledge of the SaaS ecosystem, including cloud infrastructure, cybersecurity, AI and ML integrations, and enterprise software. Her work is often enriched by in-depth interviews with technology leaders and subject matter experts.