
Google I/O 2026 showcased rapid AI scaling, new Gemini models, and a shift toward autonomous agents embedded across products and infrastructure.
Google reported processing over 3.2 quadrillion tokens per month, up from 480 trillion a year earlier and 9.7 trillion two years ago. The Gemini app surpassed 900 million monthly users, more than doubling year over year, while AI-powered search features reached billions of users. This signals a transition from experimental AI to global, everyday infrastructure.
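The growth implied by these token counts can be checked with a few lines of arithmetic. The sketch below uses only the three monthly figures quoted above; the variable and function names are illustrative, not anything Google published.

```python
# Monthly token volumes as quoted in the summary above.
tokens_per_month = {
    "two_years_ago": 9.7e12,   # 9.7 trillion
    "one_year_ago": 480e12,    # 480 trillion
    "now": 3.2e15,             # 3.2 quadrillion
}


def growth_factor(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


latest_yoy = growth_factor(tokens_per_month["now"], tokens_per_month["one_year_ago"])
previous_yoy = growth_factor(tokens_per_month["one_year_ago"], tokens_per_month["two_years_ago"])
two_year = growth_factor(tokens_per_month["now"], tokens_per_month["two_years_ago"])

print(f"Latest year-over-year growth:   {latest_yoy:.1f}x")
print(f"Previous year-over-year growth: {previous_yoy:.1f}x")
print(f"Growth over two years:          {two_year:.0f}x")
```

On these figures the latest year works out to roughly 6.7x, which is the "seven times" quoted in the keynote coverage, while the year before it was closer to 50x.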
The newly introduced Gemini 3.5 Flash outperformed prior flagship models on multiple benchmarks, including 76.2% on Terminal-Bench 2.1 and 1,656 Elo on GDPval-AA. It competes with systems like GPT-5.5 and Claude Opus 4.7, while delivering speeds near 280 tokens per second, roughly four times faster than rivals. Google positioned it as both high-performance and cost-efficient.
Google stated that Flash delivers similar capabilities at less than half the price of competing frontier models. Large-scale users shifting workloads could save over $1 billion annually, highlighting a growing emphasis on efficiency as AI adoption scales.
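The savings claim can be turned into an implied price gap. The keynote example, as recounted in the transcript, assumes about a trillion tokens a day with 80% of that workload shifted to Flash. The sketch below treats that volume as a single flat daily figure over a 365-day year (an assumption) and derives the smallest blended price difference per million tokens that would produce $1 billion in annual savings. It is a back-of-envelope check, not a published price.

```python
# Inputs taken from the keynote example as recounted in the transcript.
TOKENS_PER_DAY = 1e12       # "about a trillion tokens a day"
SHIFTED_SHARE = 0.80        # share of workloads moved to Gemini 3.5 Flash
ANNUAL_SAVINGS_USD = 1e9    # "over a billion dollars annually" (lower bound)
DAYS_PER_YEAR = 365         # assumption: usage is flat across the year

# Total tokens that would move to Flash over one year.
shifted_tokens_per_year = TOKENS_PER_DAY * SHIFTED_SHARE * DAYS_PER_YEAR

# Smallest blended price difference (USD per million tokens) that yields the savings.
implied_gap_per_million = ANNUAL_SAVINGS_USD / shifted_tokens_per_year * 1e6

print(f"Tokens shifted per year: {shifted_tokens_per_year:.3e}")
print(f"Implied minimum price gap: ${implied_gap_per_million:.2f} per million tokens")
```

Under these assumptions the stated savings correspond to a gap of a little over $3.40 per million tokens. Because the claim is "over" $1 billion, this is a floor rather than an estimate of actual pricing.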
Google describes Gemini Omni as a world model and a step toward artificial general intelligence, combining text, audio, image, and video understanding in a single system. Unlike traditional generators, it is presented as modeling physical consistency, enabling realistic outputs such as accurate protein folding animations and synchronized audiovisual scenes.
Omni enables iterative, conversation-driven editing where scenes retain continuity, physics, and character consistency. Demonstrations included transforming objects, modifying environments, and generating structured multimedia sequences with coherent audio and visuals.
All videos generated with Omni carry a SynthID watermark. SynthID has now been applied to over 100 billion images and videos and 60,000 years of audio. Adoption by companies like OpenAI, Nvidia, and ElevenLabs signals movement toward an industry-wide transparency standard.
Google unveiled eighth-generation TPUs, including TPU8T for training and TPU8 for inference. Training can now be distributed across more than one million TPUs at multiple sites, which Google says cuts model development timelines from months to weeks. Both chips deliver up to two times better performance per watt, and the inference chip brings major latency gains.
Annual capital expenditure is projected at $180–190 billion, up from $31 billion in 2022, underscoring the scale of infrastructure required to sustain AI growth.
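The "roughly six times" multiple often attached to these capex figures follows directly from the numbers above. A minimal sketch, using only the quoted 2022 figure and the projected range (names are illustrative):

```python
# Capital expenditure figures quoted above, in billions of US dollars.
CAPEX_2022_BN = 31.0
CAPEX_PROJECTED_LOW_BN = 180.0
CAPEX_PROJECTED_HIGH_BN = 190.0

# Multiple of 2022 spending at each end of the projected range.
low_multiple = CAPEX_PROJECTED_LOW_BN / CAPEX_2022_BN
high_multiple = CAPEX_PROJECTED_HIGH_BN / CAPEX_2022_BN

print(f"Projected capex is {low_multiple:.1f}x to {high_multiple:.1f}x the 2022 level")
```

The range comes out at about 5.8x to 6.1x, consistent with the rounded multiple.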
The Antigravity 2.0 platform expands into a full ecosystem for building and orchestrating AI agents. Combined with Gemini 3.5, agents can execute complex workflows, automate development tasks, and operate across environments with minimal setup via APIs and SDKs.
Google AI Studio now supports full-stack app development, Kotlin integration, and direct deployment. Tools like Android migration agents can convert apps across platforms in hours, while WebMCP aims to standardize how web agents interact with online tools.
Gemini Spark introduces persistent, cloud-based agents that operate continuously, managing tasks like scheduling, research, and communication. It integrates with Google services and third-party tools, reflecting a broader shift toward always-on digital assistants.
New features include Docs Live for voice-driven document creation, Ask YouTube for context-aware video navigation, Daily Brief for personalized summaries, and enhanced Google Maps interactions. Search is evolving into a dynamic, task-oriented interface with interactive outputs.
Tools like Google Pix enable object-level image editing, while AI-powered glasses—developed with Warby Parker and Gentle Monster—bring real-time assistance, translation, and media capture into wearable devices.
Google’s announcements highlight a decisive shift toward fast, cost-efficient models and autonomous agents embedded across products, signaling an industry-wide move from passive AI tools to systems that actively execute tasks.
All right, so Google just dropped a massive bomb at I/O 2026. And honestly, there's so much to unpack here that I barely know where to start. But let me try to walk you through everything, because this is genuinely one of the biggest AI announcements we've seen in a while.
First off, let's talk numbers, because they're kind of insane. Two years ago, Google was processing about 9.7 trillion tokens per month across all their services. Last year at I/O, that jumped to 480 trillion. Now they're at over 3.2 quadrillion tokens per month. That's roughly a sevenfold year-over-year increase, which is absolutely wild when you think about what that represents in terms of actual usage and real-world problems being solved.
And speaking of usage, the Gemini app itself has just exploded. Last year, it had 400 million monthly active users. Now it's sitting at over 900 million, more than doubling in just 12 months. Daily requests have grown over seven times in that same period. AI Overviews in Search now has 2.5 billion monthly active users. And the new AI Mode in Search has already crossed 1 billion monthly active users in just a year. These aren't small hobby projects anymore. This is mainstream adoption happening right in front of us.
But let's get into the actual announcements, starting with the models themselves. Google unveiled Gemini 3.5, and the first one out of the gate is Gemini 3.5 Flash. Now, what's really interesting here is that Flash is no longer just the budget option. This thing is punching way above its weight class. On the Terminal-Bench 2.1 coding benchmark, it's scoring 76.2% compared to Gemini 3.1 Pro's 70.3%. It's getting 1,656 Elo on GDPval-AA versus 3.1 Pro's 1,314, and 83.6% on MCP Atlas compared to 78.2%. On the CharXiv Reasoning benchmark, it's hitting 84.2%. What's even more impressive is that it's competing with, and sometimes beating, the flagship models from OpenAI and Anthropic. We're talking about GPT-5.5 and Claude Opus 4.7 here. And here's the kicker.
It's doing this at four times the output speed of those other frontier models. Artificial Analysis puts it at close to 280 tokens per second versus around 60 or 70 for GPT-5.5 and Opus 4.7. So you're getting frontier-level intelligence, but at speeds that used to be reserved for much smaller models.
And the pricing, oh man, the pricing is where this gets really interesting. Sundar Pichai mentioned during the keynote that Flash delivers frontier-level capabilities at less than half the price of comparable frontier models, sometimes nearly a third of the price. He gave this example where if top companies processing about a trillion tokens a day shifted 80% of their workloads from other frontier models to 3.5 Flash, they'd save over a billion dollars annually. That's not pocket change. That's real money they can reinvest back into their business.
Now, Gemini 3.5 Pro is also in the works and should be launching next month. Google's already using it internally, and they're saying it's showing great improvements. So that's something to watch out for.
But then we get to Gemini Omni, and this is where things get really next level. Google's calling this a world model, and Demis Hassabis from DeepMind described it as a pivotal step toward artificial general intelligence. The first model in this family is Gemini Omni Flash, and it's fundamentally different from typical text-to-video models. Unlike most video generation tools that just stitch things together, Omni is truly multimodal in both input and output. You can feed it text, audio, images, and video all at once, and it'll generate realistic, scientifically accurate content that actually makes sense. The example they showed with protein folding was pretty compelling: a smooth stop-motion sequence of amino acid chains twisting into alpha helices and beta sheets, with properly synced voiceover narration.
What really sets Omni apart is that it's trained on all four data types simultaneously, so it actually understands relationships between them. A marble rolling down a track follows gravity correctly. A harp string plucked by a leaf produces the right sound at the right time. The physics actually hold up, which is something a lot of generative video models struggle with. And the editing capabilities are pretty wild, too. You can make iterative changes through natural language conversation.
Every instruction builds on the last one. Your characters stay consistent, the physics remain coherent, and the scene remembers what came before. You can take a video you shot and just ask Omni to change what's happening, add new characters or objects, or transform specific elements, all without losing the thread of your original scene. They showed examples of someone turning a sculpture into bubbles, making a mirror ripple like liquid when touched, and even creating a rapid-fire alphabet video where each letter is represented by an unusual object, all with proper lower thirds and smooth music. The level of control and coherence is honestly impressive.
Now, Gemini Omni Flash is rolling out today to Google AI Plus, Pro, and Ultra subscribers in the Gemini app and Google Flow. It's also coming to YouTube Shorts and the YouTube Create app at no cost later this week, and developers will get API access in the coming weeks. Google's planning to expand it to support image and audio outputs down the line as well.
One thing they're being careful about is deepfakes and misuse. All videos created with Omni include Google's SynthID watermark, which is imperceptible but verifiable through the Gemini app, Gemini in Chrome, and Google Search. They're also being conservative with voice cloning, initially only letting you create videos with your own voice using their avatars feature. For editing existing videos to change audio and speech, they say they're still testing to figure out how to bring that capability responsibly.
And speaking of SynthID, Google announced some major partnerships there. SynthID has now watermarked over 100 billion images and videos, along with 60,000 years of audio assets. They're expanding content credentials verification to Search and Chrome. And they got OpenAI, Kakao, and ElevenLabs to adopt SynthID as well. Nvidia signed on last year. So this is becoming a real cross-industry standard for AI transparency.
Then we get to the infrastructure side, and this is where Google's really flexing. They announced their eighth-generation TPUs, and for the first time they're taking a dual-chip approach with specialized architectures. There's TPU8T optimized for training and TPU8 optimized for inference. TPU8T has nearly three times the raw computing power of the previous generation. But what's really crazy is how they're doing training now. With JAX and Pathways, their training is no longer constrained to a single massive data center. They can seamlessly distribute training across multiple sites, scaling across more than a million TPUs globally. This gives them the ability to create the largest training cluster in the world, which means training larger, more capable models in weeks rather than months. TPU8 is all about speed. They've dramatically improved latency at every step because, as they learned from 27 years of working on Search, latency matters. Both chips are also more energy-efficient, delivering up to two times better performance per watt.
All of this infrastructure investment is pretty staggering. In 2022, Google was spending $31 billion annually in capex. This year, they expect that number to be around $180 to $190 billion. That's roughly six times what they were spending just a few years ago.
Now, let's talk about Antigravity, because this is where a lot of the agentic magic is happening. Antigravity 2.0 is expanding beyond just being a coding environment. It's turning into a full platform to develop and manage autonomous AI agents. There's a new standalone desktop application that acts as a central home for agent interaction, where you can orchestrate agents for all kinds of tasks. And they've developed an even more optimized version of Flash for Antigravity that's not just four times faster, but 12 times faster than other frontier models. The amount of tokens Google is processing internally through their AI developer tools is pretty telling, too.
In March, they were processing half a trillion tokens a day. Now they're doing more than three trillion tokens a day, and they've been doubling every few weeks. That internal usage is creating this powerful feedback loop that's helping them improve the models.
Google AI Studio is also getting some major upgrades. It now includes native Kotlin support for coding Android apps, Google Workspace integrations, one-click deploy to Cloud Run, and support for Firebase services. You can build and launch full-stack apps directly within AI Studio. And if you want to keep building, you can seamlessly export your complete project to Antigravity. They're also introducing managed agents in the Gemini API, which removes the friction of infrastructure setup. A single API call gives you a fully provisioned agent with a remote sandbox. And if you want even more control, the new Antigravity SDK lets you customize the agent and deploy it on your own infrastructure.
For Android developers specifically, there's a lot of new stuff. The stable Android CLI lets AI agents tap directly into Android Studio to handle tasks like downloading the Android SDK and running apps on Android devices. They open-sourced Android Skills to help language models execute best practices for complex workflows like migrating to Jetpack Compose. There's also Android Bench, an LLM leaderboard for Android development tasks that now includes open-weight models like Gemma 4. They even previewed a migration agent in Android Studio that can migrate your app code to a native Kotlin Android app regardless of whether your source is React Native, a web framework, or even iOS. The agent analyzes your code and does the heavy lifting, turning migrations that would have taken weeks into just hours.
On the web development side, Google's proposing WebMCP, an open web standard that allows developers to expose structured tools like JavaScript functions and HTML forms so browser-based AI agents can execute complex tasks with greater speed, reliability, and precision. The experimental WebMCP origin trial starts in Chrome 149, with support for Gemini in Chrome coming soon. They're also launching Modern Web Guidance, which helps you build more performant, accessible, and secure web experiences by providing your coding agents with expert-vetted skills. It supports over 100 use cases and integrates directly with Baseline. You can install it with a single click in Antigravity or via CLI.
Chrome DevTools for agents is another big one. It brings Chrome DevTools capabilities to AI agents, helping you scale your workflow by verifying, debugging, and optimizing code in real time. Your agent can automate quality audits, emulate real-world user experiences, and hand over sessions with auto-connect, all without manual oversight. There's also this new HTML-in-Canvas API that's available in origin trial. It lets developers build immersive 3D experiences that remain fully searchable, accessible, and interactable by integrating real DOM elements directly into a canvas with WebGL and WebGPU.
But the consumer-facing stuff is probably what most people will care about. Gemini Spark is their new personal AI agent that runs 24/7 on dedicated virtual machines in Google Cloud. It's powered by Gemini 3.5 and the Antigravity harness, which allows it to perform long-horizon tasks in the background. It'll integrate with Google's own tools first and then with over 30 third-party tools through MCP, including Adobe, Dropbox, and Uber. You can work with Spark through the Gemini app, email, or chat. On Android, there's a new UI space called Android Halo coming later this year where you can view live updates and task progress.
Later this summer, Spark will operate directly within Chrome, acting as your agentic browser across the web. Gemini Spark is rolling out to trusted testers this week, and the beta is coming to Google AI Ultra subscribers in the US next week. It can do things like pull together relevant emails and docs to craft an update for your boss, manage your calendar, handle follow-ups, all that kind of stuff.
They're also introducing information agents in Search, which are personalized AI agents you can set up to work in the background 24/7 to find what you need at the right moment and help you take action. These are rolling out this summer, starting with Google AI Pro and Ultra subscribers. Search is also getting agentic coding capabilities powered by Gemini 3.5 Flash and Antigravity. Search will build custom experiences for your individual questions with dynamic layouts and interactive visuals. These generative UI capabilities will be available for everyone in Search this summer for free. For longer-running tasks, Search can build persistent custom dashboards or trackers that you can return to and make progress on, kind of like mini apps for your specific tasks.
There's a new feature called Ask YouTube that entirely reimagines the experience. You can ask complex questions and it'll show you videos that best match your interest. But more importantly, it jumps right to the part of the video most relevant to you. This is starting testing now and will roll out broadly in the US this summer.
Docs Live is another cool one. Instead of typing out a precise prompt, you can just verbally brain dump whatever's on your mind and let Gemini do the rest. You'll be able to create new docs and edit them directly, all with your voice. Docs Live is rolling out for subscribers this summer, and powerful voice capabilities will come to Gmail and Keep then too. Ask Maps lets you have more natural conversations with Maps for complex questions.
Daily Brief gives you a personalized digest that synthesizes information from your inbox, calendar, and tasks to find the most important things you need to be aware of, prioritizing them and suggesting next steps.
Google Flow is getting a new agent that can plan and reason through complex tasks with your inputs. You can also vibe code any creative tool right in Flow, like tools for designing video effects, hand-drawn animations, or layering text. Google Pix is their new AI image creation and editing tool built on the latest Nano Banana model. It treats every element as an individual object rather than a flat static image, so you can create, swap, or perfect specific details to bring your exact vision to life. Pix is available to trusted testers now and will roll out later this summer to Google AI Pro and Ultra subscribers in Workspace.
And then there's intelligent eyewear, which is pretty futuristic. Audio glasses are launching this fall in partnership with Gentle Monster and Warby Parker. You can ask Gemini about anything you see, get natural turn-by-turn directions, manage calls and send texts hands-free, snap photos and videos, get real-time translations, and tap into your apps just by using your voice. Display glasses that show information right in your field of view are coming later.
Google also announced Gemini for Science, which brings together AI tools to help accelerate scientific research. It includes new experiments on Labs and science skills to connect agentic platforms like Antigravity to over 30 major life science databases and tools.
So yeah, that's Google I/O 2026. Google is clearly pushing hard into the agentic era, where AI can create, plan, and actually take action across your digital life. Now we just have to see how well all of this works outside the keynote demos. Let me know what you think in the comments. Subscribe for more AI and tech updates. Hit the like button if you enjoyed the video. And thanks for watching. I'll catch you in the next one.