Bring the power of on-device AI to life with Google AI Edge and Gemma

Google · Google for Developers · May 22, 2026 at 07:00 PM · 31:38

TL;DR

Google is advancing on-device AI with faster, smaller models and a unified edge stack that enables powerful applications without internet connectivity.

KEY POINTS

Breakthrough in small language models

The latest Gemma models, Gemma 4 E2B and E4B, show a sharp improvement in efficiency. The E4B model outperforms the much larger Gemma 3 27B from last year on most, though not all, of the benchmarks shown, which indicates that compact models can deliver high-quality results while running locally on consumer devices.

Fully offline AI capabilities

Modern on-device AI systems can operate entirely without internet access, enabling use cases in low-connectivity environments such as flights or remote areas. This also reduces reliance on cloud APIs, lowering operational costs and addressing privacy concerns by keeping sensitive data on-device.

Hardware acceleration drives performance

Advances across CPUs, GPUs, and NPUs are fueling the shift. Dedicated NPUs can exceed 200 tokens per second, while CPU optimizations such as Arm SME2 support through XNNPACK and LiteRT deliver up to 6× faster responses. The primary bottleneck is now memory bandwidth rather than compute power.
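To make the memory-bandwidth point concrete, here is a rough back-of-the-envelope sketch. It assumes that generating each token requires reading every model weight from memory once, so bandwidth divided by model size gives an upper bound on decode speed. The function name and all figures are hypothetical illustrations, not numbers from the talk.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             params_billion: float,
                             bits_per_weight: int) -> float:
    """Upper bound on decode speed for a memory-bound model.

    Assumes every weight is read from memory once per generated token
    and ignores KV-cache traffic, activations, and compute time.
    """
    bytes_per_token = params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    # Hypothetical device: 60 GB/s of memory bandwidth.
    # Hypothetical model: 2 billion parameters.
    for bits in (8, 4):
        limit = decode_tokens_per_second(60.0, 2.0, bits)
        print(f"{bits}-bit weights: at most {limit:.0f} tokens/s")
```

Under these assumptions, halving the bits per weight doubles the ceiling, which is one reason quantized models matter so much on phones.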

Massive ecosystem adoption

Google’s AI stack is already widely deployed, with over 250,000 Android apps running across 3.8 billion devices and generating more than 1 trillion daily inferences. This signals that on-device AI is transitioning from experimental to mainstream infrastructure.

LiteRT-LM enables local LLM deployment

The LiteRT-LM runtime provides a streamlined text-in/text-out interface optimized for edge devices. It supports multiple platforms including Android, iOS, Windows, Linux, and macOS, and integrates hardware acceleration automatically, simplifying deployment of local language models.

Real-world use cases: gaming and messaging

On-device AI enables dynamic features such as conversational non-player characters in mobile games without latency or server costs. Companies like Kakao have integrated local models into messaging apps, reducing memory usage by over 600 MB while maintaining smooth performance through GPU and CPU optimization.
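The NPC pattern described in the talk (a per-character system prompt plus a chat session that remembers what the player just said) can be sketched as below. This is not the LiteRT-LM API: `NpcSession`, `generate`, and `echo_model` are hypothetical names, and the `generate` callable is a stand-in for whatever local inference call the game actually uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class NpcSession:
    """Keeps one character's system prompt and recent conversation turns."""
    system_prompt: str
    generate: Callable[[str], str]  # hypothetical stand-in for a local model call
    max_turns: int = 8              # how many past exchanges to keep in the prompt
    history: List[Tuple[str, str]] = field(default_factory=list)

    def build_prompt(self, player_text: str) -> str:
        recent = self.history[-self.max_turns:] if self.max_turns > 0 else []
        lines = [f"System: {self.system_prompt}"]
        for player, npc in recent:
            lines.append(f"Player: {player}")
            lines.append(f"NPC: {npc}")
        lines.append(f"Player: {player_text}")
        lines.append("NPC:")
        return "\n".join(lines)

    def say(self, player_text: str) -> str:
        reply = self.generate(self.build_prompt(player_text)).strip()
        self.history.append((player_text, reply))
        return reply


def echo_model(prompt: str) -> str:
    """Fake model for the sketch: reports how long the prompt was."""
    return f"(reply to a prompt of {len(prompt.splitlines())} lines)"


if __name__ == "__main__":
    blacksmith = NpcSession(
        "You are a grumpy blacksmith from Iron Haven. You hate wizards. You love gold.",
        echo_model,
    )
    print(blacksmith.say("How is the weather today?"))
    print(blacksmith.say("How much for the sword?"))
```

Capping the history with `max_turns` is a design choice for small models with limited context; a real runtime may manage the session state itself.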

MediaPipe offers plug-and-play AI tasks

For developers seeking quick integration, MediaPipe provides ready-to-use tools for vision, audio, and text tasks such as pose detection and gesture recognition. These APIs allow rapid deployment of features like motion-triggered photography without building models from scratch.
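The jump-capture idea from the talk tracks the y-coordinate of the user's shoulders frame by frame and fires at the top of the jump. A minimal sketch of that logic follows, assuming the pose task has already produced one normalized shoulder y-value per frame (in image coordinates, where y shrinks as the body moves up). The function name, thresholds, and sample values are hypothetical, and no MediaPipe calls are made here.

```python
from typing import Optional, Sequence


def find_jump_peak(shoulder_y: Sequence[float],
                   baseline_frames: int = 5,
                   min_rise: float = 0.05) -> Optional[int]:
    """Return the index of the frame at the top of a jump, or None.

    The first `baseline_frames` values define the standing height.
    A frame counts as airborne once the shoulders have risen by at
    least `min_rise` (y has dropped by that much). The peak is the
    first airborne frame whose successor starts moving back down.
    """
    if len(shoulder_y) <= baseline_frames:
        return None
    baseline = sum(shoulder_y[:baseline_frames]) / baseline_frames
    for i in range(baseline_frames, len(shoulder_y) - 1):
        airborne = baseline - shoulder_y[i] >= min_rise
        if airborne and shoulder_y[i + 1] > shoulder_y[i]:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical trace: standing, rising, peaking, then landing.
    frames = [0.50] * 5 + [0.48, 0.42, 0.36, 0.33, 0.35, 0.42, 0.50]
    print("capture at frame", find_jump_peak(frames))
```

In a live camera loop the same check would run on each new frame, so the photo would be taken one frame after the peak; real landmark data is noisy and would likely need smoothing first.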

Custom model deployment with LiteRT

Developers can also bring their own models from frameworks like PyTorch or JAX, convert them into optimized formats, and run them efficiently on-device. Adobe reports up to 30% performance improvement in image-editing features and Snap reports roughly a 30% lift over the previous GPU delegate, while Uber runs LiteRT-powered capabilities on millions of Android devices and Epic's Unreal Engine uses it for real-time AR.

Edge AI for specialized environments

On-device AI is expanding into niche applications such as environmental monitoring. Custom low-power models can run continuously on IoT devices, identifying patterns like bird calls in remote forests while preserving battery life through NPU optimization.
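The battery argument can be illustrated with simple arithmetic: runtime is battery capacity divided by average power draw. The sketch below uses entirely hypothetical figures (the talk gives none) to show why a small always-on classifier and a heavier model end up in very different ranges.

```python
def battery_life_hours(battery_wh: float,
                       active_w: float,
                       idle_w: float,
                       duty_cycle: float) -> float:
    """Estimate runtime from capacity and duty-cycled power draw.

    duty_cycle is the fraction of time spent running inference (0 to 1).
    Ignores solar charging, battery aging, and conversion losses.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    average_w = duty_cycle * active_w + (1.0 - duty_cycle) * idle_w
    if average_w <= 0.0:
        raise ValueError("average power draw must be positive")
    return battery_wh / average_w


if __name__ == "__main__":
    # All numbers are hypothetical, chosen only to show the scale difference.
    small = battery_life_hours(10.0, active_w=0.05, idle_w=0.01, duty_cycle=1.0)
    large = battery_life_hours(10.0, active_w=3.0, idle_w=0.01, duty_cycle=1.0)
    print(f"small classifier: {small:.0f} h, large model: {large:.1f} h")
```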

CONCLUSION

Rapid improvements in compact models and edge hardware are making on-device AI practical at scale, positioning it as a cost-efficient, private, and low-latency alternative to cloud-based intelligence.

Full transcript

[MUSIC PLAYING]
SACHIN KOTWANI: Hi, everyone. I'm Sachin Kotwani. And I'm a group product manager on the Google AI team. Presenting with me today is Erin Walsh, developer relations engineer on the team. Today, we are going to talk about Google's suite of tools for implementing on-device AI anywhere on the edge, from mobile to desktop to IoT. 2026 is a special time for on-device AI. There is an electrifying energy and enthusiasm about what's possible and what's yet to come. Now, on-device AI has been around for many years. But initially, it was limited to computer vision and general models. You have been able to run LLMs on device for a couple of years. But let's be real. They weren't that great. Their usefulness was limited, especially when compared to Cloud-based models. This year, however, LLMs have made a huge leap. Just look at these-- look at this chart. And you'll see the impressive evals for the latest Gemma 4 E2B and E4B models. The 4B model actually outperforms the much larger Gemma 3 27B from last year on most of these metrics, not all, which shows how much things have changed. But better than telling you, let me show you. If you can switch to the Pixel 10, I have the Google AI Edge Gallery app here. And I'm going to take a picture. And I'm just going to say, describe. Let's keep it simple. There you go. You can see how detailed it is and how fast this is decoding. Just imagine the kind of things, the kind of applications, that you can build with that. So let's read here. This is a portrait-style photograph of a man with a warm and friendly expression. Thank you. The man appears to be middle-aged. [LAUGHTER] All right. Maybe I'll have to double think what I said about the Gemma accuracy. I think it was pretty good. All right, we'll go back to the slides. OK, so not only are the models' raw capabilities, as you just saw, and the underlying runtime impressive, but also the fact that all of this is running entirely on the device without an internet connection.
This helps reduce Cloud API costs when the local model can do a good job at the task at hand. It can also allow you to work offline, for example on a flight, or for use cases where Cloud latency may be greater than acceptable for your application, or where strict data requirements may prevent you from sending the data to the Cloud. And it's not just the models that have gotten better. Devices are now more capable too. And that's because of improvements across compute targets. On CPU, Android now integrates Arm SME2 with XNNPACK and LiteRT, which allows models like Gemma to deliver up to six times faster responses. What's more important, developers benefit from this automatically by using the supported libraries and frameworks. GPUs are no longer just for gaming. They have evolved into efficient, high-performance engines with dedicated Tensor cores for AI processing. And now, dedicated NPUs can run models at speeds of over 200 tokens per second and more power efficiently than ever before. The hardware is so fast now that the bottleneck is no longer the processors. It's how fast the phone can move data from memory to the chip. The advances in both hardware and small language models like Gemma 2B and 4B together make now the perfect time to bring on-device AI to your applications. And Google AI Edge can help you with that. Whether you want to take advantage of ready-to-use tasks such as pose detection or gesture recognition or you want to run a language model or bring your own custom model, you'll find everything you need in this comprehensive suite of tools. The AI solutions can run across a wide variety of platforms. So you can build your solution once and deploy it on Android, iOS, web, desktop, and IoT. And you wouldn't be the only one to do so. Developers have been using our stack for quite some time now. Here are some numbers. There are over 250,000 apps just on Android using the Google AI stack.
Those are running on over 3.8 billion devices, which are generating over a trillion interpreter invocations per day. Google AI solutions seamlessly utilize hardware acceleration, ensuring your models run as efficiently as possible, depending on the available CPU, GPU, or NPU. We are fully committed to maximizing hardware acceleration across the ecosystem. Last year, we released production-grade NPU support for Qualcomm and MediaTek chipsets via LiteRT. And today, we are super excited to announce two new LiteRT NPU backends-- Google Tensor and Intel. Meanwhile, we are also making steady progress on integrating additional support for Broadcom, Raspberry Pi, and Exynos. Erin, do you want to tell us how our developers can take advantage of all of this?
ERIN WALSH: Absolutely. So that's really all exciting stuff. And it sounds like there are tons of great tools in the AI stack. But in the world of on-device AI, there are so many tools, models, frameworks, platforms. So as a developer, where do I fit into all this? To answer that, throughout this talk, we will take a look at three different developer personas and how they can navigate the Google AI Edge stack for their distinct use cases. Meet Meghan, Rob, and Chris. All three developers are building on-device AI features in different ways. Meghan is exploring different LLMs. Rob wants to quickly ship common use cases to a wide range of devices. And Chris is building a hero feature that requires control of the whole ML pipeline. So we'll see how each one of them would use the stack. Let's start with Meghan. Meghan is an indie game developer. And she's building an open-world fantasy role-playing game for mobile devices. She wants to leverage the power of LLMs to enhance her game. Sachin, what options do we have for Meghan?
SACHIN KOTWANI: All right. So for Meghan, we have a full set of tools.
First, as we saw earlier, she can always use the Gallery app to explore and see what's possible with out-of-the-box models like Gemma. What's more interesting is that you can use the open source code of Gallery to create your own app, whether it's by forking it or pointing your agentic harness to the source code. You can also download Gallery on your phone now on both Android and iOS and play with it. Under the hood, all of this uses LiteRT-LM. And while LiteRT is our runtime for tensor-in, tensor-out processing, LiteRT-LM adds a text-in, text-out interface. It provides high-performance inference that's specifically optimized for Edge devices. And it leverages hardware acceleration and optimized kernels specifically for LLM operations like KV caching. And of course, it's designed to work seamlessly across Android, iOS, Windows, Linux, and macOS. In terms of models, you can pick from the various versions of models like Gemma, which we have optimized and prequantized for you in LiteRT-LM format. And they're available on Hugging Face. Optionally, you can also create a custom fine-tune of the model based on your own data for your specific problem or domain. This will be particularly important for the smaller models, such as the 1B or smaller than that. And optionally, you can test your implementation across a wide range of real devices, using Google AI Edge Portal, which we'll talk about a little later. Erin, you want to go into Meghan's specific use case in a little more detail?
ERIN WALSH: Yeah. So it sounds like LiteRT-LM is an ideal solution for Meghan as she navigates the world of on-device LLMs. But what exactly is she looking for? Like we mentioned earlier, Meghan is building her mobile game. And she wants her world to feel alive. Instead of players clicking through the same pre-written dialogue trees every time they try to talk to a character, she wants dynamic, conversational, non-player characters or NPCs.
If the player asks a blacksmith about the weather or tries to haggle for a sword, the NPC should react naturally and stay in character. Initially, Meghan considered hooking her game up to a cloud-based LLM but came across a few challenges. For starters, hitting servers every single time a player stops to chat with a random town guard would burn through her budget very quickly. Plus, mobile gamers play everywhere. They play on subways, airplanes, the back of cars. And if an NPC freezes because the player dropped cell service, then the immersion is broken. And even with a perfect connection, relying on the cloud introduces latency. In gaming, waiting for a server response would make a dynamic conversation feel a little clunky. So Meghan realizes that a good solution is to run the intelligence entirely on the player's device. She needs an LLM that is lightweight enough to run on a smartphone without wearing out the battery, but capable enough to hold a fantasy conversation. So she does a little research and chooses Gemma 270M to power her NPCs. As she's building her game, she can simply use system prompts for each character. Like, you are a grumpy blacksmith from Iron Haven. You hate wizards. And you love gold. In doing so, she generates fun NPC interactions for her game. So now that Meghan has figured out her NPC solution and chosen her on-device model, how does she go about running Gemma 270M in her game? This is where she dives right into the LiteRT-LM APIs. LiteRT-LM offers native APIs in Kotlin, Swift, C++, and recently JavaScript and Flutter. And all she needs to do is include the library in her project. LiteRT-LM abstracts away the complex boilerplate of on-device AI. It provides Meghan with the methods to load Gemma 270M into memory, manage the ongoing chat session so the NPC remembers what the player just said, and run fast inference using the device's local hardware.
And the result is Meghan's players get to experience a deeply immersive world where they can play anywhere at no server cost, all powered in the background by LiteRT-LM. Earlier, Sachin demoed and mentioned the AI Edge Gallery app. But it's important to know it's much more than just a showcase for really cool LLM demos. It's also a highly practical developer resource. Because the entire source code for the Gallery app is available on GitHub, Meghan can look right under the hood. She can see exactly how the Gallery app uses LiteRT-LM to load models, manage local sessions, and run inference, and use that implementation as a proven blueprint for her own application. She can even use the Gallery app source code as a blueprint for agentic coding. Meghan can give her coding agent of choice her model file and the Gallery app GitHub repo and spin up an app that runs inference locally on her LLM in minutes. In fact, let's show this prompt on screen with Google Antigravity right now. So can we switch to the computer screen? OK, so we have opened up Antigravity. All we have is a completely blank folder. I've preloaded this model into it. And so you can get this model just by going to the LiteRT community on Hugging Face. You can download the model straight from there. And all it is is an empty folder with just the model in it. And so we will copy and paste this prompt into Antigravity. We'll let it run. And for the sake of time, we're going to skip to what it builds. So here we are. It'll come up with-- so first what it does is it downloads the Gallery app as a reference code over here. So you'll see it referencing the Gallery app. And then, it will give you an implementation plan. It'll ask you some clarifying questions. Like, do you want to default to CPU or use different hardware acceleration? And then, you go back and forth with it on there. And it'll give you a walkthrough of what it builds.
And so here in this prompt, we've asked it to just build a very simple chat app. So we've downloaded the model from Hugging Face. It's in LiteRT-LM format. And we just want to instantly make a chat app for our phone. So can we please go to the phone? So here we are. We have Gemma chat. The phone is on airplane mode right now. So there's no connection anywhere. And we're just going to ask, hello, which model are you? And run it. And you can see it types right back. So it's pretty easy to grab a model file and spin up an app that runs local inference on it using LiteRT-LM in very short time. Can we go back to the slides? So back to Meghan, once she has everything smoothly running on her own test device, she faces the reality of deployment. And how will her game actually perform in the wild? Mobile games are already memory intensive, just from 3D rendering and audio. Meghan needs to know exactly how much extra memory Gemma 270M is going to consume, so she can guarantee her game won't crash on older phones. Instead of guessing, she uploads her model to the Google AI Edge portal. This instantly unlocks access to a massive cloud-based fleet of real physical devices, allowing her to benchmark her model's performance on the variety of different phones her users might be playing on. And while an open model right out of the box is a great starting point for her game, there are a few ways Meghan can level up her LLM implementation. For example, if she wants her characters to speak in a distinct fictional dialect, she could go even further and fine tune an LLM for her NPC characters. To do this, she can start by consulting the Gemma cookbook. It's a detailed resource packed with practical, step-by-step guides for fine tuning open models. Once she has fine tuned her model, she can't run it in her app just yet. So she uses the Google AI Edge Torch tool to convert her model weights into the optimized, LiteRT-LM file format, the same one we used with Google Antigravity earlier. 
And once she has that LiteRT-LM file, she can use the LiteRT-LM CLI tool to run her model locally and test different presets and prompts to ensure the model is performing up to her standards. The tutorials on the Google AI Edge documentation will walk her through every step of this process. And once Meghan is satisfied with her new model, she doesn't have to rewrite her game. She can literally just drag and drop the model files right back into her existing LiteRT-LM architecture for her game. And if Meghan is targeting Android users specifically, Gemini Nano, which is based on Gemma, is a great option. Gemini Nano is integrated directly into the OS of the latest Android devices via AICore and natively hardware accelerated. Check out Karen Chang's I/O talk, "Deploy Android On-Device AI with ML Kit, GenAI" and the ML Kit documentation to learn more about using AICore for Android. So now, Meghan's game is officially ready to be launched. And no matter where her AI journey takes her next, LiteRT-LM gives her the flexibility to keep her NPCs running fast, local, and entirely on device. So now that we've gone through a fictional persona scenario, let's talk about a real-life LiteRT-LM user journey. Sachin, do you want to tell us about Kakao?
SACHIN KOTWANI: So LiteRT-LM provides a very simple text-in, text-out interface, very easy to use. But it's actually really powerful. If you have needs that go beyond fine tuning, you can use this flexible LiteRT-LM framework for more custom implementations. And that's what Kakao did. Do we have Rex and the Kakao team in the audience? We're going to be here? So Kakao-- oh, right there. Hey! Thanks for coming. So Kakao is using LiteRT-LM to deploy the Nano 1.3 billion-parameter model as a built-in feature of the KakaoTalk Android app, which allows them to offer real-time, on-device AI interactions without relying on external servers.
They shrank the original model's runtime footprint by over 600 megabytes through advanced memory mapping and KV caching optimizations. In addition to this, custom OpenCL priority settings were implemented to ensure that the AI runs in the background without causing UI jank or any kind of lag for the user. To maintain high quality across the Android ecosystem, Kakao leveraged LiteRT-LM's flexible backends to route pre-processing based on the chipset, utilizing GPU acceleration where available and CPU-only execution for others, ensuring stable, accurate outputs regardless of the hardware. So now, you've seen how powerful LiteRT-LM is for running your LLMs on device. But what if you're looking for a simpler way to leverage AI? What if you just want to quickly drop in a common AI capability without having to build it yourself? Let's pivot to Rob. He's looking for pre-built, plug-and-play AI solutions that just work right out of the box. And this is exactly where MediaPipe tasks come in. So if you're looking for ready-to-use AI tasks across audio, vision, and text, MediaPipe tasks allow you to seamlessly do things like image classification, pose detection, hand gesture recognition, and more. These are super easy to implement. And there are thousands of apps using them. Erin, even though they are really easy to implement, why don't you tell us how to go about it?
ERIN WALSH: Yeah, definitely. MediaPipe is really fun. So let's follow Rob's journey with MediaPipe. Rob is a talented, cross-platform developer. And his products are native Android, iOS, and web apps used across thousands of devices. Rob wants to add powerful, everyday AI features to his products that would reach his very large user base. And he doesn't have time to reinvent the wheel. He just needs reliable features, minimal overhead, and to meet his Friday ship date.
So MediaPipe tasks is Rob's toolkit of pre-built, highly optimized AI tasks that he can drop right into his apps, no matter which platform he's deploying to-- on mobile, web, or desktop. Today, Rob is building a new mode for his selfie app that automatically snaps a picture the exact moment a user jumps, capturing that perfect mid-air flying shot. But how does the app actually know the precise second the user leaves the ground? Rob doesn't have time to build and train a custom computer vision model from scratch just to track body positions. And the beauty is, he doesn't have to. Instead, he can just grab the Pose Landmarker task from the MediaPipe vision APIs. By tracking the y-coordinates of the user's shoulders frame by frame, the app watches for that value to drop as the user jumps upward. And the moment their upward momentum peaks, the app instantly snaps that perfect mid-air photo. To get Pose Landmarker up and running, Rob can dive straight into the MediaPipe documentation for a step-by-step guide or he can simply let his AI agent handle the implementation for him. Either way, MediaPipe APIs are designed to make complex AI tasks incredibly simple to integrate. And when he's ready for some inspiration for his next big feature, he can head over to the MediaPipe gallery in Google AI Studio. It's a great place to explore fun example projects and even vibe code a few prototypes on the fly. So let's take a look at an awesome project that actually uses the exact solution that Rob chose. It's an interactive twist on the classic Chrome dino game where you still have to jump over the cacti. But this time, I'm going to be the dino. So let's show the computer. All right. So we have Google AI Studio pulled up. There's this MediaPipe tab right here. And you can see a bunch of fun projects, give them a little preview. And today, we're going to play "Dino Jump." So I turn on the camera feed debug. So it shows you exactly what Pose Landmarker is doing with my shoulders.
And we're going to step back. And we're going to play this game. So I got to start it off. And then, there we go. And one more. OK. So that was really fun.
SACHIN KOTWANI: Pretty good.
ERIN WALSH: Thanks. All right, let's go back to the slides. Oh, we're still jumping. So MediaPipe is incredible for dropping in those quick, pre-built tasks. And Rob will have his new feature ready in no time. But what happens when your use case isn't on that list? We've talked about LLMs and plug-and-play solutions. But we all know the world of on-device AI is so much broader than that. So let's move to our third developer, Chris. What if you need to run a completely custom model using a classical ML architecture? Sachin, why don't you tell us more about that?
SACHIN KOTWANI: Thank you. All right, so we talked about text-in, text-out LLMs and high-level APIs. But if your needs call for a custom model that's not covered by either of these, you can also bring your own custom model from a variety of frameworks. LiteRT allows you to do just that in a tensor-in, tensor-out format. For example, you can convert models like YOLO to run on device using LiteRT. We offer an easy-to-use set of tools that makes-- that takes you all the way from model conversion to a runtime that runs a portable TFLite file format, all with industry-leading performance across CPU, GPU, and NPU. And we have many examples of successful deployments where developers brought their own models. The Adobe Lightroom team uses LiteRT for several AI features. They are reporting up to 30% performance improvement in various image-editing functions, including Select Subject, Select Sky, Scene Enhance, Adaptive Portrait thanks to the latest ML Drift GPU delegate that's part of LiteRT-LM. Snap is seeing approximately 30% performance lift on many use cases when compared to the previous generation TensorFlow Lite GPU delegate. Epic's Unreal Engine enables 30 frames per second AR experiences by leveraging the NPU via LiteRT.
Uber deploys various AI-powered capabilities using LiteRT on millions of Android devices. Argmax offers audio-based models and an SDK for real-time audio transcription via the NPU integration. And Pico OS offers fast, private, on-device capabilities for a jet-lag free-- for lag-free-- jet lag is what I have-- for a lag-free spatial experience on Project Swan headsets. That is, of course, in addition to the many Google apps that use LiteRT to power a variety of capabilities. So what do you think, Erin? ERIN WALSH: Sounds really awesome. And it sounds like our third persona, Chris, could take a beat from all those cool customers. Our third developer, Chris, is an expert AI engineer. He uses custom models often built using classical machine learning architectures and popular frameworks like PyTorch, JAX, and Keras. He wants to be able to run his models on device and optimize every ounce of performance to meet strict latency, memory, and power constraints. So for Chris, the core LiteRT framework is his ultimate tool. But what exactly is Chris building for? Chris is an environmental conservationist. And he's building a suite of offline tools for field researchers. His team's goal is to monitor the migration patterns of a rare bird species deep within the rainforest. And for this, he's building a continuous acoustic monitor designed to sit in the canopy and identify specific bird calls. So Chris is building a custom audio classification model from scratch. He needs it to surgically separate the specific call of the rare bird from the chaotic background noise of the jungle. And because the device is solar powered and constantly running, he can't use an LLM for this because it would drain the tiny battery in hours. So Chris builds his own PyTorch model and tweaks it until it correctly identifies the rare bird calls. But then, he faces another challenge. How does he actually run his model on a low-power IoT device sitting offline in the woods? 
To streamline his workflow, Chris uses the LiteRT CLI tool, a Python-based command line toolkit designed specifically for managing, running, and benchmarking LiteRT models on different hardware, accelerators, and platforms. And it also has a custom skill that Chris can plug directly into his coding agent of choice, like Google Antigravity. By handing the terminal commands off to his agent, he can create a more seamless, modern developer experience. So here's how he takes the model from repository to hardware accelerated workflow-- for hardware accelerated deployment. First, Chris uses the LiteRT CLI tool to convert his PyTorch model into the highly optimized, flexible TFLite format. And then, he can perform ahead-of-time compilation on his model to target a particular NPU. By targeting NPUs, he can get maximum battery life from his monitoring device. And finally, he can benchmark the model's performance on a connected device. And then next, it's time for deployment. To get the absolute best performance out of his IoT hardware, Chris uses the LiteRT compiled-model API. This API is a game changer for edge development and deployment. It automatically handles hardware acceleration, seamlessly compiling and routing the model to the device's NPU. This ensures the model runs blazingly fast while just using a fraction of the power. But Chris's work isn't done yet. He's also building a mobile companion app for the human scientists on the ground. Out in the field, the researchers are wearing thick protective gloves. And they don't have any cell service. But they still need to quickly record their notes and findings. They need an offline, voice-to-text transcription feature. So this time, Chris doesn't need to build his own model. He can leverage existing, pre-built AI. He heads over to the automatic speech recognition leaderboard on Hugging Face and decides to pick the Parakeet model to balance quality and speed. 
And he integrates it as an off-the-shelf solution to power his app's hands-free dictation. LiteRT already provides a few ready-to-use, pre-optimized ASR models like Parakeet, Whisper, Qwen ASR, and Moonshine that support a variety of on-device applications. There is also a prebuilt sample app to run these models. And so it's easy for Chris to directly download his model of choice from the LiteRT Hugging Face community and run it on his device using the sample app. In fact, let's try this very simple app out right now. So can we show the phone on the screen? Oh, we're still on Gemma chat. So I'm going to pull up the ASR app. We're going to choose Parakeet. It's going to load. And we're going to run it on TPU. All right. So now, let's load it up. Let's test a bit of live transcription. Hello. We are testing audio speech recognition live on stage right now. All right. So I think there's a little bit of interference. But it works. We can-- and you can use this. You can see it on GitHub and download and try out the models for yourself and run them on different hardware. So you can go back to the slides. And so to conclude Chris's journey, we've seen how converting a custom model and leveraging the latest ASR technology allowed him to build the perfect offline solution for research in the rainforest with LiteRT. So now, we've seen how three different developers navigated the Google AI stack to tackle vastly different technical challenges. Whether you're deploying on-device LLMs or dropping in pre-built AI features or engineering a completely custom AI pipeline, the Google AI Edge ecosystem provides the tools you need to bring your vision to life. So, Sachin, what should we know going forward?
SACHIN KOTWANI: All right. So one of the most important things for all of you to remember is that everything that we showed you here is running on the device.
There's no internet connection required for any of these capabilities, even the large models that we're running that we demoed, including the Ask Image feature. So all of that is pretty impressive. So to recap, we started with the Gallery app where you can see what's possible by running Gemma on device. The app is available for you to download on Android and iOS. And you can even download the source code from GitHub, fork it, modify it, use it to build your own apps. Next, we covered MediaPipe tasks for ready-to-use tasks for various audio, vision, and language capabilities. Really easy to implement and use. We touched on how LiteRT-LM powers the LLMs on device. And finally, we talked about how you can use LiteRT to bring your own custom models. And now, it's your turn. We really look forward to seeing all the great things you build with this rich stack. And if you have any questions for us, we'll be outside at the AI Q&A table from 1:00 to 3:00 PM. Thank you so much for attending. [MUSIC PLAYING]
