Google’s New AI Just Broke the AI Speed Limit: DiffusionGemma

9.4/10

AI • AI Revolution • June 13, 2026 at 12:06 AM • 14:45

TL;DR

Google unveiled a faster diffusion-based text model and real-time voice translation, Xiaomi launched a memory-driven coding agent, and OpenAI moved closer to a potential $1 trillion IPO.

KEY POINTS

Google introduces Diffusion Gemma for faster AI text generation

Google released Diffusion Gemma, an experimental open model that generates text by iteratively refining a full block rather than producing tokens sequentially. Built on the Gemma 4 family with a 26B parameter mixture-of-experts architecture, it activates roughly 3.8B parameters per inference. The model operates on a 256-token canvas, allowing it to revise earlier text as meaning evolves, improving coherence in structured tasks.
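To make the contrast with left-to-right decoding concrete, here is a toy sketch of filling a fixed-size canvas over several passes. It is not Google's algorithm: the function names are hypothetical, the "model" is replaced by a lookup into a known target sentence, and the sketch only shows positions being filled in arbitrary order, not the revision of already-filled tokens.

```python
import random


def toy_parallel_fill(target_tokens, steps=4, seed=0):
    """Fill a masked canvas in a fixed number of passes (toy stand-in for denoising).

    Assumption: a real diffusion model predicts tokens itself; here the known
    target plays the role of the model's prediction.
    """
    rng = random.Random(seed)
    canvas = [None] * len(target_tokens)  # None marks a still-masked position
    snapshots = [list(canvas)]
    for step in range(steps):
        masked = [i for i, tok in enumerate(canvas) if tok is None]
        if not masked:
            break
        # Spread the remaining masked positions evenly over the remaining passes.
        k = -(-len(masked) // (steps - step))  # ceiling division
        for i in rng.sample(masked, k):
            canvas[i] = target_tokens[i]
        snapshots.append(list(canvas))
    return snapshots


def render(canvas):
    """Show masked positions as blanks so each pass can be printed."""
    return " ".join(tok if tok is not None else "____" for tok in canvas)


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    for snapshot in toy_parallel_fill(sentence):
        print(render(snapshot))
```

Each printed line is one pass over the whole canvas, so the number of passes stays fixed while the amount of text filled per pass grows with canvas size.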

Performance gains target local and interactive use

The model achieves up to 4× faster generation on dedicated GPUs, exceeding 1,000 tokens per second on an Nvidia H100 and 700 on an RTX 5090. Quantized versions can run in about 18GB of VRAM, making high-end local deployment viable. Google positions it for interactive workflows such as editing, OCR, code infilling, and agent systems rather than for top-tier writing quality.
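A back-of-the-envelope check helps put these figures in context. The sketch below assumes that all 26B parameters are stored at 4 bits each and that the quoted throughput applies to a full 256-token canvas; neither assumption comes from Google, and the helper names are illustrative.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return n_params * bits_per_param / 8 / 1e9


def canvas_seconds(canvas_tokens, tokens_per_second):
    """Time to produce one canvas at a given sustained throughput."""
    return canvas_tokens / tokens_per_second


if __name__ == "__main__":
    print(weight_memory_gb(26e9, 4))   # 4-bit weights for a 26B-parameter model
    print(weight_memory_gb(26e9, 16))  # the same weights at 16 bits, for comparison
    print(canvas_seconds(256, 1000))   # one 256-token canvas at 1,000 tokens/s
    print(canvas_seconds(256, 700))    # the same canvas at 700 tokens/s
```

Under these assumptions the 4-bit weights alone take about 13GB, so the gap to the reported 18GB would have to be covered by activations, caches, and any layers kept at higher precision.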

Diffusion approach shows strengths in structured reasoning

Unlike left-to-right models, Diffusion Gemma can adjust entire outputs mid-generation, benefiting tasks where global consistency matters. In testing, a base version initially solved 0% of Sudoku puzzles, but after fine-tuning reached around 80% accuracy, highlighting the advantage of holistic reasoning over sequential prediction.

Broad ecosystem support and open release

Released under Apache 2.0, the model integrates with tools including Hugging Face, MLX, vLLM, Transformers, and Nvidia NeMo. It supports NVFP4 4-bit precision for near-lossless acceleration and is optimized for hardware ranging from consumer GPUs to DGX systems. Compatibility is still expanding and now includes llama.cpp.

Gemini 3.5 Live Translate enables real-time speech translation

Google also launched Gemini 3.5 Live Translate, delivering near real-time speech-to-speech translation across 70+ languages. The system translates while speakers are still talking, maintaining tone, pacing, and pitch. It supports over 2,000 language combinations in meetings without routing through a single intermediary language.
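The two figures are consistent with each other if a "combination" means an unordered pair of languages. The quick check below assumes exactly 70 languages, the lower bound quoted above.

```python
import math


def language_pairs(n_languages, directed=False):
    """Count language pairs; directed=True counts A->B and B->A separately."""
    if directed:
        return math.perm(n_languages, 2)
    return math.comb(n_languages, 2)


if __name__ == "__main__":
    print(language_pairs(70))                 # unordered pairs
    print(language_pairs(70, directed=True))  # ordered translation directions
```

With 70 languages there are 2,415 unordered pairs, which matches "over 2,000", and twice as many translation directions.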

Integration across apps and enterprise tools

The feature is rolling out via the Gemini Live API, Google AI Studio, and Google Meet, alongside consumer availability in the Google Translate app on Android and iOS. It is designed for noisy environments and includes features like headphone playback and phone “earpiece mode” for natural conversations.

Xiaomi enters AI coding with memory-focused agent

Xiaomi’s MiMo AI team released MiMo Code v0.1.0, an open-source terminal-based coding agent under an MIT license. It emphasizes persistent memory through structured logs, checkpoints, and SQLite-based retrieval, addressing a key limitation of coding agents: losing context over long sessions.
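As a rough illustration of what SQLite-backed agent memory can look like, here is a minimal sketch using Python's standard sqlite3 module. The table layout, column names, and the keyword-based recall function are all assumptions made for this example and do not describe Xiaomi's actual schema or retrieval method.

```python
import sqlite3


def open_memory(path=":memory:"):
    """Open (or create) a hypothetical memory store for an agent session."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memory (
               id   INTEGER PRIMARY KEY AUTOINCREMENT,
               step INTEGER NOT NULL,
               kind TEXT NOT NULL,   -- e.g. 'log' or 'checkpoint'
               note TEXT NOT NULL
           )"""
    )
    return conn


def remember(conn, step, kind, note):
    """Persist one structured note; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO memory (step, kind, note) VALUES (?, ?, ?)",
            (step, kind, note),
        )


def recall(conn, keyword, limit=3):
    """Return the most recent notes whose text contains the keyword."""
    cursor = conn.execute(
        "SELECT step, kind, note FROM memory "
        "WHERE note LIKE ? ORDER BY step DESC LIMIT ?",
        (f"%{keyword}%", limit),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = open_memory()
    remember(conn, 1, "log", "parser tests fail on empty input")
    remember(conn, 40, "checkpoint", "parser fixed, all tests green")
    remember(conn, 41, "log", "started refactoring the CLI module")
    print(recall(conn, "parser"))
```

Because the notes live on disk rather than in the prompt, an agent could look up a decision from step 1 at step 200 without carrying the whole history in context.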

Benchmarks highlight gains in long-running tasks

Xiaomi reports 82% on SWE-bench Verified and improved performance over competitors in extended workflows, with win rates exceeding 65% after 200 steps in developer tests. The system uses two agents, one writing code and one logging progress, and includes features such as workflow distillation and voice control.

Aggressive pricing and ecosystem strategy

MiMo models offer low-cost access, starting at $0.40 per million input tokens, undercutting rivals such as GPT-5.5 and Claude. The platform supports multiple model APIs and aligns with a broader trend among Chinese firms toward open tools and competitive pricing.
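To see what the starting price means in practice, the sketch below computes input-token cost only. The summary gives no output-token price, so none is assumed, and the example token count is made up for illustration.

```python
INPUT_PRICE_PER_MILLION_USD = 0.40  # starting input price quoted in the summary


def input_cost_usd(input_tokens, price_per_million=INPUT_PRICE_PER_MILLION_USD):
    """Cost of input tokens only; output tokens are priced separately."""
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical long agent session that reads 5 million input tokens.
    print(f"${input_cost_usd(5_000_000):.2f}")
```

At this rate a session that consumes 5 million input tokens costs $2.00 before output tokens are counted.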

OpenAI advances toward potential IPO

OpenAI confidentially filed for a U.S. IPO, reportedly targeting a valuation of up to $1 trillion. The company has seen rapid growth, with 900 million weekly users, 50 million subscribers, and approximately $2 billion in monthly revenue, though profitability is not expected before 2030.

CONCLUSION

The latest developments highlight a shift toward faster, more interactive AI systems, alongside intensifying competition in coding tools and mounting financial momentum as leading firms position for public markets.
