China’s New AI Is 6X More Efficient Than Claude

9.4/10

AI • AI Revolution • June 17, 2026 at 11:42 PM • 15:51

TL;DR

New open-weight coding models from China are undercutting leading AI systems on price while matching or surpassing them on several benchmarks, intensifying competition across the AI stack.

KEY POINTS

Kimi K2.7 Code targets agent workflows

Moonshot AI released Kimi K2.7 Code, a 1-trillion-parameter mixture-of-experts model optimized for coding agents rather than general use. It activates 32 billion parameters per token across 384 experts, so each token uses only a small share of the full model's parameters. The model is designed for multi-step development tasks such as navigating repositories, debugging, testing, and iterating across sessions.
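To make the sparsity concrete, here is a minimal sketch that computes the share of parameters active per token from the two figures quoted above. It treats the reported 1 trillion and 32 billion as exact round numbers, which is an assumption; the variable and function names are illustrative only.

```python
# Figures as reported in the summary; treated as exact round numbers (assumption).
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32 billion parameters activated per token


def active_fraction(active: int, total: int) -> float:
    """Return the share of the model's parameters used for a single token."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active per token: {fraction:.1%}")  # prints "Active per token: 3.2%"
```

In other words, roughly 3% of the weights take part in any one token, which is where the efficiency of a mixture-of-experts design comes from.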

Efficiency gains reduce real-world costs

The model reportedly uses 30% fewer “thinking tokens” than its predecessor, lowering latency and cost in iterative coding workflows. Features like forced thinking and preserved reasoning allow it to maintain context across multiple steps, a key requirement for autonomous coding agents operating over extended tasks.

Competitive benchmarks with notable gaps

K2.7 Code improves significantly over earlier versions, reaching 62.0 on Kimiko Bench V2 and 53.6 on Program Bench, though still trailing GPT-5.5, which scores 69.1 on Program Bench. However, it outperforms Claude Opus 4.8 on MCP Mark Verified (81.1 vs. 76.4), a benchmark reflecting real-world software environments like GitHub and databases.

Aggressive pricing reshapes economics

Kimi K2.7 Code costs $0.95 per million input tokens and $4 per million output tokens, compared with GPT-5.5 ($5/$30) and Claude Opus 4.8 ($5/$25). Cached input drops to $0.19, making it significantly cheaper for repeated workflows. Output pricing is especially critical, as coding agents consume large volumes during generation.
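The effect of these prices on an agent workload can be sketched as follows. The per-million-token prices are the ones quoted above; the workload size (2 million input tokens, 500,000 output tokens) is a hypothetical chosen only for illustration, and `Pricing` and `job_cost` are illustrative names, not part of any vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List prices in US dollars per million tokens."""
    input_per_m: float
    output_per_m: float


# Prices as quoted in the summary above.
PRICES = {
    "Kimi K2.7 Code": Pricing(0.95, 4.00),
    "GPT-5.5": Pricing(5.00, 30.00),
    "Claude Opus 4.8": Pricing(5.00, 25.00),
}
KIMI_CACHED_INPUT_PER_M = 0.19  # quoted cached-input price for Kimi K2.7 Code


def job_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one job at uncached list prices."""
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


# Hypothetical agent workload (assumption, not from the source).
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

for name, pricing in PRICES.items():
    print(f"{name}: ${job_cost(pricing, INPUT_TOKENS, OUTPUT_TOKENS):.2f}")

# Discount implied by the cached-input price relative to the normal input price.
cache_discount = 1 - KIMI_CACHED_INPUT_PER_M / PRICES["Kimi K2.7 Code"].input_per_m
print(f"Kimi cached-input discount: {cache_discount:.0%}")
```

Under this assumed mix, output tokens account for more than half of each bill, which is why the gap in output pricing matters most.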

GLM 5.2 pushes long-context engineering

Z.ai introduced GLM 5.2, a 753-billion-parameter open-weight model focused on long-horizon coding tasks. It supports a 1-million-token context window and introduces optimizations such as index sharing, which reduce compute costs by up to 2.9× at maximum context length.

Benchmark wins over GPT-5.5

GLM 5.2 outperforms GPT-5.5 on several major tests, including SWE-Bench Pro (62.1 vs. 58.6) and Frontier SWE (74.4% vs. 72.6%), while remaining competitive with Claude Opus 4.8. It also scores higher on tool-enabled reasoning benchmarks, indicating strength in complex, multi-step engineering scenarios.

Low pricing with full open access

Priced at $1.40 input / $4.40 output per million tokens, GLM 5.2 is about 6.8 times cheaper than GPT-5.5 on output ($30) and about 3.6 times cheaper on input ($5). It is released under a permissive MIT license, allowing enterprises to run, fine-tune, and deploy the model independently, avoiding vendor lock-in or geopolitical restrictions.
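The price gap depends on the mix of input and output tokens, as this short sketch shows. It uses the GLM 5.2 prices quoted here and the GPT-5.5 prices ($5 input, $30 output) quoted earlier in the summary; the `blended_ratio` helper and the example token mixes are hypothetical.

```python
# Dollars per million tokens, as quoted in the summary.
GLM_52 = {"input": 1.40, "output": 4.40}
GPT_55 = {"input": 5.00, "output": 30.00}

# How many times more GPT-5.5 costs than GLM 5.2 for each token type.
ratios = {kind: GPT_55[kind] / GLM_52[kind] for kind in GLM_52}


def blended_ratio(input_share: float) -> float:
    """Cost ratio for a workload where `input_share` of all tokens are input tokens."""
    output_share = 1.0 - input_share
    gpt = input_share * GPT_55["input"] + output_share * GPT_55["output"]
    glm = input_share * GLM_52["input"] + output_share * GLM_52["output"]
    return gpt / glm


print(f"Input:  {ratios['input']:.1f}x cheaper")
print(f"Output: {ratios['output']:.1f}x cheaper")
# Hypothetical mixes: half input tokens, and three input tokens per output token.
for share in (0.5, 0.75):
    print(f"{share:.0%} input tokens: {blended_ratio(share):.1f}x cheaper")
```

The ratio only approaches six or more for output-heavy workloads; input-heavy ones see a smaller advantage.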

Geopolitics amplify open-model appeal

Recent export controls led to certain advanced models being withdrawn globally, raising concerns about access stability. Open-weight alternatives like GLM 5.2 provide operational independence, allowing companies to maintain continuity regardless of policy changes.

Cursor acquisition highlights strategic value of coding tools

Reports indicate SpaceX may acquire Anysphere, the developer of Cursor, in a $60 billion all-stock deal. Cursor has reportedly reached a $4 billion annualized revenue run rate, equal to roughly 21% of SpaceX’s 2025 revenue.
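The reported figures imply two further numbers that are not stated in the summary, and this sketch derives them as a sanity check. All inputs are the reported values taken at face value; the derived figures are back-of-the-envelope estimates, not sourced facts.

```python
# Reported figures, in billions of US dollars (taken at face value).
DEAL_VALUE_BN = 60.0
CURSOR_RUN_RATE_BN = 4.0
SHARE_OF_SPACEX_REVENUE = 0.21  # "roughly 21%" of SpaceX's 2025 revenue

# SpaceX 2025 revenue implied by the run rate and the quoted share.
implied_spacex_revenue_bn = CURSOR_RUN_RATE_BN / SHARE_OF_SPACEX_REVENUE

# Deal value expressed as a multiple of the annualized run rate.
deal_multiple = DEAL_VALUE_BN / CURSOR_RUN_RATE_BN

print(f"Implied SpaceX 2025 revenue: ~${implied_spacex_revenue_bn:.1f}B")
print(f"Deal value / run rate: {deal_multiple:.0f}x")
```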

Compute and data seen as key drivers

Integration with large-scale infrastructure, including reported access to hundreds of thousands of GPUs, could accelerate Cursor’s capabilities. The platform also generates valuable data on real-world coding workflows, which can be used to improve AI coding systems.

OpenAI prepares next-generation voice model

OpenAI is developing GPT-BD1, a bidirectional voice system designed for more natural conversation. The model is expected to support simultaneous listening and speaking, smoother interruptions, and adjustable reasoning levels, addressing limitations in current voice assistants.

CONCLUSION

Open-weight models are rapidly closing the gap with proprietary systems on both performance and cost, while strategic moves in infrastructure and interfaces signal intensifying competition across the AI ecosystem.