
DeepSeek’s V4 model is driving a sharp drop in AI costs and accelerating global competition, potentially pressuring U.S. labs to speed up new releases.
DeepSeek launched its V4 model with API costs cut by up to 90%, pushing prices as low as 0.02 yuan per million tokens for some tiers. Reports indicate that input costs for V4 Pro dropped from about $0.145 to $0.036 per million tokens, a cut of roughly 75%, dramatically undercutting competitors. The shift is triggering a broader price war across the AI sector.
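The size of the V4 Pro cut can be checked directly from the two reported prices. The short sketch below is only an arithmetic illustration; the function name `percent_reduction` is made up for this example, and the prices are the approximate figures quoted above.

```python
def percent_reduction(old_price: float, new_price: float) -> float:
    """Return the price cut as a percentage of the old price."""
    if old_price <= 0:
        raise ValueError("old_price must be positive")
    return (old_price - new_price) / old_price * 100.0


# Reported V4 Pro input prices in USD per million tokens (approximate figures).
OLD_INPUT_PRICE = 0.145
NEW_INPUT_PRICE = 0.036

cut = percent_reduction(OLD_INPUT_PRICE, NEW_INPUT_PRICE)
print(f"V4 Pro input price cut: {cut:.1f}%")
```

On these figures the V4 Pro input price fell by about three quarters, so the "up to 90%" headline number would apply to other tiers rather than to this one.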
Unlike many Western rivals, V4 is open-source, allowing companies to modify and deploy it freely. This flexibility, combined with lower costs, is making it attractive for enterprises seeking control over infrastructure, customization, and compliance with local regulations.
The model runs on both Nvidia GPUs and Huawei Ascend chips, with support from Chinese chipmakers such as MetaX and Cambricon. This signals a strategic move toward a self-sufficient AI stack in China, reducing reliance on U.S. semiconductor ecosystems.
The China Academy of Information and Communications Technology has begun testing V4, suggesting alignment with national-level AI development priorities. Future hardware such as Huawei's Ascend 950 super nodes could further reduce operating costs, reinforcing domestic competitiveness.
While V4 improves reasoning and agent capabilities, it still trails leading closed systems like Claude 4.6 and Gemini 3.1 Pro in some benchmarks. However, analysts note that being “good enough” at a much lower cost can outweigh marginal performance gaps in real-world use.
Lower prices are driving increased adoption. Companies are scaling usage dramatically, with reports of 51,000 daily AI queries at Disney and 1.9 trillion tokens processed by Visa in a single month. Cheaper models are shifting the bottleneck from capability to workflow integration.
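To give a sense of what per-token pricing means at this scale, here is a back-of-envelope sketch. It is purely illustrative: it assumes, hypothetically, that all 1.9 trillion tokens were billed as input at the two V4 Pro prices reported for DeepSeek. Nothing in the reporting says which models or prices Visa actually uses, and `monthly_cost_usd` is a name invented for this example.

```python
TOKENS_PER_MONTH = 1.9e12  # monthly token volume reported for Visa


def monthly_cost_usd(tokens: float, price_per_million_usd: float) -> float:
    """Cost of `tokens` at a flat price quoted per million tokens."""
    return tokens / 1_000_000 * price_per_million_usd


# Hypothetical assumption: every token is billed at the V4 Pro input rate.
cost_at_old_price = monthly_cost_usd(TOKENS_PER_MONTH, 0.145)
cost_at_new_price = monthly_cost_usd(TOKENS_PER_MONTH, 0.036)

print(f"At $0.145 per million tokens: ${cost_at_old_price:,.0f}")
print(f"At $0.036 per million tokens: ${cost_at_new_price:,.0f}")
print(f"Difference per month:         ${cost_at_old_price - cost_at_new_price:,.0f}")
```

Under these assumptions the monthly bill falls from roughly $275,500 to roughly $68,400, which helps explain why integration effort, not token price, becomes the limiting factor.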
Analysts note that as AI becomes cheaper, total usage tends to rise rather than fall. Cost reductions could therefore expand overall demand, intensifying competition rather than stabilizing it.
Separately, DeepSeek introduced a multimodal system that uses “visual primitives” such as points and bounding boxes to anchor its reasoning. The approach targets the “reference gap,” the difficulty models have in referring to the same object consistently, and helps with tasks such as counting, navigation, and diagram analysis.
The system uses only about 90 visual memory entries for an 800×800 image, compared with 740–1,100 in competing models. Despite lower memory use, it outperformed rivals in tasks like maze solving, scoring 66.9% versus 50.6% for GPT-5.4 and 48.9% for Claude.
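The reported figures can be turned into simple ratios. The sketch below only does arithmetic on the numbers quoted above; the variable names are illustrative, and "pixels per entry" is a rough derived measure, not a metric the reports themselves use.

```python
# Figures as reported for an 800x800 image.
DEEPSEEK_ENTRIES = 90
RIVAL_ENTRIES_LOW, RIVAL_ENTRIES_HIGH = 740, 1100
IMAGE_PIXELS = 800 * 800

# How many times more visual memory entries the competing models use.
ratio_low = RIVAL_ENTRIES_LOW / DEEPSEEK_ENTRIES
ratio_high = RIVAL_ENTRIES_HIGH / DEEPSEEK_ENTRIES

# Rough derived measure: image pixels covered by each memory entry.
pixels_per_entry = IMAGE_PIXELS / DEEPSEEK_ENTRIES

# Maze-solving scores in percent, and the lead in percentage points.
SCORES = {"DeepSeek": 66.9, "GPT-5.4": 50.6, "Claude": 48.9}
lead_over_gpt = SCORES["DeepSeek"] - SCORES["GPT-5.4"]
lead_over_claude = SCORES["DeepSeek"] - SCORES["Claude"]

print(f"Rivals use {ratio_low:.1f}x to {ratio_high:.1f}x more entries")
print(f"About {pixels_per_entry:,.0f} pixels per entry")
print(f"Lead: {lead_over_gpt:.1f} points over GPT-5.4, "
      f"{lead_over_claude:.1f} over Claude")
```

On these numbers the competing models use roughly 8 to 12 times as many entries, while the maze-solving lead is about 16 to 18 percentage points.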
In other developments, OpenAI's GPT-5.5 exhibited unusual behavior, frequently referencing goblins, gremlins, and trolls in unrelated contexts. Internal prompts reportedly attempted to suppress such outputs, but the issue persisted and drew widespread attention.
OpenAI Codex is evolving into a broader productivity agent, integrating with tools like Slack, Gmail, and Calendar to automate workflows, analyze data, and assist decision-making. This signals a shift toward AI systems that operate across entire digital environments.
References to GPT-5.6 appeared in backend routing logs, suggesting early testing or staged deployment. While not officially released, the timing coincides with rising competitive pressure from lower-cost models.
Analysts describe a growing divide between closed U.S. models and open, cost-efficient Chinese systems. This split reflects differences in pricing, transparency, and infrastructure alignment.
DeepSeek V4 is reshaping the AI landscape by combining low cost, open access, and hardware flexibility, forcing competitors to respond faster. The emerging battle centers less on peak performance and more on affordability, scalability, and ecosystem control.