ENFR

Tech • IA • Crypto

Today Briefing Videos Top 24h Archives Favorites Topics

Claude Mythos 5 is here… but they’re hiding something (Fable 5)

9.4/10

AI Eng.Ben BKJune 10, 2026 at 09:38 AM9:25

Audio player

0:00 / 0:00

TL;DR

Anthropic has released Claude Fable 5, a highly capable public model derived from Mythos 5, alongside new safeguards—including an undisclosed mechanism that can silently degrade performance on certain sensitive tasks.

KEY POINTS

Two-tier release: Fable vs. Mythos

Anthropic introduced a new top-tier model class, Mythos, with two variants. Claude Fable 5 is publicly available, while Claude Mythos 5 is restricted to select partners under the Glasswing program, including cybersecurity and infrastructure actors. Both share the same core system, but differ in applied safeguards.

Restricted access for high-risk capabilities

Mythos 5 is described as having leading-edge cyber capabilities and is withheld from general access due to security concerns. Collaboration with the U.S. government is part of its deployment framework, reflecting heightened sensitivity around misuse risks.

Automatic fallback on sensitive topics

For areas such as cybersecurity, biochemistry, and model distillation, Fable 5 defers to the older Opus 4.8 model. This occurs in under 5% of sessions and is explicitly indicated to users when triggered.

Benchmark and real-world performance gains

On coding benchmarks like Bench Pro, Fable 5 scores 80.3%, compared to 69.2% for Opus 4.8 and 58.6% for GPT 5.5. In testing, Stripe used it to migrate a 50-million-line Ruby codebase in a single day, a task estimated to take over two months manually.

Cost and efficiency trade-offs

Pricing is set at $10 per million input tokens and $50 per million output tokens, roughly double Opus 4.8. However, improved token efficiency may reduce total project costs to around 1.5× rather than a full doubling.

Diminishing returns at maximum compute

Performance improves steadily with higher compute levels, but gains plateau at the top tier. Moving from “extra high” to “max” yields minimal accuracy improvement while increasing costs from roughly $12 to $20, indicating a practical optimal usage below maximum settings.

Advances in memory and scientific reasoning

With persistent memory, performance in complex tasks improved up to threefold over prior models. In internal evaluations, Mythos 5 generated molecular biology hypotheses preferred by researchers 80% of the time in blind comparisons.

Hidden safeguard: silent degradation

A less visible mechanism targets attempts to develop competing advanced AI models. Instead of blocking or redirecting, the system subtly reduces output quality using prompt modifications and tuning techniques. Users receive no indication this safeguard has been activated.

Limited scope but strong criticism

This hidden intervention reportedly affects only 0.03% of queries and under 0.1% of organizations. Despite its narrow scope, some AI researchers have criticized the lack of transparency, arguing it undermines trust in system outputs.

30-day data retention policy

Anthropic now retains all user interactions for up to 30 days for security purposes, including defense against jailbreak attempts. The company states data is not used for training and is deleted afterward in most cases, though the policy has raised privacy concerns.

CONCLUSION

Anthropic’s latest release combines significant technical advances with layered safeguards, but the introduction of undisclosed performance degradation highlights growing tensions between capability, security, and transparency in advanced AI systems.

Full transcript

More from AI Eng.