
Anthropic has released Claude Fable 5, a highly capable public model derived from Mythos 5, alongside new safeguards, including an undisclosed mechanism that can silently degrade performance on certain sensitive tasks.
Anthropic introduced a new top-tier model class, Mythos, with two variants. Claude Fable 5 is publicly available, while Claude Mythos 5 is restricted to select partners under the Glasswing program, including cybersecurity and infrastructure actors. Both share the same core system, but differ in applied safeguards.
Mythos 5 is described as having leading-edge cyber capabilities and is withheld from general access due to security concerns. Collaboration with the U.S. government is part of its deployment framework, reflecting heightened sensitivity around misuse risks.
For requests touching on cybersecurity, biochemistry, or model distillation, Fable 5 hands off to the older Opus 4.8 model. This fallback occurs in under 5% of sessions and is explicitly indicated to users when triggered.
On coding benchmarks like Bench Pro, Fable 5 scores 80.3%, compared to 69.2% for Opus 4.8 and 58.6% for GPT 5.5. In testing, Stripe used it to migrate a 50-million-line Ruby codebase in a single day, a task estimated to take over two months manually.
Pricing is set at $10 per million input tokens and $50 per million output tokens, roughly double Opus 4.8. However, improved token efficiency may reduce total project costs to around 1.5× rather than a full doubling.
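The gap between "double the price" and "about 1.5× the cost" follows from simple arithmetic: if each token costs twice as much but a project consumes roughly 25% fewer tokens, the bill comes to 2 × 0.75 = 1.5 times the original. The sketch below works this through in Python. Only the $10 and $50 Fable 5 prices come from the figures above; the Opus 4.8 prices (taken as exactly half), the workload size, and the 0.75 token ratio are illustrative assumptions, not reported numbers.

```python
# Fable 5 prices as reported (USD per million tokens).
FABLE_INPUT_PER_M = 10.0
FABLE_OUTPUT_PER_M = 50.0

# Assumption: "roughly double" is treated as exactly 2x,
# so the Opus 4.8 prices are taken as half of Fable 5's.
OPUS_INPUT_PER_M = FABLE_INPUT_PER_M / 2
OPUS_OUTPUT_PER_M = FABLE_OUTPUT_PER_M / 2

# Assumption: Fable 5 needs 25% fewer tokens for the same project.
# This is the ratio implied by a 1.5x total cost at 2x the unit price.
TOKEN_RATIO = 0.75


def cost_usd(input_tokens: int, output_tokens: int,
             input_per_m: float, output_per_m: float) -> float:
    """Total cost in USD for a given token count and per-million prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical project workload when run on Opus 4.8.
opus_in, opus_out = 40_000_000, 8_000_000

# The same project on Fable 5, scaled by the assumed token ratio.
fable_in = int(opus_in * TOKEN_RATIO)
fable_out = int(opus_out * TOKEN_RATIO)

opus_cost = cost_usd(opus_in, opus_out, OPUS_INPUT_PER_M, OPUS_OUTPUT_PER_M)
fable_cost = cost_usd(fable_in, fable_out, FABLE_INPUT_PER_M, FABLE_OUTPUT_PER_M)
cost_ratio = fable_cost / opus_cost

print(f"Opus 4.8: ${opus_cost:.2f}")   # Opus 4.8: $400.00
print(f"Fable 5:  ${fable_cost:.2f}")  # Fable 5:  $600.00
print(f"Ratio:    {cost_ratio:.2f}x")  # Ratio:    1.50x
```

Because both prices double and both token counts shrink by the same factor, the ratio does not depend on the input/output mix. A different mix would change the dollar totals but not the 1.5× multiple.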
Performance improves steadily with higher compute levels, but gains plateau at the top tier. Moving from “extra high” to “max” yields minimal accuracy improvement while raising costs from roughly $12 to $20, an increase of about two thirds, which suggests the practical optimum sits below the maximum setting.
With persistent memory, performance in complex tasks improved up to threefold over prior models. In internal evaluations, Mythos 5 generated molecular biology hypotheses preferred by researchers 80% of the time in blind comparisons.
A less visible mechanism targets attempts to develop competing advanced AI models. Instead of blocking or redirecting, the system subtly reduces output quality using prompt modifications and tuning techniques. Users receive no indication this safeguard has been activated.
This hidden intervention reportedly affects only 0.03% of queries (about 3 in every 10,000) and under 0.1% of organizations. Despite its narrow scope, some AI researchers have criticized the lack of transparency, arguing that it undermines trust in system outputs.
Anthropic now retains all user interactions for up to 30 days for security purposes, including defense against jailbreak attempts. The company states data is not used for training and is deleted afterward in most cases, though the policy has raised privacy concerns.
Anthropic’s latest release combines significant technical advances with layered safeguards, but the introduction of undisclosed performance degradation highlights growing tensions between capability, security, and transparency in advanced AI systems.