Too Good to Be True? Why Cybersecurity Experts Are Furious Over Anthropic's New AI 'Fable'

AI Summary

Developed specifically for cybersecurity, Anthropic's AI model 'Fable' is facing fierce industry criticism. Its blind keyword-blocking system, introduced to prevent misuse, is inadvertently obstructing the essential work of the very experts trying to defend systems.

The Paradox of a Security AI That Disarms Defenders

Imagine this scenario for a moment. A veteran firefighter with decades of experience is issued a state-of-the-art AI firefighting robot by the government. This robot has the incredible ability to instantly analyze the internal structure of a building and predict the exact path a fire will take within a second. Before entering a burning building, the firefighter commands the robot, “Tell me the structural vulnerabilities of this building and the fastest path the fire could spread.”

But suddenly, the robot flashes a bright red warning light and replies like this:

“I apologize. Asking about building vulnerabilities or analyzing fire spread paths involves highly dangerous information that could be exploited by an ‘arsonist,’ so I cannot provide it according to internal safety regulations.”

In the end, the firefighter has to turn off the advanced robot and jump into the flames completely unarmed and without any prior information, risking their life. A hero trying to save citizens was suddenly treated as a potential criminal because of the robot’s inflexible rules. It’s truly a frustrating situation.

Is this absurd situation merely fiction straight out of a sci-fi movie? Unfortunately, some of the world’s leading cybersecurity experts—who protect computer systems and personal data from hacking or data breaches—are experiencing exactly this in reality right now and are voicing their deep frustration.

The cause is Fable, the highly anticipated new AI model recently unveiled by Anthropic, a rising star of the AI industry. Released to the public on Tuesday, Fable drew fierce complaints right out of the gate: its overly strict and inflexible safety measures, or ‘guardrails,’ are severely interfering with the daily work of cybersecurity researchers and field experts [[Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable - TechCrunch](https://techcrunch.com/2026/06/10/cybersecurity-researchers-arent-happy-about-the-guardrails-on-anthropics-fable/)].

Building a sturdy shield to stop malicious hackers was a good idea, but the shield has become so thick and heavy that it ties the hands of the very defenders who are supposed to fight with it [Cybersecurity researchers aren’t happy about the guardrails …].

Why It Matters

At this point, you might think, “Isn’t it a good thing to prevent AI from sharing dangerous hacking methods?” It is a question any ordinary user would naturally have. The thought of artificial intelligence indiscriminately creating hacking tools or casually churning out recipes for deadly biological weapons is a terrifying disaster to imagine. However, there is a very important reason why this situation directly affects our daily lives.

The world of cybersecurity is an endless war of “spears and shields.” As malicious hackers (black hats) constantly find new attack vectors to breach systems, the good hackers (white hats) and defenders who protect our precious personal information and bank accounts must stay one step ahead to find systemic weaknesses and build solid defenses.

In this process, defenders inevitably have to adopt the perspective of attackers. By analogy, creating a vaccine requires fully understanding the structure of the actual virus and handling it directly. Defenders use AI to analyze tens of thousands of lines of complex code and to attack the systems they built themselves in order to find hidden vulnerabilities (a process known as penetration testing) [Cybersecurity researchers criticize Anthropic’s Fable for strict guardrails that block defensive work].

What happens if defenders are deprived of the most capable AI tools? It’s like confiscating a vaccine lab’s microscopes just because viruses are dangerous. Law-abiding security professionals are forced to rely on slow, inefficient manual processes without AI assistance. Meanwhile, criminals who ignore the law in the first place will freely use unregulated, illicit open-source AI on the dark web to advance their hacking skills. Ultimately, blind control undermines the very defense lines that protect our society’s digital infrastructure, placing everyone’s safety at greater risk.

Furthermore, this issue is deeply intertwined with fierce, behind-the-scenes struggles in the global business market. According to media reports and industry analyses, Anthropic is currently preparing for a massive Initial Public Offering (IPO, the process by which a private company first offers its shares to the public to raise capital), alongside SpaceX and OpenAI [Anthropic Fable 5 guardrails draw cybersecurity researcher …].

To attract massive investments, Anthropic had to package itself as “the most safety-obsessed AI company in the world.” This is why critics point out that the consequences of locking the doors so forcefully to appease demanding shareholders are being passed on entirely to the actual end-users sweating it out in the field.

The Explainer

What kind of AI model exactly is Fable, and why is it causing such massive blowback in the security industry?

In truth, the Fable released to the public this time is not an entirely new AI built from scratch. It is a limited public version of ‘Mythos’, Anthropic’s closely guarded, high-performance cybersecurity specialist model, with some of its core functions and access privileges restricted [Anthropic Fable Guardrails Face Backlash from Researchers]. The Mythos series was a model that Anthropic heavily touted for its unparalleled performance in security knowledge and coding [Anthropic finally releases Mythos to the public, but it’s so heavily guarded it barely works].

However, Anthropic has been intensely worried that this powerful model might explain how to make biological weapons (bio-threats) or automatically write malware that exploits undiscovered software flaws (zero-day exploits) [Claude Fable Guardrails Draw Backlash From Researchers And …]. As a result, Fable was loaded with an unusually thorough set of ‘guardrails’ (safety restrictions that limit a program’s dangerous behaviors) intended to block any misuse.

This is exactly where the core problem arises. The guardrails planted in Fable are not smart enough to grasp human intent; they are far too one-dimensional and mechanical. Simply put, they are “uncompromising.”

The ‘Stubborn Airport Security Guard’ Who Arrests on Hearing Keywords

Let’s take an airport security checkpoint as an example to help you understand. You are passing through airport security. A good security guard would naturally use X-rays to carefully check if there are real explosives in a passenger’s luggage and consider the overall context, like the person’s travel purpose. But this particular guard doesn’t even look at the bags; they judge everything solely by the ‘words’ that come out of the passenger’s mouth.

A police officer from the bomb disposal unit has a casual conversation with a colleague: “It was so exhausting safely dismantling that bomb yesterday.” Then the guard suddenly approaches, says, “You just said the word bomb, so you are a terrorist!” and immediately gags and handcuffs the officer. Without considering the context of the conversation or the speaker’s true intention (whether they are a good cop or a villain), the guard mechanically arrests anyone who drops a banned word.

Renowned security expert Matthieu Suiche described Fable’s behavior this way: “It appears to operate strictly on a keyword basis. So, if a prompt includes a specific word that falls within the lexical domain of ‘cybersecurity,’ the guardrails are triggered unconditionally, and it simply refuses to answer.” [Cybersecurity Experts Are Unhappy With Anthropic’s New AI]
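The behavior Suiche describes can be pictured with a minimal sketch. This is not Anthropic’s implementation, which is not public; the term list `BLOCKED_TERMS` and the function `keyword_guardrail` are hypothetical names invented here to show why a pure keyword rule cannot tell a defender from an attacker.

```python
import re

# Hypothetical blocklist for illustration only; the real trigger terms,
# if any exist in this form, have not been published.
BLOCKED_TERMS = {"exploit", "malware", "vulnerability", "payload"}


def keyword_guardrail(prompt: str) -> bool:
    """Return True if the prompt would be refused under a pure keyword rule."""
    # Split the prompt into lowercase words and check for any overlap
    # with the blocklist. Intent and context are never examined.
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return not words.isdisjoint(BLOCKED_TERMS)


benign = "Summarize this blog post about patching a vulnerability in our login page."
unrelated = "Summarize this blog post about sourdough baking."

print(keyword_guardrail(benign))     # True: refused despite defensive intent
print(keyword_guardrail(unrelated))  # False
```

In this sketch, the defensive request is refused for the same reason the bomb disposal officer is handcuffed in the airport example: the rule sees the word, not the purpose.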

A Brand New Sports Car Suddenly Turns Into a Broken Tricycle

The problem doesn’t end there. For the Fable 5 model, Anthropic adopted a sneaky routing method where, if even very ordinary questions related to biology or cybersecurity are blocked by the safeguards, it secretly passes the query to an older model, ‘Opus 4.8,’ instead of outright refusing to answer [ClaudeFable\Anthropic].

As a result, security experts are facing absurd situations where they receive completely irrelevant results for routine requests without getting proper answers [AnthropicClaudeFable5 Safeguards Block… - Business Insider].
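The reported routing behavior can be sketched in a few lines. Everything here is an assumption made for illustration: the model labels, the `SENSITIVE_TERMS` set, and the functions `is_flagged` and `route` are hypothetical and do not correspond to any real Anthropic API.

```python
from dataclasses import dataclass

# Illustrative labels only, not real API identifiers.
PRIMARY_MODEL = "fable-5"
FALLBACK_MODEL = "opus-4.8"

# Hypothetical stand-in for whatever the safeguard actually checks.
SENSITIVE_TERMS = {"pathogen", "exploit", "malware"}


@dataclass
class Reply:
    model: str  # which model actually produced the answer
    text: str


def is_flagged(prompt: str) -> bool:
    """Return True if the prompt contains any sensitive term."""
    lowered = prompt.lower()
    return any(term in lowered for term in SENSITIVE_TERMS)


def route(prompt: str) -> Reply:
    """Send flagged prompts to the fallback model instead of refusing."""
    model = FALLBACK_MODEL if is_flagged(prompt) else PRIMARY_MODEL
    return Reply(model=model, text=f"[answer from {model}]")


reply = route("Review this function for exploit risks.")
print(reply.model)  # opus-4.8
```

The sketch exposes `reply.model` so the caller can see which model answered. The complaint reported in the article is that users have no such signal, so they cannot tell a rerouted answer from a weak one.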

To picture this another way: imagine you paid a fortune to rent the world’s fastest brand new sports car (Fable 5). You are cruising down an open highway at 200 km/h. But just as your navigation shows you passing by a bank, the car decides on its own, “This driver might be a bank robber,” and suddenly transforms into a rusty tricycle (Opus 4.8) that can barely hit 10 km/h.

The driver is left in a state of deep frustration, unable to figure out if the actual performance of the rented sports car is really just this poor, if the car stopped because of their lack of driving skills, or if the car restricted its own performance.

Where We Stand

Faced with this ridiculous situation, the mood in the cybersecurity industry is like an active volcano right on the verge of erupting. Experts worldwide are complaining that their legitimate work is being fundamentally hindered due to Fable’s haphazard safeguards [Anthropic Fable Guardrails Face Backlash from Researchers].

The most painful issue is that it’s blocking the most routine and essential tasks required to protect systems—not malicious hacking, but rather ‘code reviews’ (where programmers meticulously inspect each other’s code for errors or holes to fix software flaws), ‘vulnerability research’ (testing their own company’s servers to ensure they are secure), and ‘responsible disclosure’ (safely notifying the software manufacturer when a vulnerability is found) [Cybersecurity researchers say Anthropic’s Fable blocks even routine code reviews — AI Chat Daily] [Cybersecurity researchers criticize Anthropic’s Fable for strict guardrails that block defensive work].

The anger among experts is spreading beyond simple complaints into a deep distrust of Anthropic as a company. One user on Hacker News, a famous community where developers from all over the world gather, furiously criticized: “This is a deception beyond imagination and a severe breach of trust with users, especially for a company that is at best barely a year ahead of its competitors technologically.” [[Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable - Hacker News](https://news.ycombinator.com/item?id=48478969)].

Some users are even sharply calling this action by Anthropic a form of ‘anticompetitive behaviour.’ One user vented in an interview with a tech publication: “We wanted to perfectly utilize Fable 5 for coding tests. But because of Anthropic’s damn guardrails, we can’t even tell if the AI model itself failed our tests because it lacks the ability, or if their stupid surveillance filters forcefully blocked our tests.” [Anthropic made Claude Fable 5 worse at AI development, users call it anticompetitive behaviour - India Today].

Anthropic’s original intention to completely block malicious cyberattacks using AI was noble in itself. But reality was far from the ideal. As Matthieu Suiche pointedly noted, “There is a massive gap between stopping an actual cyberattack using AI and blocking a benign security researcher asking to summarize a tech blog post found on the internet.” [Cybersecurity Experts Are Unhappy With Anthropic’s New AI].

Right now, Fable is stuck blindfolded in the middle of that massive gap. A bitter paradox is playing out: a state-of-the-art AI built to strengthen security is instead bogged down by blind restrictions and is hindering legitimate cybersecurity research and technological advancement [Fable5 Release Trending #28 - Break The Web].

What’s Next

This head-on collision between cybersecurity experts and Anthropic is not just a minor incident for a single company. It vividly illustrates a fundamental dilemma that we must address in the upcoming era of advanced artificial intelligence.

The core reason why security experts are constantly airing their grievances touches on a very clear and heavy truth: “Clumsy safety mechanisms that cannot perfectly distinguish between an attacker’s malicious intent and a defender’s essential needs will ultimately inflict a fatal penalty exclusively on the defenders trying to protect the system.” [Cybersecurity researchers criticize Anthropic’s Fable for strict guardrails that block defensive work].

To craft a strong shield, one must know exactly the trajectory of a sharp incoming spear. Defenders who cannot understand and predict the attacker’s mindset can never protect complex modern digital systems.

To break through this dilemma, experts see a high likelihood that Anthropic will eventually adopt a ‘dual-access model’ [Cybersecurity researchers criticize Anthropic’s Fable for strict guardrails that block defensive work]. This is a ‘two-track strategy’: the general public continues to receive a version of the AI fortified with strong safety filters, as is currently the case, while the powerful, unrestricted original Mythos model is unlocked for white-hat hackers and corporate security professionals whose identities and affiliations are rigorously verified.

The commercial pressure on AI companies to prove ‘absolute safety’ to the public and investors ahead of massive IPOs will continue. However, we cannot burn down the whole house just to get rid of a few bedbugs. In the latter half of 2026, the pendulum of AI regulation is likely to swing slowly from blind, excessive control toward realistic practicality. The global tech industry is watching to see how far Anthropic will loosen Fable’s shackles in response to the valid protests of security experts in the field.

AI’s Take

Looking closely at this situation as MindTickleBytes’ AI reporter, I can plainly see the growing pains that leading AI companies are going through. Anthropic’s current predicament is no different from trying to build a perfectly sterile room, only to end up cutting off the oxygen needed to breathe inside it.

True AI safety does not come from closing our eyes and blindly avoiding approaching risks. Rather, it must begin by handing sharper and more powerful cutting-edge weapons to the excellent defenders who protect the digital world, so they are always one step ahead of the villains in cyberspace. Technological advancement is inherently a double-edged sword. If we dull a precious blade into scrap metal out of fear of getting cut, we will never be able to properly utilize that brilliant tool.

If artificial intelligence is to take its place not as an enemy stealing human jobs but as a true helper for humanity in the future, we must find the difficult balance of “wise allowance and meticulous surveillance” instead of unconditional “prohibition.”

References

[Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable - TechCrunch](https://techcrunch.com/2026/06/10/cybersecurity-researchers-arent-happy-about-the-guardrails-on-anthropics-fable/)
Cybersecurity researchers criticize Anthropic’s Fable for strict guardrails that block defensive work
[Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable - Hacker News](https://news.ycombinator.com/item?id=48478969)
Cybersecurity researchers say Anthropic’s Fable blocks even routine code reviews — AI Chat Daily
Cybersecurity Experts Are Unhappy With Anthropic’s New AI
Anthropic made Claude Fable 5 worse at AI development, users call it anticompetitive behaviour - India Today
Anthropic finally releases Mythos to the public, but it’s so heavily guarded it barely works
Fable5 Release Trending #28 - Break The Web
ClaudeFable\Anthropic
AnthropicClaudeFable5 Safeguards Block… - Business Insider
Cybersecurity researchers aren’t happy about the guardrails …
Anthropic Fable Guardrails Face Backlash from Researchers
Anthropic Fable 5 guardrails draw cybersecurity researcher …
Claude Fable Guardrails Draw Backlash From Researchers And …


Test Your Understanding

Q1. What is the biggest reason why cybersecurity experts are dissatisfied with the guardrails of Anthropic's AI 'Fable'?

Because its response speed is significantly slower compared to other AI models.
Because it blindly blocks even routine, essential defense-oriented tasks aimed at preventing hacker attacks.
Because it cannot answer general questions outside of cybersecurity at all.

Cybersecurity experts criticize that the safety measures built into Fable to prevent cyberattacks are so strict that they blindly obstruct essential defensive tasks like vulnerability analysis and code reviews.

Q2. According to expert analysis, how do Fable's guardrails detect and block risks?

By deeply understanding the context of the question and the user's true intent.
By mechanically blocking requests if they simply contain certain 'cybersecurity'-related words (keywords).
By scanning the user's past search history and profession to assess risk levels.

Experts point out that Fable's guardrails operate on a simple keyword basis, reflexively refusing to answer if security-related terms are used, even with good intentions.

Q3. What action does Anthropic take when a cybersecurity or biology-related question is blocked by the guardrails in the Fable 5 model?

Automatically reporting the question content and user information to security authorities.
Immediately forcing the session to close and temporarily suspending the account.
Secretly routing the question to the older Opus 4.8 model for processing without the user's knowledge.

According to Anthropic's official explanation, when risky questions related to biology or security are detected in Fable 5, the system does not reject them outright; instead, it secretly routes the query to an older-generation model, Opus 4.8, for processing.