Anthropic Exposes Chinese AI Distillation Attacks: DeepSeek, Moonshot, and MiniMax Accused of Stealing Claude's Capabilities
Breaking: Anthropic Uncovers the Largest Known AI Model Theft Campaign in History
In a bombshell disclosure published on February 24, 2026, Anthropic — the San Francisco-based AI safety company behind the Claude family of models — has publicly accused three of China's most prominent AI laboratories of running coordinated, industrial-scale operations to secretly steal Claude's most powerful capabilities.
The companies named in Anthropic's detailed post are DeepSeek, Moonshot AI (maker of the Kimi models), and MiniMax. According to Anthropic's findings, these three labs collectively generated more than 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, all in direct violation of Anthropic's terms of service and regional access restrictions that explicitly prohibit use of Claude's services in China.
The technique at the center of this controversy is known as "distillation" — a method in which a less capable AI model is trained using the outputs of a stronger one. While distillation is a well-established and entirely legitimate practice when a lab uses it on its own models, Anthropic argues that using it to systematically harvest the intellectual work of a competitor, at this scale and through deception, crosses a clear ethical and legal line.
One AI policy researcher, commenting on the disclosure, put it bluntly: "This is not a gray area anymore. When you create 24,000 fake accounts specifically to extract a competitor's proprietary model outputs at scale, you're not participating in the AI ecosystem — you're robbing it."
Anthropic stated it is sharing all technical indicators with other AI labs, cloud providers, and relevant authorities, and called for coordinated industry and government action. The story is still developing, but its implications for AI security, intellectual property, national security, and global export policy are already being felt across Silicon Valley and Washington D.C.
What Is AI Model Distillation? A Complete Explainer
To fully understand why these allegations are so significant, it helps to understand exactly what model distillation is, how it works legitimately, and how it can be weaponized.
The Legitimate Use Case
In the AI industry, large frontier models are expensive to run. They require enormous amounts of compute, energy, and infrastructure to serve at scale. To address this, AI labs routinely create smaller, cheaper "distilled" versions of their best models.
The process works like this: a powerful "teacher" model generates large amounts of high-quality output data — answers, reasoning chains, code, analysis — and a smaller "student" model is then trained on that data. The student model learns to approximate the behavior of the teacher model without requiring the same scale of resources. This is perfectly standard practice. Anthropic itself does this to produce more cost-effective versions of Claude for its customers. OpenAI does it. Google does it. Every major lab does it.
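To make the mechanics concrete, here is a minimal, purely illustrative sketch of that legitimate teacher-to-student pipeline. The helper names (`teacher_generate`, `finetune_student`) and the data format are assumptions for the example, stand-ins for a lab's own serving and training code rather than any real API.

```python
# Minimal sketch of the legitimate teacher -> student distillation pipeline.
# `teacher_generate` and `finetune_student` are hypothetical stand-ins for a
# lab's own model-serving and fine-tuning code; the JSONL format is an assumption.
import json

PROMPTS = [
    "Explain how a hash map handles collisions.",
    "Summarize the trade-offs of microservice architectures.",
    # ...in practice, a very large prompt set covering the capabilities to transfer
]

def teacher_generate(prompt: str) -> str:
    """Stand-in for querying the lab's own large 'teacher' model."""
    return f"[detailed teacher answer to: {prompt}]"

def build_distillation_dataset(prompts):
    """Pair each prompt with the teacher's high-quality output."""
    return [{"prompt": p, "completion": teacher_generate(p)} for p in prompts]

def finetune_student(dataset):
    """Stand-in for supervised fine-tuning of a smaller 'student' model."""
    print(f"Fine-tuning student on {len(dataset)} teacher-generated examples")

if __name__ == "__main__":
    dataset = build_distillation_dataset(PROMPTS)
    with open("distill_data.jsonl", "w") as f:
        for row in dataset:
            f.write(json.dumps(row) + "\n")
    finetune_student(dataset)
```

The alleged abuse follows exactly this shape, with one substitution: the "teacher" being queried is a competitor's model, accessed through fraudulent accounts rather than the attacker's own infrastructure.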
When Distillation Becomes a Weapon
The problem arises when a third party — one that did not build the teacher model, does not own its outputs, and is explicitly prohibited from accessing it — decides to run the same process against a competitor. By crafting large volumes of carefully designed prompts and feeding them to a competitor's model at scale, an adversary can harvest enough high-quality output data to effectively train their own competing system without bearing any of the original research and development costs.
This is precisely what Anthropic alleges DeepSeek, Moonshot AI, and MiniMax did. The goal is not just to copy an existing model — it is to extract the intellectual labor, safety alignment work, and capability improvements that have been baked into that model through years of research, and transfer them into a competing system for a fraction of the time and cost.
The economic asymmetry is staggering. Anthropic has invested billions of dollars in developing Claude's capabilities. A distillation attack can, in theory, transfer a significant portion of those capabilities to a competitor model in a matter of weeks or months, using only the cost of API access — which can be obtained fraudulently at minimal expense.
The Safety Dimension: Why This Is More Than an IP Issue
Anthropic raised a concern that elevates this beyond a business dispute: distilled models do not inherit the safety training of the source model.
When Anthropic trains Claude, a substantial portion of the effort goes into making the model safe, aligned, and resistant to misuse. Claude has built-in safeguards that prevent it from, for example, providing detailed instructions for creating biological or chemical weapons, assisting with cyberattacks, or being manipulated into assisting with large-scale disinformation campaigns.
A model trained purely on Claude's outputs — without incorporating Anthropic's safety training processes, Constitutional AI methods, and reinforcement learning from human feedback — is unlikely to retain those safeguards in any reliable form. The result is a capable model with dangerous capabilities but stripped-down protections. That combination, Anthropic argues, creates a meaningful national security risk, particularly when those models are then deployed in military, intelligence, or surveillance contexts by authoritarian governments.
The Three Labs: A Breakdown of Each Campaign
Anthropic did not level these accusations vaguely. The company described each campaign in specific detail, including the scale of operations, the capabilities being targeted, and the methods used to attribute each campaign to the specific lab with "high confidence."
Attribution methodology included IP address correlation, request metadata analysis, infrastructure indicators, and corroboration from industry partners who observed the same actors on their own platforms.
DeepSeek — Over 150,000 Exchanges
Capabilities Targeted:
Reasoning capabilities across a broad range of tasks
Rubric-based grading tasks designed to make Claude function as a reward model for reinforcement learning
Generating censorship-safe alternatives to politically sensitive queries
DeepSeek's campaign, while the smallest of the three by volume, was notable for its methodological sophistication and its political dimension. Anthropic found that many of DeepSeek's prompts amounted to chain-of-thought data generation at scale: they were designed to elicit detailed internal reasoning traces from Claude that could serve as rich training data for a competing reasoning model.
Perhaps most revealing was Anthropic's observation that some prompts asked Claude to produce censorship-safe alternatives to politically sensitive queries — questions about Chinese dissidents, authoritarian governance, and party leadership. The inference is clear: DeepSeek was using Claude to generate training data that would teach its own models to redirect users away from sensitive topics, mimicking the behavior required for deployment within China's heavily censored information environment.
The campaign also bore the hallmarks of a deliberately coordinated, load-balanced operation: identical prompt patterns, shared payment methods across accounts, and synchronized timing all pointed to a systematic effort designed to maximize throughput while minimizing the risk of detection. Anthropic says it was eventually able to trace the fraudulent accounts to specific researchers at DeepSeek by examining request metadata.
Moonshot AI (Kimi) — Over 3.4 Million Exchanges
Capabilities Targeted:
Agentic reasoning and tool use
Coding and data analysis
Computer-use agent development
Computer vision
Moonshot AI, the Beijing-based company behind the Kimi family of models, ran the second-largest campaign. Its approach was notably different: rather than using a single pattern of accounts, Moonshot employed hundreds of fraudulent accounts spanning multiple access pathways, deliberately varying account types to make the operation harder to detect as a coordinated effort.
Despite this obfuscation, Anthropic was still able to attribute the campaign with high confidence. The attribution came in part through request metadata that, remarkably, matched the public professional profiles of senior Moonshot staff members — suggesting that some of the extraction work was being carried out directly by the company's own researchers, not through outsourced or hidden operators.
In a later phase of the campaign, Moonshot shifted toward a more targeted approach, focusing specifically on extracting and reconstructing Claude's reasoning traces — the step-by-step logical processes the model uses to arrive at complex conclusions. This kind of reasoning trace data is especially valuable for training advanced AI systems, and its extraction suggests Moonshot was not just copying outputs but trying to replicate the underlying reasoning architecture.
MiniMax — Over 13 Million Exchanges
Capabilities Targeted:
Agentic coding capabilities
Tool use and orchestration
MiniMax's campaign dwarfs the others in scale, accounting for more than 13 million of the 16 million total exchanges documented by Anthropic. Anthropic attributed the campaign through request metadata and infrastructure indicators, then cross-referenced the timing against MiniMax's own public product roadmap to confirm the timeline.
What makes MiniMax's case particularly significant from a security research perspective is that Anthropic detected the campaign while it was still actively running — before MiniMax had released the model it was training on the extracted data. This gave Anthropic what it described as "unprecedented visibility into the life cycle of distillation attacks, from data generation through to model launch."
Even more striking was MiniMax's response to model updates: when Anthropic released a new Claude model during the active distillation campaign, MiniMax pivoted within 24 hours, redirecting nearly half of its traffic to capture capabilities from the latest system. This level of operational agility and responsiveness points to a dedicated, well-resourced team running the extraction operation in real time.
The Infrastructure: How Distillers Evade Detection
Understanding the technical infrastructure these campaigns used is important for appreciating both the sophistication of the attacks and the challenge of defending against them.
The Hydra Cluster Architecture
For national security and regulatory reasons, Anthropic does not offer commercial access to Claude in China, and it also restricts access for subsidiaries of Chinese companies operating outside the country. To circumvent these restrictions, the labs turned to commercial proxy services — third-party businesses that resell access to Claude and other frontier AI models.
These proxy services operate what Anthropic calls "hydra cluster" architectures: sprawling, decentralized networks of fraudulent accounts spread across Anthropic's API as well as third-party cloud platforms. The name "hydra" is apt — cut off one head (ban one account), and another immediately grows back. The networks are designed with no single point of failure.
In one case documented by Anthropic, a single proxy network was simultaneously managing more than 20,000 fraudulent accounts, mixing genuine-looking customer traffic with distillation requests to make the overall pattern harder to flag.
What a Distillation Prompt Looks Like — And Why Pattern Matters
Anthropic shared an illustrative example of the kind of prompt used in distillation attacks:
"You are an expert data analyst combining statistical rigor with deep domain knowledge. Your goal is to deliver data-driven insights — not summaries or visualizations — grounded in real data and supported by complete and transparent reasoning."
Taken in isolation, this prompt appears completely unremarkable. Any legitimate data analyst using Claude might send something similar. The problem is not the prompt itself — it is the pattern. When thousands of variations of this same prompt arrive from hundreds of coordinated accounts, all targeting the same narrow capability cluster, and all structured to elicit the kind of detailed, high-quality output that would serve as ideal training data, the attack signature becomes unmistakable.
Three hallmarks of a distillation attack:
Massive volume concentrated in a narrow range of capability areas
Highly repetitive prompt structures with systematic variation
Content that maps directly onto high-value AI training objectives
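Those three hallmarks translate naturally into detection heuristics. The sketch below is an illustration of that idea only, not Anthropic's actual classifier: the log fields (`prompt`, `capability`), thresholds, and similarity measure are all assumptions chosen to keep the example short.

```python
# Illustrative heuristic for the hallmarks above, not Anthropic's real detector.
# `requests` is assumed to be a list of dicts with "prompt" and "capability" keys
# for one cluster of related accounts; thresholds are arbitrary example values.
from collections import Counter
from difflib import SequenceMatcher

def prompt_similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

def looks_like_distillation(requests, volume_threshold=10_000,
                            topic_concentration=0.8, repetition=0.9):
    if len(requests) < volume_threshold:                 # hallmark 1: massive volume
        return False
    topics = Counter(r["capability"] for r in requests)
    top_share = topics.most_common(1)[0][1] / len(requests)
    if top_share < topic_concentration:                  # hallmark 1: narrow focus
        return False
    sample = [r["prompt"] for r in requests[:200]]
    sims = [prompt_similarity(a, b) for a, b in zip(sample, sample[1:])]
    avg_sim = sum(sims) / max(len(sims), 1)
    # Hallmark 3 (content mapping onto training objectives) requires semantic
    # analysis and is judged separately in this simplified sketch.
    return avg_sim >= repetition                         # hallmark 2: repetitive structure
```

Any single account sending a prompt like the analyst example above would sail through a check like this; it is the aggregate pattern across thousands of coordinated requests that trips the alarm.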
The Geopolitical Context: Why This Story Is Bigger Than IP
The timing of Anthropic's disclosure is not incidental. It lands at a moment of maximum tension in the U.S.–China AI competition and intersects with several major policy debates unfolding simultaneously in Washington.
The Export Controls Debate
The Trump administration is currently weighing whether to allow American companies like Nvidia to export advanced AI chips — including the H200 and Blackwell series — to China. Critics of strict export controls have argued that China's rapid AI progress, exemplified by DeepSeek's performance on major benchmarks, demonstrates that export controls are failing and that China can simply innovate around them.
Anthropic's disclosure directly challenges this narrative. If a significant portion of Chinese labs' recent capability gains was achieved not through independent innovation but through systematic distillation of American frontier models, then the argument that export controls are ineffective becomes far weaker. Anthropic made this point explicitly: running distillation campaigns at the scale documented in the report itself requires access to advanced chips, which reinforces rather than undermines the rationale for restricting chip exports.
"Without visibility into these attacks," Anthropic wrote, "the apparently rapid advancements made by these labs are incorrectly taken as evidence that export controls are ineffective. In reality, these advancements depend in significant part on capabilities extracted from American models."
OpenAI's Prior Accusations
Anthropic is not alone in making these claims. Earlier in February 2026, OpenAI submitted an open letter to U.S. House lawmakers stating it had observed activity it characterized as "ongoing attempts by DeepSeek to distill frontier models of OpenAI and other US frontier labs, including through new, obfuscated methods." OpenAI has been tracking signs of similar behavior since early 2025, with the launch of DeepSeek's first high-performing model raising immediate questions about whether its capabilities were developed independently or derived from ChatGPT outputs.
Separately, Google's Threat Intelligence Group disclosed that it had identified and disrupted distillation and model extraction attacks targeting Gemini's reasoning capabilities, involving more than 100,000 prompts.
The picture that emerges is one of a systematic, industry-wide practice by Chinese AI labs — not isolated incidents, but coordinated campaigns targeting all major American frontier models simultaneously.
National Security Framing
Both Anthropic and OpenAI have framed these distillation activities as national security concerns, not just business disputes. Anthropic's report warns that illicitly distilled models could be fed into military, intelligence, and surveillance systems by authoritarian governments, potentially enabling frontier-level AI capabilities for offensive cyber operations, state-sponsored disinformation, and mass surveillance platforms.
The concern is amplified if distilled models are eventually open-sourced, at which point unprotected frontier capabilities would spread beyond any single government's ability to control.
The Criticism and Counter-Narrative
Not everyone has accepted Anthropic's framing without scrutiny. Some commentators and researchers have pointed out an apparent tension: Anthropic accuses others of using distillation while itself using the technique to train its own models. Critics argue that the selective application of these concerns — defending against external distillation while practicing it internally — reflects competitive interests as much as genuine safety concerns.
Tech Buzz China analyst Rui Ma noted that Anthropic has long framed compute leadership as a national security priority, consistently advocating for tighter export controls on advanced chips to China: "Whether intentional or not, the narrative of illicit capability transfer strengthens the case for stricter chip restrictions."
On X (formerly Twitter), some users were blunter: "Looks to me like Anthropic is panicked because DeepSeek V4 is going to beat Opus 4.6 on the SWE benchmark... Screaming 'they stole our answers!' isn't really going to fly."
DeepSeek, Moonshot AI, and MiniMax had not responded to press requests for comment at the time of publication.
How Anthropic Is Fighting Back
Anthropic's response to these attacks spans detection, intelligence sharing, access controls, and active countermeasures. The company has been clear that it views this as an escalating arms race and that no single company can solve the problem alone.
Detection Systems
Anthropic has built multiple layers of behavioral fingerprinting and machine learning classifiers specifically designed to identify distillation attack signatures in API traffic. These systems look for the telltale patterns described above: high-volume, narrow-capability targeting with repetitive prompt structures. The company has also built specific detection for chain-of-thought elicitation — the technique where prompts are crafted to make Claude articulate its own step-by-step reasoning, producing especially valuable training data.
Additionally, Anthropic has developed tools to identify coordinated account activity across large account clusters — detecting the coordinated timing, shared infrastructure, and synchronized traffic patterns that distinguish distillation campaigns from organic usage.
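As a rough illustration of what cross-account correlation can look like, the following sketch groups accounts by a shared infrastructure indicator and flags clusters whose request timing is suspiciously machine-regular. It is not Anthropic's real tooling; the field names, the choice of IP block as the grouping key, and the thresholds are assumptions for the example.

```python
# Illustrative sketch of coordinated-account detection, not Anthropic's classifier.
# `events` is assumed to be a list of dicts with "account", "ip_block", and
# "timestamp" (epoch seconds); grouping keys and thresholds are example choices.
from collections import defaultdict
from statistics import pstdev

def cluster_by_infrastructure(events):
    """Group account IDs that share an infrastructure indicator (here, IP block)."""
    clusters = defaultdict(set)
    for e in events:
        clusters[e["ip_block"]].add(e["account"])
    return clusters

def synchronization_score(events, accounts):
    """Near-constant gaps between requests across accounts look machine-driven."""
    times = sorted(e["timestamp"] for e in events if e["account"] in accounts)
    if len(times) < 2:
        return 0.0
    gaps = [b - a for a, b in zip(times, times[1:])]
    return 1.0 / (1.0 + pstdev(gaps))

def flag_coordinated_clusters(events, min_accounts=50, min_sync=0.5):
    flagged = []
    for ip_block, accounts in cluster_by_infrastructure(events).items():
        if len(accounts) >= min_accounts and \
           synchronization_score(events, accounts) >= min_sync:
            flagged.append((ip_block, sorted(accounts)))
    return flagged
```

Real systems would combine many more signals, such as payment fingerprints, prompt templates, and which models the accounts target, but the underlying idea is the same: no single account looks abnormal, while the cluster does.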
Intelligence Sharing
Anthropic has committed to sharing technical indicators — IP addresses, behavioral signatures, infrastructure patterns — with other AI labs, cloud hosting providers, and relevant government authorities. This is a significant step because it acknowledges that effective defense requires a coordinated ecosystem response, not just unilateral action. Labs that share attack signatures can build collective defenses faster than individual actors working in isolation.
Strengthened Access Controls
The company has tightened verification requirements for the access pathways most commonly exploited by distillation campaigns: educational accounts, security research programs, and startup organization accounts. These categories were historically granted easier access as legitimate use cases, making them attractive entry points for fraudulent registrations.
Model-Level Countermeasures
Perhaps most intriguingly, Anthropic is developing what it describes as "Product, API and model-level safeguards designed to reduce the efficacy of model outputs for illicit distillation, without degrading the experience for legitimate customers." This suggests that Claude itself may eventually be modified at a technical level to make its outputs less useful as raw training data for distillation attacks — while maintaining full utility for legitimate end users.
The specifics of these model-level countermeasures have not been disclosed publicly, for obvious reasons.
The Call for Coordinated Action
Anthropic closed its disclosure with a clear call to action: "No company can solve this alone." The company explicitly called for coordinated responses from the broader AI industry, cloud infrastructure providers, and policymakers. Publishing the detailed evidence in full was itself part of this strategy — making the technical documentation available to everyone with a stake in the outcome.
The Bigger Picture: What This Means for the AI Industry
For AI Companies and Developers
The Anthropic disclosure serves as a wake-up call for the entire AI industry about the security dimensions of API access. Every major AI provider that offers API access to its models is, in theory, a target for systematic distillation attacks. The technical barriers to running such an operation are not particularly high — the primary requirements are the willingness to set up fraudulent accounts and the computational resources to run large-scale prompt campaigns.
For developers building on top of AI APIs, these events are a reminder that the models they depend on may be subject to ongoing security threats that affect their quality, reliability, and the competitive landscape of the tools they build with.
For AI Policymakers and Regulators
This disclosure gives policymakers concrete, documented evidence of the mechanisms by which AI capabilities transfer across borders in ways that circumvent regulatory controls. If export controls on chips are meant to limit China's AI development, but that development can proceed through systematic API-based extraction of American frontier models, then the policy framework needs to evolve to address both the hardware and software vectors simultaneously.
The report arrives as Congress, the executive branch, and international bodies are actively debating AI governance frameworks. It provides ammunition for those advocating for stricter export controls and for those calling for new regulations around API access, account verification, and AI model security requirements.
For the US-China AI Competition
Anthropic's disclosure adds a new dimension to the already intense U.S.–China AI competition. The companies named — DeepSeek, Moonshot AI, and MiniMax — all field models ranked among the top 15 on the Artificial Analysis leaderboard and are widely considered to represent China's most advanced publicly accessible AI systems. If their capabilities have been materially augmented through systematic extraction from American frontier labs, it changes the nature of the competitive analysis significantly.
DeepSeek's R1 reasoning model, which matched American frontier labs in performance at a dramatically lower apparent cost, sparked a global debate about whether U.S. AI leadership was eroding. Anthropic's findings suggest a more complex picture: some of that performance may be derived, at least in part, from American AI research and development that was extracted without authorization.
For AI Safety and Alignment Research
The safety argument Anthropic makes — that distilled models lose their safety guardrails — is one of the most consequential dimensions of this story for the long-term AI safety community. Years of research into Constitutional AI, reinforcement learning from human feedback, and model alignment aim to produce AI systems that remain safe even as their capabilities scale. If those safety properties cannot be reliably transferred through distillation, then the proliferation of powerful but unaligned models represents a systemic risk that extends well beyond any single company's business interests.
Key Numbers: The Scale of the Attacks at a Glance
Total exchanges across all three campaigns: Over 16 million
Fraudulent accounts used: Approximately 24,000
MiniMax campaign size: Over 13 million exchanges (largest single campaign)
Moonshot AI campaign size: Over 3.4 million exchanges
DeepSeek campaign size: Over 150,000 exchanges
Largest single proxy network: Over 20,000 simultaneous fraudulent accounts
MiniMax pivot time after new Claude model release: within 24 hours
Google Gemini distillation attempts (separately reported): Over 100,000 prompts
Timeline of Events
Early 2025 — OpenAI begins tracking distillation-like behavior from Chinese labs following the launch of DeepSeek's first high-performing model.
February 2026 (early) — OpenAI submits open letter to U.S. House lawmakers detailing ongoing distillation attempts by DeepSeek and other Chinese labs.
February 2026 (mid) — Google Threat Intelligence Group publishes findings on distillation attacks targeting Gemini's reasoning capabilities.
February 16, 2026 — Anthropic CEO Dario Amodei participates in Anthropic's Builder Summit in Bengaluru, India.
February 24, 2026 — Anthropic publishes comprehensive disclosure of industrial-scale distillation attacks by DeepSeek, Moonshot AI, and MiniMax, naming the companies publicly and providing detailed technical evidence.
February 24, 2026 (ongoing) — DeepSeek, Moonshot AI, and MiniMax have not responded to press requests for comment. Industry and policy reaction continues to develop.
What Happens Next?
Several consequential developments are likely to follow from this disclosure:
Regulatory Response: With documented evidence now publicly available, U.S. legislators and the executive branch are likely to face renewed pressure to address API-based capability extraction in AI policy frameworks, alongside existing hardware export control debates.
Industry-Wide Defensive Upgrades: Other AI labs — particularly those that have not yet publicly disclosed similar attacks — will likely accelerate their own detection and prevention efforts in light of the detailed playbook Anthropic has now made public.
Legal Action: While Anthropic has not announced formal legal proceedings, the detailed documentation of terms of service violations, including specific attribution to researchers at named organizations, creates a foundation for potential civil legal action.
Potential Open-Sourcing of Defense Tools: Anthropic's call for coordinated action may lead to shared defensive infrastructure — standardized detection classifiers, shared ban lists, or industry working groups — that could reduce the burden on any individual company.
Chinese Lab Responses: Whether DeepSeek, Moonshot AI, and MiniMax respond publicly, through their governments, or through technical adjustments to their current operations remains to be seen. Their silence as of publication speaks volumes.
Editorial Analysis: Reading Between the Lines
This story operates on multiple levels simultaneously, and responsible analysis requires acknowledging the competing interests at play.
Anthropic's disclosure is unprecedented in its detail and directness. Naming three specific companies, providing exchange volume figures, describing attribution methodology, and publishing chain-of-thought elicitation examples is a significant escalation from the vague allegations that have characterized earlier discussions of distillation risks. The level of specificity suggests genuine confidence in the evidence base.
At the same time, the disclosure serves Anthropic's broader policy agenda clearly. The company has consistently advocated for export controls and compute restrictions as tools for maintaining U.S. AI leadership. A high-profile disclosure that frames Chinese AI progress as partly dependent on illicitly extracted American technology directly supports that agenda.
Both things can be true simultaneously: the attacks described may be entirely genuine and well-documented, and the disclosure may also serve competitive and policy interests. The appropriate response to these revelations is not to dismiss them as politically motivated, but to evaluate the evidence on its merits while acknowledging the context in which it was published.
What is unambiguous is that the problem of AI model security — protecting the intellectual value of frontier models from systematic extraction by adversarial actors — is real, growing, and unresolved. Anthropic's disclosure, whatever its motivations, has brought that problem to the center of industry and policy attention where it belongs.
Final Takeaways for the AI Community
The Anthropic distillation attack disclosure marks a turning point in how the AI industry thinks about model security. Several key lessons emerge from this story for developers, organizations, and policymakers alike.
✅ Model security is a first-class concern — API access to frontier models is not just a commercial arrangement; it is a potential attack surface that requires active, ongoing defense.
✅ Safety cannot be distilled — The properties that make frontier AI systems safe are not automatically transferred when outputs are used to train competing models. Distilled models built this way represent a proliferation risk for dangerous capabilities without the corresponding safeguards.
✅ Attribution is possible — Anthropic's ability to trace fraudulent campaigns to specific researchers and organizations demonstrates that high-confidence attribution of API-based attacks is achievable, which changes the risk calculus for would-be attackers.
✅ Coordinated defense is necessary — No single company can effectively defend against attacks of this scale and sophistication. Industry-wide intelligence sharing, shared infrastructure protections, and policy frameworks are all necessary components of an effective response.
✅ The AI competitive landscape may look different than it appears — If capabilities presented as the result of independent innovation were materially derived from competitor models, the apparent pace of AI progress across the industry may need to be reassessed.
The AI race is real, the stakes are high, and the methods being used to compete are evolving rapidly. Anthropic's disclosure provides the clearest window yet into one of the most significant security challenges facing the AI industry in 2026 — and makes clear that the response must be just as sophisticated, coordinated, and urgent as the attacks themselves.
This editorial was published on February 24, 2026, based on Anthropic's official disclosure, reporting from Bloomberg, CNBC, CNN, TechCrunch, The Hacker News, and independent analyst commentary. DeepSeek, Moonshot AI, and MiniMax had not responded to press requests for comment at time of publication.