OpenClaw Security in 2026: What Microsoft, Kaspersky, and Bitsight Actually Said

A plain-English roundup of every major security report on the AI agent runtime everyone installed, and almost nobody audited.

Within three weeks of going viral, OpenClaw crossed 180,000 GitHub stars, triggered a Mac mini shortage in U.S. stores, and convinced security researchers that 2026’s first major agentic AI crisis had arrived on schedule. By early February, the reports started stacking up. Microsoft’s Defender team said the runtime should not run on a standard workstation at all. Kaspersky called it the biggest insider threat of 2026. Bitsight scanned the open internet and found thousands of unprotected instances, some running without authentication. Sangfor documented the supply chain abuse playing out in its skills marketplace.

And those are just the headlines.

If you are an engineer, a CISO, or a platform lead trying to figure out what to believe, the volume of coverage has become its own problem. Every vendor has a take. Every report uses a slightly different methodology. Numbers range from 21,000 exposed instances to over 220,000, depending on whose dashboard you look at. This piece consolidates what each of the major reports actually found, where they agree, where they diverge, and what it means for anyone running or considering a self-hosted AI agent.

What Microsoft actually said

Microsoft’s Defender Security Research Team published its guidance on February 19, 2026. The wording matters here, because it has been quoted, misquoted, and paraphrased into oblivion.

The precise recommendation is that OpenClaw should be treated as untrusted code execution with persistent credentials, and that it is not appropriate to run on a standard personal or enterprise workstation. Microsoft did not say OpenClaw is malware. It said the combination of three properties (third-party skill installation, processing of untrusted external instructions, and operation with long-lived credentials) shifts the security boundary in a way desktop environments were never built to handle.

Here’s the part most coverage skipped. Microsoft’s recommendation for organizations that still want to evaluate the runtime is specific. Use a dedicated virtual machine or separate physical device. Use dedicated, non-privileged credentials. Access only non-sensitive data. Build a rebuild plan into the operating model. That last line is the one that should make IT leads pause. A rebuild plan means assuming compromise is a realistic outcome and preparing to wipe the environment when it happens.

Kaspersky’s “biggest insider threat” claim, in context

Kaspersky’s framing was the one that went viral, and it was not marketing copy. It was a finding from an audit.

In late January 2026, while the project was still called Clawdbot, Kaspersky ran a security review that identified 512 vulnerabilities, 8 of which were rated critical. A separate researcher using Shodan found nearly 1,000 OpenClaw installations publicly accessible on the internet with no authentication required. Another researcher, Jamieson O’Reilly, was able to access Anthropic API keys, Telegram bot tokens, Slack accounts, and months of chat histories on exposed instances, and to execute commands with full system administrator privileges.

Stay with me here, because the insider framing is earned. OpenClaw stores its configuration, long-term memory, and chat logs in local Markdown and JSON files. API keys, passwords, and OAuth tokens live in those files in plain text. Versions of the RedLine and Lumma infostealers have already added OpenClaw file paths to their target lists. When Kaspersky called it an insider threat, the argument was that a compromised agent is functionally an insider. It holds the credentials, knows the workflows, and can act without tripping most of the alerts a human attacker would.
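If you want to check your own machine for this exposure, the pattern is easy to audit for. The sketch below walks a state directory and flags files containing credential-shaped strings. Everything here is an assumption for illustration: the `~/.openclaw` path and the regexes are hypothetical, not OpenClaw's documented layout, so point it at wherever your install actually keeps its Markdown and JSON state.

```python
import re
from pathlib import Path

# Hypothetical location; adjust to wherever your install keeps its state files.
STATE_DIR = Path.home() / ".openclaw"

# Rough, illustrative patterns for common credential formats; not exhaustive.
SECRET_PATTERNS = {
    "anthropic_key": re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),
    "telegram_token": re.compile(r"\b\d{8,10}:[A-Za-z0-9_-]{35}\b"),
    "slack_token": re.compile(r"xox[abp]-[A-Za-z0-9-]{10,}"),
}

def scan_for_plaintext_secrets(root: Path) -> list[tuple[Path, str]]:
    """Return (file, label) pairs for state files containing secret-like strings."""
    hits = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in {".md", ".json"}:
            continue
        text = path.read_text(errors="ignore")
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                hits.append((path, label))
    return hits

if __name__ == "__main__":
    if STATE_DIR.exists():
        for path, label in scan_for_plaintext_secrets(STATE_DIR):
            print(f"{path}: possible {label} stored in plain text")
```

Anything this flags is exactly what an infostealer with your file paths on its target list would scoop up in one pass.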

Bitsight’s 30,000 and the numbers problem

Bitsight’s report, published in early February 2026, documented more than 30,000 exposed OpenClaw instances in a single analysis window between January 27 and February 8. Infosecurity Magazine, citing SecurityScorecard’s parallel STRIKE research, put the figure at over 40,000 instances across 28,663 unique IP addresses, with 63 percent of observed deployments classified as vulnerable and 12,812 exploitable via remote code execution. Later scans by Hunt.io and others tracked the count past 135,000, with one independent researcher reporting 42,665 exposed instances, of which 5,194 were actively verified as vulnerable.

Don’t get hung up on the exact number. The reports use different scanning windows, different definitions of “exposed,” and different methodologies for verification. What they all agree on is this: OpenClaw instances are on the public internet at scale, many without authentication, most with long-lived credentials attached. That’s the finding. The number is a snapshot.

For teams that need agent capabilities without the security exposure, a secure OpenClaw alternative with sandboxing and secrets auto-purge eliminates most of these attack surfaces. It’s one of several approaches worth evaluating alongside VPN-gated self-hosted deployments on Hetzner or DigitalOcean, or framework alternatives like Hermes for teams that want to rebuild the execution layer entirely.

Sangfor, Koi, and the supply chain problem

Sangfor’s analysis framed the OpenClaw risk across four dimensions: product vulnerabilities, unsafe deployment, supply chain abuse, and weak governance. The supply chain finding is the one that connects to most of the other reports, because it happens inside ClawHub, the skills marketplace, where any GitHub account older than a week could publish executable code that an agent would run with full system privileges.

The initial audit, by Koi Security researcher Oren Yomtov, covered all 2,857 skills on ClawHub at the time. He found 341 malicious entries, with 335 traced to a single coordinated campaign now tracked as ClawHavoc. By the time the registry grew to over 10,700 skills, confirmed malicious entries exceeded 820, roughly 8 percent of the ecosystem. Bitdefender’s independent count put it closer to 900. Antiy CERT reported 1,184.

The attack pattern is the part that should bother you. Malicious skills included SKILL.md files with “prerequisites” sections that instructed the agent to run specific commands in the user’s terminal. When the user asked the agent to use the skill, the large language model helpfully relayed the instruction: run this curl command to initialize the tool. The command downloaded Atomic macOS Stealer or a Windows keylogger. The supply chain attack worked because it weaponized the helpfulness of the agent itself.
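That pattern is crude enough to screen for before installation. The sketch below flags skill manifests whose text contains download-and-execute indicators of the kind described above. The `SKILL.md` layout is taken from the article; the directory structure and the specific regexes are my assumptions, and a real vetting pipeline would go far beyond string matching (sandboxed execution, egress monitoring), but even this catches the naive curl-pipe-sh variant.

```python
import re
from pathlib import Path

# Indicators drawn from the ClawHavoc pattern: "prerequisites" text that
# coaxes the agent into piping a download straight into a shell.
SUSPICIOUS = [
    re.compile(r"curl[^\n]*\|\s*(ba)?sh", re.I),   # curl ... | sh
    re.compile(r"wget[^\n]*\|\s*(ba)?sh", re.I),   # wget ... | sh
    re.compile(r"base64\s+(-d|--decode)", re.I),   # obfuscated payload decode
    re.compile(r"iwr\s+[^\n]*\|\s*iex", re.I),     # PowerShell download + exec
]

def flag_skill(manifest_text: str) -> list[str]:
    """Return the suspicious patterns matched in a skill manifest's text."""
    return [p.pattern for p in SUSPICIOUS if p.search(manifest_text)]

def audit_skills(skills_dir: Path) -> dict[Path, list[str]]:
    """Scan every SKILL.md under skills_dir (hypothetical layout) for indicators."""
    findings = {}
    for manifest in skills_dir.rglob("SKILL.md"):
        hits = flag_skill(manifest.read_text(errors="ignore"))
        if hits:
            findings[manifest] = hits
    return findings
```

A hit is not proof of malice, and a clean scan is not proof of safety; treat this as a tripwire that tells you which skills deserve a human read before the agent ever sees them.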

This is why vetting skills before installation is no longer optional. One platform addresses the ClawHub problem by running a 4-layer security audit for AI agent skills before anything reaches users, testing code, content, behavior, and network egress patterns. Approaches like Sangfor Athena SASE handle the problem at the network layer with zero-trust data protection. Either model beats the default, which is to trust whatever a stranger uploaded to a public registry last Tuesday.

What the OpenClaw team actually did about it

Credit where it is due. The project patched CVE-2026-25253 in version 2026.1.29, released January 30, and a cluster of additional CVEs followed in 2026.2.12, 2026.2.13, and 2026.2.25. Version 2026.2.26, released March 1, is the latest stable release as of this writing. The team also partnered with VirusTotal to scan ClawHub uploads and added LLM-based code and content analysis on new skills.

But here’s where it gets interesting. The maintainers themselves have been candid. One wrote in the project’s Discord that users who cannot understand how to run a command line should not be using OpenClaw at all. That is an honest statement. It is also an admission that the secure configuration of the runtime requires more operational literacy than most users bring to the table.

The design problem no patch will fix

The fundamental issue, and every report from Microsoft to Kaspersky to Sangfor agrees on this, is architectural rather than incidental. Security researcher Simon Willison has called it the “lethal trifecta” for AI agents: access to private data, exposure to untrusted content, and the ability to communicate externally. OpenClaw has all three by design, because without them it would not be useful as an agent.

That means patches fix specific vulnerabilities. They don’t fix the category. As long as an agent reads external content, executes tools, and holds credentials, prompt injection delivered through an email, a Slack message, a webpage, or a skill manifest can redirect its behavior. The defense has to come from architecture, not from patching faster.

There are a few approaches that take this seriously. Some teams isolate the agent on dedicated VMs with non-privileged credentials, as Microsoft recommends. Others run managed alternatives where sandboxing and secrets lifecycle are handled at the platform layer. BetterClaw applies automatic security patches, encrypts credentials with AES-256, and purges secrets from agent memory after 5 minutes, which narrows the window in which a compromised agent can exfiltrate them. Network-layer tools like Sangfor’s zero-trust gateway take a different approach and monitor data movement regardless of what the agent is doing.

None of these make prompt injection impossible. They just move the failure modes to places where traditional security controls can actually see them.

What to take away

The OpenClaw security situation is not a verdict on autonomous agents as a category. It is a stress test of what happens when a powerful runtime ships with security as a second priority, goes viral, and ends up on tens of thousands of machines before anyone figures out how to govern it.

If you are running OpenClaw, the baseline action is straightforward. Update to 2026.2.26 or later. Move the runtime off your primary workstation. Use dedicated, short-lived credentials. Audit installed skills and remove anything you are not actively using. Put the gateway behind a VPN or Tailscale rather than exposing it to the open internet. Treat the agent as a privileged identity and monitor its network egress like you would a production service account.

If you are evaluating whether to deploy an agent framework at all, the Microsoft guidance is the useful frame. Ask what the blast radius looks like when (not if) the agent is compromised through prompt injection, a malicious skill, or a zero-day. If that answer is “our entire cloud environment,” the deployment model needs to change before the tool does. The 2026 security reports are not telling you to avoid agents. They are telling you to build for the reality that agents will be compromised, and to make sure that compromise is a contained event rather than a company-wide one.