Exclusive: New Microsoft Copilot flaw signals broader risk of AI agents being hacked—’I would be terrified’

Microsoft fixed the Copilot flaw, but researchers warn the real danger lies in how all AI agents are built.

Jun 11, 2025 - 13:34

Exclusive: New Microsoft Copilot flaw signals broader risk of AI agents being hacked—’I would be terrified’

Microsoft 365 Copilot, the AI tool built into Microsoft Office workplace applications including Word, Excel, Outlook, PowerPoint, and Teams, harbored a critical security flaw that, according to researchers, signals a broader risk of AI agents being hacked.

The flaw, revealed today by AI security startup Aim Security and shared exclusively in advance with Fortune, is the first known “zero-click” attack on an AI agent, an AI that acts autonomously to achieve specific goals. The nature of the vulnerability means that the user doesn’t need to click anything or interact with a message for an attacker to access sensitive information from apps and data sources connected to the AI agent.

In the case of Microsoft 365 Copilot, the vulnerability lets a hacker trigger an attack simply by sending an email to a user, with no phishing or malware needed. Instead, the exploit uses a series of clever techniques to turn the AI assistant against itself.

Microsoft 365 Copilot acts based on user instructions inside Office apps to do things like access documents and produce suggestions. If infiltrated by hackers, it could be used to target sensitive internal information such as emails, spreadsheets, and chats. The attack bypasses Copilot’s built-in protections, which are designed to ensure that only users can access their own files—potentially exposing proprietary, confidential, or compliance-related data.

The researchers at Aim Security dubbed the flaw “EchoLeak.” Microsoft told Fortune that it has already fixed the issue in Microsoft 365 Copilot and that its customers were unaffected.

“We appreciate Aim for identifying and responsibly reporting this issue so it could be addressed before our customers were impacted,” a Microsoft spokesperson said in a statement. “We have already updated our products to mitigate this issue and no customer action is required. We are also implementing additional defense-in-depth measures to further strengthen our security posture.”

The Aim researchers said that EchoLeak is not just a run-of-the-mill security bug. It has broader implications beyond Copilot because it stems from a fundamental design flaw in LLM-based AI agents that is similar to software vulnerabilities in the 1990s, when attackers began to be able to take control of devices like laptops and mobile phones.

Adir Gruss, cofounder and CTO of Aim Security, told Fortune that he and his fellow researchers took about three months to reverse engineer Microsoft 365 Copilot, one of the most widely-used generative AI assistants. They wanted to determine whether something like those earlier software vulnerabilities lurked under the hood and then develop guardrails to mitigate against them.

“We found this chain of vulnerabilities that allowed us to do the equivalent of the ‘zero click’ for mobile phones, but for AI agents,” he said. First, the attacker sends an innocent-seeming email that contains hidden instructions meant for Copilot. Then, since Copilot scans the user’s emails in the background, Copilot reads the message and follows the prompt—digging into internal files and pulling out sensitive data. Finally, Copilot hides the source of the instructions, so the user can’t trace what happened.

After discovering the flaw in January, Gruss explained that Aim contacted the Microsoft Security Response Center, which investigates all reports of security vulnerabilities affecting Microsoft products and services. “They want their customers to be secure,” he said. “They told us this was super groundbreaking for them.”

However, it took five months for Microsoft to address the issue, which Gruss said “is on the (very) high side of something like this.” One reason, he explained, is that the vulnerability is so new, and it took time to get the right Microsoft teams involved in the process and educate them about the vulnerability and mitigations.

Microsoft initially attempted a fix in April, Gruss said, but in May the company discovered additional security issues around the vulnerability. Aim decided to wait until Microsoft had fully fixed the flaw before publishing its research, in the hope that other vendors that might have similar vulnerabilities “will wake up.”

Gruss said the biggest concern is that EchoLeak could apply to other kinds of agents—from Anthropic’s MCP (Model Context Protocol), which connects AI assistants to other applications, to platforms like Salesforce’s Agentforce.

If he led a company implementing AI agents right now, “I would be terrified,” Gruss said. “It’s a basic kind of problem that caused us 20, 30 years of suffering and vulnerability because of some design flaws that went into these systems, and it’s happening all over again now with AI.”

Organizations understand that, he explained, which may be why most have not yet widely adopted AI agents. “They’re just experimenting, and they’re super afraid,” he said. “They should be afraid, but on the other hand, as an industry we should have the proper systems and guardrails.”

Microsoft tried to prevent such a problem, known as an LLM Scope Violation vulnerability. It’s a class of security flaws in which the model is tricked into accessing or exposing data beyond what it’s authorized or intended to handle—essentially violating its “scope” of permissions. “They tried to block it in multiple paths across the chain, but they just failed to do so because AI is so unpredictable and the attack surface is so big,” Gruss said.

While Aim is offering interim mitigations to clients adopting other AI agents that could be affected by the EchoLeak vulnerability, Gruss said the long-term fix will require a fundamental redesign of how AI agents are built. “The fact that agents use trusted and untrusted data in the same ‘thought process’ is the basic design flaw that makes them vulnerable,” he explained. “Imagine a person that does everything he reads—he would be very easy to manipulate. Fixing this problem would require either ad-hoc controls, or a new design allowing for clearer separation between trusted instructions and untrusted data.”

Such a redesign could be in the models themselves, Gruss said, citing active research into enabling the models to better distinguish between instructions and data. Or, the applications the agents are built on top of could add mandatory guardrails for any agent.

For now, “every Fortune 500 I know is terrified of getting agents to production,” he said, pointing out that Aim has previously done research on coding agents where the team was able to run malicious code on developers’ machines. “There are users experimenting, but these kind of vulnerabilities keep them up at night and prevent innovation.”

This story was originally featured on Fortune.com