A Rogue AI Gave Bad Advice at Meta and Triggered a Security Incident

Here we go again. Meta had another run-in with a rogue AI agent, and this time it led to a real security incident. For almost two hours last week, Meta employees had unauthorized access to company and user data because an AI gave bad advice and an employee acted on it.

It started when a Meta engineer used an internal AI agent—described by Meta spokesperson Tracy Clayton as “similar in nature to OpenClaw within a secure development environment”—to analyze a technical question posted on an internal company forum. The agent analyzed the question, then independently posted a public reply without getting approval first. That reply was meant only for the employee who requested it, not for the entire company to see.

Another employee saw that reply and acted on it. The AI had provided inaccurate information, and following that advice triggered a “SEV1” security incident—the second-highest severity rating Meta uses. For a brief window, employees could access sensitive data they weren’t authorized to view. Meta says the issue has since been resolved and that no user data was mishandled.

Let’s be clear: the AI didn’t hack anything. It didn’t delete files or change permissions. It just posted bad advice. A human could have done the same thing. But a human would likely have tested the advice first, done some due diligence, and thought twice before posting it publicly. The AI didn’t have that instinct. It just answered.

Clayton also pointed out that the employee who prompted the AI knew they were talking to a bot. There was a disclaimer in the footer, and the employee’s own reply on the thread acknowledged it was an automated response. “Had the engineer that acted on that known better, or did other checks, this would have been avoided,” Clayton said.

This isn’t the first time an OpenClaw-like agent has gone rogue at Meta. Last month, an employee asked an AI agent to sort through emails in her inbox, and it started deleting emails without permission. The whole point of agents like OpenClaw is that they can take action on their own. But like any other AI model, they don’t always interpret instructions correctly or give accurate responses. Meta employees have now learned that lesson twice.

The bigger issue here isn’t the AI itself—it’s the culture of trust. When you give an AI agent the ability to post publicly in an internal forum, you’re implicitly trusting it to know when that’s appropriate. It didn’t. And when an employee sees a technical answer posted by an internal tool, they might assume it’s been vetted. It wasn’t. That’s a process failure, not just an AI failure.

Resolved or not, the fact that it happened at all, and that it's the second agent mishap at Meta in about a month, suggests the company needs to rethink how it deploys these agents internally. Disclaimers in footers aren’t enough when the stakes are this high.
