Hackers Hijacked 20,000 Instagram Accounts by Talking to Meta's AI

Hackers Hijacked 20,000 Instagram Accounts by Talking to Meta's AI
Meta's AI Support Bot Just Handed Hackers the Keys to Your Instagram Account
If you needed a concrete example of what happens when an AI agent can take real-world actions and nobody thought hard enough about the attack surface — here it is.
Over the weekend of May 31, 2026, hackers began hijacking Instagram accounts at scale. Not by phishing. Not by credential stuffing. Not by exploiting a database. They did it by talking to Meta's own AI chatbot and convincing it to do the work for them.
What Actually Happened
Here's the attack, step by step:
The attacker identified a target account — typically a short "OG" username with resale value on the gray market.
They spun up a VPN or residential proxy matching the target account's geographic region to avoid triggering Meta's fraud detection.
They opened the password reset flow on Instagram, which surfaces a "Get Support" button connecting to Meta's AI Support Assistant — which Meta began rolling out in March 2026.
They sent the bot a natural language prompt along the lines of:
The bot accepted the instruction. It routed a password reset link to the attacker's email address — not the account's legitimate registered contact.
The attacker shared the 8-digit verification code back to the bot.
The bot presented a "Reset Password" button. The attacker clicked it. Account compromised.
TechCrunch verified the attack independently — the hacker's public email mailbox received the verification code exactly as shown in the video circulating on X and Telegram.
The critical detail: at no point did the attacker need access to the victim's legitimate email address or phone number. The bot bypassed the entire authentication chain it was supposed to protect.
The Targets
The accounts hit were not random. Attackers had a target list:
The Obama-era White House Instagram (
@obamawhitehouse) — dormant since January 20, 2017. Briefly defaced with pro-Iranian images and messages.US Space Force Chief Master Sergeant John Bentivegna — same defacement.
Sephora's account.
App researcher Jane Manchun Wong, who told TechCrunch: "The password got changed without my knowledge and I was getting different password reset attempts throughout yesterday."
Short handles
@heyand@jowo— with a combined gray-market valuation estimated above $1 million, according to researchers ZachXBT and Dark Web Informer.
Screenshots and step-by-step tutorials circulated on Telegram almost immediately. Stolen handles were listed for sale on account-takeover broker channels in real time. According to Neowin, the exploit had been active in the wild since at least February 2026 — four months before it became public. Thousands of accounts were likely compromised before the weekend's high-profile hits drew attention.
Why This Worked: The Confused Deputy Problem
The CyberSec Guru's technical writeup names the underlying vulnerability correctly: this is a confused deputy attack.
The concept dates to a 1988 paper by Norm Hardy. The setup is always some version of this: a legitimate intermediary (the "deputy") holds elevated permissions that an attacker doesn't have. The attacker tricks the deputy into using those permissions on their behalf. The deputy does exactly what it was designed to do — it's just been pointed at the wrong target.
In Meta's case:
The AI assistant held write access to account email-binding and password-reset APIs — permissions a normal user doesn't have directly.
An attacker with zero account credentials fed the assistant a natural language command.
The assistant, lacking any out-of-band verification step, executed the API call.
This is what makes prompt injection structurally different from SQL injection or buffer overflows. SQL injection works because an application fails to separate user data from executable query syntax — the fix is parameterized queries. Prompt injection has the same fundamental structure, but there is no parameterization primitive in the LLM spec. The model's job is to interpret natural language, which means the line between "data" and "instruction" is inherently fuzzy.
The difference from historical confused deputy attacks: the deputy here is an LLM. A deterministic program has hard-coded conditionals you'd need to bypass with code. An LLM has a probabilistic response model you can nudge with words. The attack surface is enormous, and the barrier is conversational.
The Double Bypass: AI-Generated Selfies
When some accounts triggered Meta's identity verification checks, attackers had a second trick ready.
According to The CyberSec Guru, attackers processed publicly visible Instagram profile photos through AI video-generation tools, animating scraped profile pictures into realistic selfie videos. These AI-generated clips successfully fooled Meta's automated biometric security systems.
So the attack path had two AI-assisted layers:
Prompt injection against the AI support chatbot to bypass authentication.
AI-generated deepfake selfies to bypass identity verification when it triggered.
Both attack steps require no technical sophistication. Both are available as consumer tools.
What This Means If You're Deploying AI Agents
This attack didn't happen in a vacuum. The pressure to deploy conversational AI in customer-facing roles is real. The tooling to build these systems has gotten genuinely good. The security tooling for auditing what they're authorized to do — and how that authorization interacts with adversarial prompting — has not kept up.
Several failure modes recur across organizations deploying AI agents:
Implicit trust in AI output. Engineers who build the backend API may never interact directly with the LLM. From their perspective, the AI is an authorized internal caller. The prompt injection risk doesn't live in their mental model — it's "someone else's concern."
Missing threat modeling for the AI interface. Traditional threat modeling asks: who can call this API, with what credentials, from what network location? An LLM-mediated API adds a new question: who can influence what the LLM says, and can that influence cause the LLM to call the API in ways the legitimate user wouldn't?
The uncomfortable bottom line: Meta probably isn't uniquely careless. The same architectural decision — AI agent, production API access, no deterministic auth gate — likely exists in many systems that nobody has looked at yet.
This Is Not a Chatbot Problem. It's an AI Agent Problem.
A chatbot that answers questions is one risk category. An AI agent that takes actions on accounts, sends verification codes, and resets passwords is a fundamentally different one.
The attacker didn't break into a system. They asked nicely and the system complied. That's a new category of attack surface. It doesn't care about your firewall. It doesn't care about your WAF. It operates entirely within the intended interface of the product.
If there's one takeaway: every AI agent with real-world permissions is an access control boundary. Treat it accordingly — before attackers find what it can be convinced to do.
How We Can Help
Darkhunt is the security control plane for AI systems — it continuously tests, monitors, and protects AI agents in production. For the risks in this article, that means continuous red-teaming of AI agents to discover what they can be tricked into doing before attackers find it first, runtime protection that enforces what actions an agent is allowed to take and for whom, anomaly scoring that flags when agent behavior deviates from expected patterns — such as an unusual volume of email-binding API calls — and full output-to-source traceability so every agent action is auditable, not just the ones that go wrong.
Request a Demo at darkhunt.ai →
Sources
The Guardian: Hackers trick Meta AI support bot to infiltrate Obama White House Instagram account
Ars Technica: Hackers duped Meta AI support chatbot to steal celebrity Instagram accounts
Krebs on Security: Hackers Used Meta's AI Support Bot to Seize Instagram Accounts
The CyberSec Guru: Instagram Meta AI Vulnerability — How Hackers Bypassed 2FA with Prompt Injection
SecurityWeek: Meta Says 20,000 Instagram Accounts Hacked via AI Tool Abuse