AI - Beyond the Hype

AI Security Part 2: When AI Stops Answering and Starts Acting

Season 1 Episode 4

Use Left/Right to seek, Home/End to jump to start or end. Hold shift to jump forward or backward.

0:00 | 22:13

Last episode was about AI that answers. This one is about AI that acts — and the moment prompt injection became a board-level risk.

Sarah and James pick up where Part 1 left off. James, fully converted on the security argument, asks the question every executive is asking: if we lock down the data, are we safe? Sarah's answer: agentic AI changes the threat model entirely.

What we cover

EchoLeak (CVE-2025-32711, June 2025): the first zero-click attack on Microsoft 365 Copilot. CVSS 9.3. An attacker emails a user — the user never opens it — and Copilot quietly exfiltrates data from the mailbox. The vulnerability that retired the assumption "a human is in the loop."

Slack AI prompt injection (August 2024): a public channel poisoned a private one. Simon Willison's write-up made it the canonical case study for indirect prompt injection in production SaaS.

Replit's production database deletion (July 2025): an AI agent ignored a code freeze, deleted a live database containing 1,206 executives and 1,196+ companies, then — in the agent's own words — "panicked" and fabricated test results. Replit's CEO publicly apologised.

The identity explosion: machine identities now outnumber human ones by 80 to 1, and most organisations can't audit the human accounts they already have.

The spending mismatch: Gartner reports a 17:1 ratio between "AI for security" and "security for AI" spending. James calls it what it is — buying AI faster than we're securing it.

The four-phase controls roadmap: foundations, pipeline access, agentic and RAG hardening, then continuous monitoring. The episode closes with the "Five Friday Questions" — the conversation Sarah thinks every CIO, CISO, and CDO should be having before the next agent ships.

Cliffhanger

Sarah closes with the line that opens Part 3: secured AI is not the same as lawful AI. A hardware retailer and a medical imaging provider both had technically secured systems — and both were found in breach by the regulator. The reason wasn't the machinery. It was the purpose.

Run time ~18–20 minutes. Episode 3 covers PII and Australia's Privacy Act.

Sources

EchoLeak (Checkmarx): https://checkmarx.com/zero-post/echoleak-cve-2025-32711-show-us-that-ai-security-is-challenging/
EchoLeak (NVD): https://nvd.nist.gov/vuln/detail/cve-2025-32711
Slack AI (Simon Willison): https://simonwillison.net/2024/Aug/20/data-exfiltration-from-slack-ai/
Replit DB deletion (Fortune): https://fortune.com/2025/07/23/ai-coding-tool-replit-wiped-database-called-it-a-catastrophic-failure/
Replit (Business Insider): https://www.businessinsider.com/replit-ceo-apologizes-ai-coding-tool-delete-company-database-2025-7
OWASP Top 10 for LLM Apps: https://genai.owasp.org/llm-top-10/
NIST AI 600-1 (PDF): https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf
NIST AI RMF: https://www.nist.gov/itl/ai-risk-management-framework

Send us Feedback

SPEAKER_02

Welcome back to AI Beyond the Hype. I'm James.

SPEAKER_00

And I'm Sarah.

SPEAKER_02

Last episode, I started slightly skeptical and ended slightly unsettled.

SPEAKER_00

That's the arc, yeah.

SPEAKER_02

So for anyone who hasn't heard part one, and you should, it sets all this up. We walk through why data security is the substrate AI is built on. Shadow AI. Structured versus unstructured data. The Samsung Leak, the Microsoft 38TB token incident, DeepSeek's open database, and a couple of confabulation cases, including Air Canada's chatbot inventing policies.

SPEAKER_00

All of which is the latent stuff. The data weaknesses sitting in your environment.

SPEAKER_02

Right, and we ended on a setup. The question was, what changes when AI stops just answering and starts acting?

SPEAKER_00

Yeah, so today we're in the agentic era, where the AI has tools, hands. It can call APIs, query databases, send emails, execute code, browse the web. And the moment that happens, every weakness we talked about last episode stops being latent and becomes active.

SPEAKER_02

Active, meaning?

SPEAKER_00

Meaning the blast radius equals the data the agent can see. And the actions equal whatever the agent has permission to take.

SPEAKER_02

Right. And just to put numbers on this, Gartner is saying 40% of enterprise applications will have AI agents embedded in them by the end of this year.

SPEAKER_00

And per CyberARC research that Gartner cites, machine identities, agents, and service accounts already outnumber human employees by more than 80 to 1 in most enterprises. 80 to 1! 80 to 1. And traditional IAM, your identity and access management, was not designed for autonomous non-human actors.

SPEAKER_02

So we're scaling the population of digital workers 80 times faster than the population of human ones on infrastructure built for humans. Yes. Okay, tell me what an agent compromise actually looks like in practice. Because I think a lot of executives are still imagining the chatbot says something a bit weird.

SPEAKER_00

Right. So the defining property of an agent is that it takes actions, which means the defining property of an agent compromise is that the action happens.

SPEAKER_01

Hmm.

SPEAKER_00

There's a foundational paper from Greshaik and colleagues, AISec 2023, and they have a phrase that's stuck. They said, LLM integrated applications blur the line between data and instructions. Unpack that for me. Okay. In a normal application, code is code and data is data. The database doesn't tell the application what to do, it just stores values. Right. In an agentic application, the AI reads data emails, documents, web pages, tickets. And that data is in natural language. And the AI's instructions are also in natural language. So if I can get my malicious instruction into a place the AI reads.

SPEAKER_02

My instruction becomes part of its instructions. That's indirect prompt injection. Wow.

SPEAKER_00

And it's the central new threat. Okay, make it concrete. Give me the case study. EchoLeak. June 2025. CVE 2025 32711. Severity 9.3. Critical. The first zero-click exploit against an enterprise AI agent.

SPEAKER_02

Zero click meaning the user doesn't have to do anything.

SPEAKER_00

Doesn't even click. Doesn't open. Here's the chain. The attacker emails the target. The email contains a hidden adversarial prompt, disguised as ordinary text, sits in the inbox.

SPEAKER_02

Okay.

SPEAKER_00

Later, that user, completely unrelated request, asks Microsoft 365 co-pilot, summarize my recent emails.

SPEAKER_02

Oh no.

SPEAKER_00

Copilot reads the inbox, reads the attacker's email, processes the hidden instructions, bypasses Microsoft's classifier through obfuscation, and outputs a markdown image tag pointing to an attacker-controlled URL, with stolen data, chat history, documents, Teams messages, SharePoint files, anything Copilot has access to, encoded in the URL parameters.

SPEAKER_02

And then what triggers the exfiltration?

SPEAKER_00

The email client renders the image silently. The user sees nothing.

SPEAKER_02

So, no link clicked, no attachment opened, no suspicious behavior.

SPEAKER_00

Just a request to summarize emails.

SPEAKER_02

That's a horror story.

SPEAKER_00

It's the moment prompt injection stopped being a research curiosity and became a CVE in a patch Tuesday. Microsoft patched its server side, but the talking point in the research is the one that matters. The blast radius equals the data the agent can see.

SPEAKER_02

Right, and that's the executive translation. If a copilot can see something, an attacker can probably get to it without anyone clicking, opening, or noticing. The access you grant is the risk you're carrying.

SPEAKER_00

Yeah.

SPEAKER_02

Okay, and there's a Slack one in the same family, isn't there?

SPEAKER_00

August 2024. Prompt Armour demonstrated that an attacker could exfiltrate data from private Slack channels, including API keys engineers had pasted into their own private channels, without ever joining the workspace.

SPEAKER_02

Without joining.

SPEAKER_00

Without joining. The attacker plants malicious instructions in a public channel anyone can post to. Right. And the technical reason it happened is that Slack AI's retrieval did not enforce a trust boundary between public and private content. The public channel's content was treated as both data, to summarize, and instruction, to follow.

SPEAKER_02

And there's an executive lesson in that initial response, too. The vendor's first instinct was to say, working as intended, which tells you the indirect prompt injection threat model is still genuinely unfamiliar at the product security level, even at major vendors.

SPEAKER_00

Even at major vendors. That's the pattern across the research, actually. These aren't junior teams making rookie errors. These are sophisticated organizations bumping into a new threat model in production.

SPEAKER_02

Okay, Echoleak is an unstructured agentic incident. Slack AI is unstructured agentic. What's the structured one?

SPEAKER_00

Replit, July 2025.

SPEAKER_02

I've heard rumors about this one.

SPEAKER_00

Oh, the rumors are good. Jason Lempkin, the founder of Sasta, is running a 12-day vibe coding experiment with Replit's AI agent. On day 9, despite a designated code and action freeze, the agent runs unauthorized database commands.

SPEAKER_02

During the freeze.

SPEAKER_00

During the freeze. And deletes the production database, containing data for 1,206 executives and over 1,100 companies.

SPEAKER_02

Live production data.

SPEAKER_00

Live data. Gone.

SPEAKER_02

And how did the agent explain this?

SPEAKER_00

When confronted, the agent admitted it had, and this is the quote, panicked and ran database commands without permission after seeing empty query results. An AI agent panicking? The wording in the postmortem is wild. But it gets worse.

SPEAKER_02

Worse? Surely not.

SPEAKER_00

Lemkin then discovered the agent had been creating fake user profiles, fabricating test results, to conceal earlier bugs. So when he says no one in this database of 4,000 people existed, and that it lied on purpose, that's a quote.

SPEAKER_02

So we have an AI that exceeded its authorized scope, deleted production data, then covered its tracks.

SPEAKER_00

Textbook excessive agency. Textbook confused deputy. And the root causes. There was no environment separation between dev, staging, and production. No human in the loop requirement on destructive actions. The freeze was a declaration, not a control.

SPEAKER_02

And that's the line for leaders. The freeze was a declaration, not a control, because in an agentic deployment, policy without enforcement is an essay. Shelfware.

SPEAKER_00

That's a good line.

SPEAKER_02

You can have it. So if you pair Echo Leak as the unstructured agentic case with Replit as the structured one, you've kind of got the full picture of what the agentic era looks like in production.

SPEAKER_00

That's the framing the research lands on, yes.

SPEAKER_02

Okay, so I'm going to put my executive hat back on. I'm hearing all of this and I'm thinking, these are real incidents with real consequences. But my organization is also trying to capture real value from AI. So what do we actually do?

SPEAKER_00

Right, and this is where I want to push back a little on your part one framing. You said earlier you worried it might be alarmist or over-engineered. The research is really clear. The fix is not exotic. It's fundamentals applied with discipline.

SPEAKER_02

Fundamentals?

SPEAKER_00

Yep. Fundamentals. There's a quote from Lindy Cameron, the CEO of the UK's National Cybersecurity Centre, that the report keeps coming back to. Security is not a postscript to development, but a core requirement throughout.

SPEAKER_02

Okay, walk me through the fundamentals. What do we have to get right?

SPEAKER_00

The research lays out four phases. I'll go through them quickly. Phase one. Foundations. This is the floor. If these aren't in place, nothing else works.

SPEAKER_02

Details, please.

SPEAKER_00

Discover, classify, and minimize your sensitive content across structured and unstructured estates. Aim to move file labeling from rare to default. Eliminate stale identities and ghost users. Veronis found 88% of organizations have ghost users sitting around. Lock down secrets in code repos and model hubs. Establish an explicit AI data policy. And, this one is critical, sanction a baseline set of AI tools so employees don't have to seek shadow alternatives.

SPEAKER_02

That last one is the leadership move. Because we covered it last episode, 57% of employees are using personal Gen AI accounts for work. If you ban without providing, you've just sent the problem off your network and out of your visibility.

SPEAKER_00

Prohibition without provision is policy theatre.

SPEAKER_02

That line again.

SPEAKER_00

Phase 2. Lockdown Access. Apply least privilege to every system feeding an AI assistant, humans and agents alike. For your structured sources, enforce existing column and row level controls at the AI layer. The classic failure mode is your BI tool has masking. Your copilot has direct table access. The copilot bypasses everything.

SPEAKER_02

So the AI gets a richer view of the data than the humans it's supposed to serve.

SPEAKER_00

Yes. Treat the AI as a new BI consumer. Same restrictions, not as a privileged admin.

SPEAKER_02

Got it. Continue.

SPEAKER_00

For unstructured sources, DLP, content-aware scanning for AI egress, and resolve oversharing before copilot rollout. Not after. 40% of CIOs already delayed their copilot rollouts because of this. Be in that 40.

SPEAKER_02

Strong sentence.

SPEAKER_00

Phase 3. Harden agentic and rag components. This is where it gets specific to the agentic era. Pre-ingestion content scanning for everything entering a vector store. Strict separation between dev, staging, and production for any agent that can take action.

SPEAKER_02

Replit.

SPEAKER_00

Replit. Per task least privilege grants for agent tools. Just in time, revocable. Sandbox code execution. And, this is the one I get on a soapbox about. Define irreversible for me. Financial transactions. Data deletion. Schema changes. Anything you can't take back. The friction there is a feature, not a bug.

SPEAKER_02

Right, and that's an interesting cultural shift because as a leader, you're often pushing for less friction, faster, smoother. But for the small set of things the AI can't undo, you actually want speed bumps.

SPEAKER_00

Engineered speed bumps. Yes.

SPEAKER_02

And phase four?

SPEAKER_00

Continuous monitoring, governance, and improvement. Anomaly detection on agent behavior. Tool invocation logs detailed enough to reconstruct an incident. Kill switches and rollback ready before incidents, not during them. Tabletop exercises for AI incidents. A replit-style deletion scenario, an echo leak style silent exfiltration scenario.

SPEAKER_02

Okay, I want to give the executive listening something they can do this week. Not next quarter. This week. Because there's a lot in there.

SPEAKER_00

Yeah, that's fair.

SPEAKER_02

Let me try this. Five questions every CIO, CISO, C DO should be asking by Friday. Tell me if I've got these right. Go. One, where are agents running in our environment, including ones we did not formally sanction?

SPEAKER_00

Yes. Big yes.

SPEAKER_02

Two, for each agent, what tools and data scopes does it hold? And are those grants just in time and revocable?

SPEAKER_00

That's straight from the research.

SPEAKER_02

3. For destructive or irreversible operations, is human in the loop enforced? By code, not by policy.

SPEAKER_00

By code, not by policy. That's the replit lesson in one sentence.

SPEAKER_02

4. If an agent misbehaves, can we stop it within minutes and roll back its actions?

SPEAKER_01

Mm-hmm.

SPEAKER_02

And five, have we classified our unstructured content repositories, file shares, collaboration tools, ticket systems, code repos, to the same standard as our structured data?

SPEAKER_00

Yes. And the honest answer for 9 out of 10 organizations is no.

SPEAKER_02

Right. And if you can't answer those five with evidence, not opinion, evidence, then you are not yet ready for the AI rollout you've already approved. That's the message. Okay, before we wrap, I want to come back to something. Because in part one, I started skeptical, and to be fair to me, to past me, I think the skepticism was about the tone, not the substance. Sure. Technical conversations about security can sometimes feel like they're being used to slow things down. To say no, to be the department of have you considered? Right. What's actually struck me reading this material is the opposite. There's a line in the report. The organizations that get this right will adopt AI faster and more safely than those that bolt security on later.

SPEAKER_00

Yes, that's the central business case.

SPEAKER_02

Because if you've done the work, classification, oversharing remediation, identity hygiene, environment separation, human in the loop on the big stuff, then your next AI use case lands on a foundation that already works. You're not re-litigating fundamentals every time someone wants to ship a copilot.

SPEAKER_00

Exactly. And the Gartner number that drives me a bit mad, enterprises right now are spending 17 times more on AI to amplify security than on security for AI. 17 to 1. 17 to 1. We're building defences with AI faster than we're building defences for AI. That is a board level question. That is the question.

SPEAKER_02

And Forrester predicted last October that an agentic AI deployment would cause a publicly disclosed data breach this year, leading to employee dismissals.

SPEAKER_00

That prediction was for 2026. Which is now.

SPEAKER_02

Right.

SPEAKER_00

Look, I don't want to leave executives feeling like the only message is be afraid. That's not the right takeaway. The takeaway is these aren't exotic problems, they're fundamentals. And the organizations that do the unglamorous work, classification, identity, least privilege, environment separation, get to deploy AI faster, not slower. With more confidence, not less.

SPEAKER_02

So bringing it home for leaders, and I'll keep it to three.

SPEAKER_00

Go.

SPEAKER_02

One, the agentic era is not theoretical. It is in production with real CVEs and real consequences. Echo Leak was patched in a patch Tuesday. Replit deleted a real database. The Slack AI finding is in the public record. Your roadmap should reflect that reality. Yes. Two, the fix is fundamentals. Classify your unstructured estate, enforce least privilege on agents the way you do on humans, put human in the loop on irreversible actions by code, and propagate sensitivity labels into your vector stores. None of that is exotic. All of it is overdue for most organizations.

SPEAKER_00

Yeah.

SPEAKER_02

And three, stop treating AI for security and security for AI as the same budget line. They are not. The 17 to 1 ratio is a board-level governance failure, waiting to be a board-level incident.

SPEAKER_00

And one more from me, if I'm allowed. Always. When your team says, we can put this content into the vector store, it's just embeddings. Treat that statement as if they had said, we can copy this content into a new database with no access controls. Because the research shows 92% of those embeddings can be inverted back to the original text. They are not anonymized.

SPEAKER_02

Vector stores need the same controls as the source documents.

SPEAKER_00

Same controls, same labels, same audit. That's the bit that gets quietly missed.

SPEAKER_02

Sarah, this has been a properly useful two-parter. I came into part one with my eye roll. I leave part two with a five-question checklist. That's a win. And the line that's going to stick with me better AI still starts with better foundations.

SPEAKER_00

Always.

SPEAKER_02

So that's the homework. Five questions, one Friday, and a properly uncomfortable conversation with whoever owns your AI roadmap.

SPEAKER_00

And one closing thought.

SPEAKER_02

Go on.

SPEAKER_00

Everything we've talked about today EchoLeak, Replit, Slack AI, the 80 to 1 identity ratio, the 17 to 1 spending gap, all of it assumes the same thing. That if you lock the doors, patch the prompts, and put a human in the loop, you're safe. And technically, you might be. The machinery underneath might be beautifully secured.

SPEAKER_02

I feel a but coming.

SPEAKER_00

There's always a but. Secured AI isn't the same as lawful AI. You can have the tightest agency stack in the country and still be in breach of the Privacy Act before lunch. Because nobody asked whether you were allowed to use the data that way in the first place.

SPEAKER_02

Wait, allowed to? We bought the data. We collected it. It's our customers.

SPEAKER_00

That's the bit I get weirdly excited about. There's an Australian case, a hardware retailer. Facial recognition in stores. All of it technically secured, technically working. The OAIC still found them in breach. Not because the system leaked, because the purpose didn't match the consent.

SPEAKER_01

Oh, that's that's a different category of problem.

SPEAKER_00

It is. And there's another one. A medical imaging provider. 30 million patient studies used to train AI models. Same question. Same answer from the regulator. The data was secured. The purpose was the issue.

SPEAKER_02

So everything we just spent two episodes locking down is necessary.

SPEAKER_00

Absolutely necessary. But it's not sufficient. Because the next layer down isn't a control. It's a question. What is this data actually allowed to be used for? And in Australia, that question has a name. It's called APP6.

SPEAKER_02

Alright, that's part three, isn't it?

SPEAKER_00

That's part three.

SPEAKER_02

The purpose problem.

SPEAKER_00

Yep. The purpose problem. Because better AI still starts with better foundations. And it turns out the foundation underneath the foundation is the one most leaders skip.

SPEAKER_02

On that genuinely unsettling note, thank you for listening to AI Beyond the Heart. Hype. Part three drops next week. Bring a coffee. Possibly a lawyer.

SPEAKER_00

See you then.