The "Lethal Trifecta": There are three factors in agentic A.I. systems that may mean they will never be safe in production. I summarize them and provide potential solutions below.

THE LETHAL TRIFECTA

The dangerous combination occurs when an A.I. system simultaneously has:

1. Access to private data (enterprise databases, customer records, etc.)

2. Exposure to untrusted input (emails, user queries, external data sources)

3. Ability to communicate externally (sending emails, making API calls, generating links)

IMPLICATIONS

• Large language models (LLMs) are naturally compliant; they don't distinguish between data and instructions.

• Malicious instructions hidden in data can trigger prompt injection attacks.

• Result: A.I. systems can be manipulated to read sensitive data and exfiltrate it externally.

• This isn't theoretical... it's already happened... for example, the now-patched "EchoLeak" in Microsoft Copilot: A single malicious email made Copilot extract private documents and hide data inside hyperlinks, thereby clicking sent information straight to attackers.

HOW TO PROTECT YOUR ORG

Ideally, break the trifecta: Remove at least one element. For example, if A.I. handles untrusted inputs, don't give it access to sensitive data or external communications.

If breaking the trifecta is impossible for your use case:

• Dual-model sandboxing: Separate untrusted and trusted A.I. models with constrained interfaces.

• Structured execution frameworks (like Google's CaMeL) that verify actions before execution.

• Minimal access privileges... A.I. gets only what it absolutely needs.

• Input sanitization and output constraints.

• Human-in-the-loop for high-stakes actions.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.