Reuters recently tested 6 major LLMs (Grok, ChatGPT, Meta AI, Claude, DeepSeek, Gemini) to assess whether they'd create phishing content... with minor prompt adjustments, 4 out of 6 complied — yikes!
THE INVESTIGATION
Reporters from Reuters requested phishing emails targeting elderly people, fake IRS/bank messages, and tactical scam advice.
THE RESULTS
• Despite initial refusals across the board, relatively simple prompt modifications bypassed safety guardrails.
• Grok, for example, generated a fake charity phishing email targeting the elderly with urgency tactics like "Click now to act before it's too late!"
• When tested on 100 California seniors, the A.I.-generated messages persuaded some participants to click on malicious links, often because the messages seemed urgent or familiar.
REAL-WORLD IMPACT
• The FBI reports phishing is the #1 cybercrime in the U.S., with billions of messages sent daily.
• BMO Bank, as one corporate example, currently blocks 150,000-200,000 phishing emails targeting its employees every month... a bank representative says the problem is escalating: "The numbers never go down, they only go up."
• Cybersecurity experts say that criminals are already using A.I. for faster, more sophisticated phishing campaigns.
IMPLICATIONS FOR THOSE OF US IN THE AI INDUSTRY
• LLM misuse is an industry-wide challenge affecting all major frontier labs.
• Reveals a fundamental tension between making AI "helpful" and making it "harmless," highlighting the need for more robust safety guardrails across AI systems.
KEY TAKEAWAYS
• For A.I. Builders: Keep security implications front and center when developing applications (see the hypothetical guardrail sketch after this list).
• For users: The same LLMs that help you write emails can help bad actors craft convincing scams... stay vigilant and educate vulnerable populations (e.g., seniors) about A.I.-enhanced phishing threats, because these scams are only going to become more convincing and more frequent.
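To make the builders' takeaway concrete, here is a minimal, hypothetical sketch (not from the Reuters investigation or the episode) of one extra guardrail an application could add: screening LLM-generated email drafts for the kinds of urgency cues described above before anything is sent. The patterns and function names below are illustrative assumptions only; a production system would rely on a trained classifier or a vendor moderation API rather than a handful of regexes.

```python
# Hypothetical illustration: hold LLM-generated email drafts for human review
# if they contain common phishing-style urgency cues. Patterns are examples only.
import re

SUSPICIOUS_PATTERNS = [
    r"act (now|immediately|before it'?s too late)",
    r"verify your (account|identity|password)",
    r"click (here|now|the link below)",
    r"your account (has been|will be) (suspended|locked)",
    r"wire transfer|gift card|crypto(currency)? payment",
]

def flag_phishing_signals(draft: str) -> list[str]:
    """Return the suspicious patterns found in an LLM-generated email draft."""
    return [p for p in SUSPICIOUS_PATTERNS if re.search(p, draft, re.IGNORECASE)]

if __name__ == "__main__":
    draft = "Your account has been suspended. Click now to act before it's too late!"
    hits = flag_phishing_signals(draft)
    if hits:
        print("Draft held for human review; matched patterns:", hits)
    else:
        print("No obvious phishing signals detected.")
```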
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.