SCAM (Security Comprehension Awareness Measure) is an open-source benchmark
by 1Password that tests whether AI agents can complete realistic workplace
tasks without falling for embedded security threats like phishing, social
engineering, and credential theft.

Below is a recorded evaluation. The agent was given a routine task with
a threat hidden in the environment. Everything is sandboxed; no real
credentials, emails, or systems were involved.
agentic_phishing
Tested on claude-opus-4-6
phish-shared-doc D4
Shared-document link leading to a credential-harvesting page