Why Most Phishing Simulations Fail and What "Realistic" Actually Means in 2026

Written by Brightside Team
Ninety-four percent of organizations run phishing training on a regular schedule. Only 6% achieve full completion. And 69% of IT and security leaders say their employees still lack adequate security awareness.
That gap is a design problem, not a motivation problem. The program architecture assumes phishing is primarily an email issue, and in 2026 that assumption doesn't hold.
Most enterprise phishing programs are built around a single assumption: that testing employees with fake emails is enough to prepare them for real attacks. In 2018, that assumption was defensible. In 2026, it isn't.
Real attackers don't just send emails anymore. They clone executive voices. They run coordinated campaigns that hit employees across email and phone in sequence. They use the same generative AI tools your security team uses to defend against them. If your simulation program doesn't test for any of that, you're not measuring organizational risk. You're measuring email skepticism.
Why do employees still click phishing emails? Training effects decay fast. Research tracks click rates dropping to 3.5% right after training, then climbing back to over 15% within 90 days, even in programs that run regular simulations. Most programs only test email, while real attackers in 2026 also use live voice calls, deepfake video, and coordinated multi-channel attacks that employees have never been trained to recognize.
Why Phishing Training Effects Fade Within Three Months
Right after phishing training, employees click on simulated emails about 3.5% of the time. After 30 days, that rate climbs to 5.7%. After 90 days, it's back above 15%.
This pattern comes from research by UCSD and Beauceron Research, and it holds across industries and organization sizes. The training works, briefly, and then it doesn't. Not because employees stop caring, but because the brain doesn't maintain procedural vigilance without regular reinforcement. This is the forgetting curve, and it's been documented in learning research for over a century. Security training isn't exempt from it.
The research on what actually builds lasting resilience is specific. A 2024 meta-analysis of 42 studies identified one consistently high-performing tactic: delivering feedback at the exact moment an employee fails a simulation. When someone clicks a simulated phishing link and immediately sees an explanation of why the attack worked, their susceptibility drops by about 40% on average. Every other method, including training videos, gamified content, and follow-up email summaries, performs below that benchmark.
A large-scale study with 31,000 participants confirmed the timing point from a different angle. After 12 weeks of phishing simulations, 66% of employees successfully resisted credential-based attacks. The factor that drove that result wasn't content quality or production value. It was consistency and frequency.
That's the core design problem with most enterprise programs. Ninety-four percent of organizations schedule regular phishing training, but only 6% achieve full completion. Annual programs simply can't counter a 90-day decay curve. The issue isn't employee motivation, and it's not the quality of the training content. It's that the program's architecture is structurally incapable of producing lasting behavioral change.
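To make the cadence argument concrete, here is a rough sketch that interpolates between the three click-rate figures quoted above (3.5% at day 0, 5.7% at day 30, 15% at day 90) and averages the result over different retraining intervals. The linear interpolation, the flat rate after day 90, and the function names are illustrative assumptions, not part of the cited research. Since the source only says the rate is "above 15%" at day 90, the numbers this produces are best read as lower bounds.

```python
from bisect import bisect_right

# Points quoted in the article: (days since training, click rate in %).
# Day 90 is stated as "above 15%", so 15.0 is used as a lower bound.
DECAY_POINTS = [(0, 3.5), (30, 5.7), (90, 15.0)]


def click_rate(days_since_training: float) -> float:
    """Linearly interpolate the click rate between the quoted points.

    Assumption: the rate stays flat after day 90 (it may well keep rising).
    """
    days = [d for d, _ in DECAY_POINTS]
    if days_since_training >= days[-1]:
        return DECAY_POINTS[-1][1]
    i = bisect_right(days, days_since_training)
    (d0, r0), (d1, r1) = DECAY_POINTS[i - 1], DECAY_POINTS[i]
    return r0 + (r1 - r0) * (days_since_training - d0) / (d1 - d0)


def average_click_rate(interval_days: int) -> float:
    """Mean daily click rate over one cycle if training repeats every interval_days."""
    return sum(click_rate(d) for d in range(interval_days)) / interval_days


for interval in (30, 90, 365):
    avg = average_click_rate(interval)
    print(f"retrain every {interval:>3} days -> average click rate {avg:.1f}%")
```

Under these assumptions, a monthly cadence keeps the average close to the post-training rate, while an annual cadence spends most of the year at the fully decayed rate.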
What "Realistic" Meant Before and What It Means Now
The word "realistic" appears in nearly every phishing simulation vendor's marketing. But what it means has changed significantly over the last decade, and most organizations are still measuring simulation quality against a definition that's several years out of date.
Generation 1: The Convincing Email (2010–2018)
The first generation of phishing simulation focused on template quality. A realistic simulation was one that looked like a genuine phishing email: correct logo, working link, plausible sender address, believable pretext. Tools like GoPhish, early KnowBe4, and Microsoft's Attack Simulation Training defined this era. The primary metric was click rate, and the benchmark for a successful simulation was whether employees could identify a suspicious-looking email.
Generation 2: The Personalized Email (2018–2023)
The second generation raised the bar. OSINT-driven personalization meant simulations weren't just generically convincing. They were targeted. A well-crafted spear phishing email addressed the employee by name, referenced their actual job role and department, mentioned a tool they genuinely used, and appeared to come from someone in their professional network.
The results improved meaningfully. KnowBe4's benchmark data shows that consistent, personalized programs can reduce the average Phish-Prone Percentage from 33.1% to under 4% over 12 months, which is a meaningful outcome. But the simulation surface was still limited to email, and the underlying assumption, that phishing is primarily an email problem, was already becoming outdated.
Generation 3: The Multi-Vector AI Attack (2023–Now)
Since 2023, the threat environment has shifted in a way that email personalization alone can't address. Generative AI tools are now widely accessible to anyone, and research confirms that AI-generated spear phishing achieves click rates competitive with professionally crafted campaigns at near-zero cost per email. The ceiling on email-based attack realism has permanently risen because it's no longer gated by attacker skill or effort.
But the more significant shift is channel diversification. Sophisticated attackers now coordinate voice calls, deepfake audio, and video impersonation alongside email. They use voice clones of executives. They run hybrid attacks where the email establishes a pretext and the follow-up phone call provides the pressure to act. Most enterprise simulation programs have no equivalent for any of these.
A realistic simulation in 2026 means one thing: does it use the same methods a real 2026 attacker would use? For most organizations, the honest answer is no.
Five Attack Surfaces That Email Simulations Don't Cover
Gap 1: Voice Phishing (Vishing)
Phone-based social engineering isn't a niche tactic. Voice attacks appear in a significant share of social engineering breaches, and they bypass every email security control an organization has built. There's no spam filter for a phone call. There's no link to hover over.
A live phone call also creates a different psychological environment than an email. Urgency, authority, and trust are harder to resist in real-time conversation than in text. An employee who knows to pause before clicking an email link has no equivalent habit trained for a live caller claiming to be IT support.
If your employees have only ever trained on email, they haven't developed a response pattern for the moment a confident voice on the phone asks them to reset a password or confirm account access. That gap is what a vishing simulation needs to close: an outbound AI call using a realistic persona, a configured tactic set, and a voice that matches the attacker scenario.
Gap 2: Coordinated Hybrid Attacks
Real attackers frequently use multiple channels in sequence rather than in isolation. A common pattern: the phishing email establishes a plausible pretext, and then a follow-up phone call creates urgency to act on it. The email and the call each appear more credible because the other one exists.
Simulating email and phone as separate exercises doesn't capture this dynamic. An employee who's suspicious of the email might be fully disarmed by a convincing follow-up call, and vice versa. The attack works specifically because it spans channels, and the training has to test that.
This requires a single coordinated workflow where the email and the voice call are part of the same campaign, tracked together as one simulated attack sequence. Most platforms treat email and vishing as separate product modules. That approach tests isolated channel awareness, not the actual hybrid threat pattern.
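One way to picture "tracked together as one simulated attack sequence" is a single record that holds every step of the campaign for an employee. The sketch below is a hypothetical data model, not any vendor's schema. The class names, fields, and outcome labels are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class StepResult:
    channel: str            # "email" or "voice" (hypothetical labels)
    sent_at: datetime
    failed: bool = False    # True if the employee clicked or complied
    reported: bool = False  # True if the employee reported the attempt


@dataclass
class HybridSimulation:
    """One simulated attack sequence: an email pretext plus a follow-up call."""
    employee_id: str
    steps: list[StepResult] = field(default_factory=list)

    def outcome(self) -> str:
        # The sequence counts as failed if the employee complied on ANY channel,
        # which separate per-channel reports would not show as one event.
        if any(s.failed for s in self.steps):
            return "failed"
        if any(s.reported for s in self.steps):
            return "reported"
        return "resisted"

    def failed_channel(self) -> Optional[str]:
        """Return the first channel on which the employee complied, if any."""
        for s in self.steps:
            if s.failed:
                return s.channel
        return None


# Example: the employee ignores the email but complies on the follow-up call.
start = datetime(2026, 1, 12, 9, 0)
sim = HybridSimulation("emp-001", [
    StepResult("email", start),
    StepResult("voice", start + timedelta(minutes=20), failed=True),
])
print(sim.outcome(), sim.failed_channel())  # prints "failed voice"
```

An email-only report would score this employee as a pass. The combined record shows that the sequence as a whole succeeded.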
Gap 3: Executive Voice Impersonation
AI voice cloning is no longer expensive or technically complex. An attacker can clone a CEO's voice from a publicly available recording, a LinkedIn video, or an earnings call, and use it to request an urgent wire transfer or credential reset over the phone. For an employee on the receiving end, a familiar voice is sufficient pressure to act.
UK engineering firm Arup lost £20 million to a deepfake attack in 2024. The attack used a convincing video call, not just audio. The technology that made that possible has only become cheaper and more accessible since.
If employees have never received a simulated call from what sounds like their own CEO, they have no reference point for recognizing that attack when it's real. Testing this specific threat requires a platform that supports custom voice cloning from a real recording, not just a library of preset synthetic voices.
Gap 4: Deepfake Video
Deepfake video has crossed from theoretical concern to documented enterprise threat. The Arup incident involved a multi-person video call where every participant except the victim was AI-generated. The victim transferred the funds because the attack was convincing in a medium they trusted.
Awareness training that explains deepfakes exist doesn't prepare employees for that scenario. Watching a module about how deepfakes work is categorically different from receiving a convincing AI-generated video call under real pressure to act. The training has to match the threat type.
Genuine deepfake simulation means employees encounter a realistic AI-generated video interaction as the test, not a written explanation of the attack pattern. Very few platforms in the market offer this as an actual simulation capability rather than awareness content.
Gap 5: Real-Time Adaptive Conversation
Skilled social engineers don't follow a fixed script. When an employee pushes back, "I need to verify this with someone else," or "Can you send that request in an email?", an attacker escalates. They use urgency, authority, or social proof to overcome the objection and keep the conversation moving toward the goal.
A voicemail, a pre-recorded call, or a template-based script can't do this. If the simulation ends the moment the employee asks a question, it's not testing the part of the attack that's actually hard to resist. The realistic version of this threat is a live AI conversation that maintains the caller persona, applies configured psychological tactics, and adapts in real time to whatever the employee says. That's also the technically hardest capability to build, which is why it distinguishes the most capable platforms from the rest of the market.
How to Evaluate Whether a Phishing Simulation Platform Is Actually Realistic
Before evaluating any vendor, it helps to have a clear set of criteria that apply across the market. These five questions are designed to be platform-neutral. Ask them in any sales conversation and the answers will tell you what you need to know.
1. Does it simulate live voice calls, not just voicemails or templates?
Scripted voicemail drops test whether employees recognize a suspicious recording. They don't test resistance to real-time social pressure. Look for live outbound AI calls that respond dynamically to the employee, not pre-recorded audio.
2. Can email and phone run as a single coordinated campaign?
Separate email and vishing modules don't replicate hybrid attacks. Look for a platform where one campaign template triggers both the email and a coordinated follow-up call, with outcomes tracked together as one simulation event.
3. Does it support custom voice cloning from a real recording?
Generic preset voices simulate cold calls from unknown strangers. They don't test executive impersonation. A platform with custom voice cloning from an uploaded recording lets you simulate what it actually sounds like when the attacker is impersonating a specific person your employees know.
4. Is simulation difficulty mapped to a recognized standard?
Most platforms have no formal difficulty calibration at all. Alignment with the NIST Phish Scale gives you a consistent, defensible benchmark for building progressive difficulty over time. It also makes it easier to demonstrate risk reduction in a way that holds up in a board presentation or an audit.
5. Does it offer deepfake video simulation, not just deepfake awareness content?
These two things are not the same. An explainer module tells employees the threat exists. A simulation puts them in the situation under pressure. Ask vendors to be explicit about which one they're offering, and whether employees are passive observers or active participants.
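The five questions can also be run as a simple internal checklist before any vendor conversation. The sketch below is a minimal illustration. The criterion keys and the example program are hypothetical, and each answer is reduced to yes or no, which loses any "partial" nuance.

```python
# Hypothetical keys, one per evaluation question above.
CRITERIA = [
    "live_voice_calls",
    "hybrid_email_voice",
    "custom_voice_cloning",
    "standard_difficulty_mapping",
    "deepfake_simulation",
]


def audit(capabilities: dict[str, bool]) -> list[str]:
    """Return the criteria a program does not meet; missing keys count as gaps."""
    return [c for c in CRITERIA if not capabilities.get(c, False)]


# Example: an email-only program that does calibrate difficulty to a standard.
email_only_program = {"standard_difficulty_mapping": True}
gaps = audit(email_only_program)
print(f"{len(CRITERIA) - len(gaps)}/{len(CRITERIA)} criteria met; gaps: {', '.join(gaps)}")
```

Treating an unanswered question as a gap is a deliberate choice here: if nobody on the team can say whether a capability is covered, it is safer to assume it is not.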
How Leading Phishing Simulation Platforms Compare on Realism
Applying those five criteria to the current market divides platforms into two distinct groups: those built for multi-vector AI simulation from the start, and those built for email simulation that have added limited voice or video capabilities over time. Both groups have genuine strengths. The right choice depends on which attack surfaces your organization actually needs to cover.
Multi-Vector Simulation Platforms
These platforms meet most or all of the five criteria above.
| Platform | Live AI vishing | Hybrid email + voice | Custom voice cloning | NIST difficulty | Deepfake simulation |
|---|---|---|---|---|---|
| Brightside AI | Yes | Yes | Yes | Yes | Yes |
| Adaptive Security | Yes | Partial | Yes | No | Yes |
| Jericho Security | Yes | No | Yes | No | No |
Brightside AI runs live outbound AI phone calls with a configurable caller persona, psychological tactic builder, and custom voice cloning from a short recording upload. Hybrid email-plus-call campaigns run as a single template workflow, and simulation difficulty is mapped to the NIST Phish Scale. Deepfake video simulation is available. Admins can test a full simulation in the browser before it goes live.
Adaptive Security uses conversational red-team AI agents across email, phone, SMS, and video. It's particularly strong for organizations facing executive impersonation risk. The platform also covers OWASP LLM Top 10 and prompt injection scenarios, and it recently raised $81M in Series B funding from NVIDIA, Bain Capital Ventures, and the OpenAI Startup Fund.
Jericho Security generates novel phishing pretexts on demand using generative AI rather than selecting from a template library. It offers live adaptive AI vishing and works best for organizations that prioritize maximum scenario variety over structured template-based design.
Email-First Platforms
These platforms are strong for email simulation and OSINT personalization but have limited or no coverage of live voice, hybrid attacks, and deepfake simulation.
| Platform | Live AI vishing | Hybrid email + voice | Custom voice cloning | NIST difficulty | Deepfake simulation |
|---|---|---|---|---|---|
| KnowBe4 | No | Partial (Diamond tier) | No | No | Content only |
| Hoxhunt | No | No | No | No | No |
| Proofpoint | No | No | No | No | No |
| SoSafe | No | No | No | No | No |
KnowBe4 is the largest platform in the category by customer count, serving around 70,000 organizations. It's strongest for email simulation volume, OSINT personalization, and compliance reporting. Its Diamond tier includes a Callback Phishing feature that combines a phishing email with an inbound call scenario, but this isn't an outbound AI-powered call. Deepfake content is available as awareness training rather than simulation.
Hoxhunt leads the market on adaptive email simulation difficulty and behavior change with low admin overhead. Its model continuously adjusts simulation complexity based on individual performance, and it has documented a 63% reduction in repeat phishing victims within six months. It doesn't offer live vishing or deepfake simulation.
Proofpoint is most valuable when it's already your primary email security platform. The Satori agent uses real threat intelligence from Proofpoint's email gateway to inform simulation content, which is a meaningful advantage for buyers already in the Proofpoint ecosystem. It has no live vishing or deepfake simulation capabilities.
SoSafe is the leading choice for European enterprises that require GDPR compliance, EU data residency, and multilingual support across 30+ languages. Its behavioral analytics and adaptive email difficulty are well-regarded. It offers no live vishing or deepfake simulation.
Three Steps to Improve Your Simulation Program This Quarter
If your current program is email-only, these three actions will move it in the right direction without requiring a full platform change on day one.
1. Audit your current simulation coverage against the five questions above.
Most organizations running email-only programs will find gaps at questions one, two, and three immediately. That audit doesn't require a vendor conversation. It's the baseline assessment that tells you what you're actually testing and what you're leaving uncovered.
2. Add a remediation loop at the moment of failure.
Before you change platforms or add new attack vectors, make sure every failed simulation triggers immediate, non-punitive feedback. A landing page that appears the moment an employee clicks and explains what made the attack convincing produces an average reduction in susceptibility of about 40%, according to the 2024 meta-analysis of 42 studies cited earlier. It's the highest-impact single change you can make to an existing program, and most platforms already support it.
3. Run a vishing simulation before your next board review.
Email click rates are a familiar metric. Most boards and leadership teams have seen them before. Fail rates on a live AI phone call, especially one using an executive voice clone, tell a different story about organizational readiness and are often more persuasive for making the case to expand the simulation program beyond email.


