
What Is Deepfake BEC? How Voice Cloning Replaced the Wire Transfer Email

Articles

Written by

Brightside Team

Deepfake BEC (Business Email Compromise) is a fraud attack where criminals use AI-generated voice or video to impersonate a trusted executive, most often the CFO or CEO, and manipulate finance staff into authorizing a wire transfer or handing over credentials. Unlike classic BEC, which relies entirely on email, deepfake BEC combines synthetic voice with traditional social engineering across multiple channels. It's harder to spot, harder to filter, and costs organizations significantly more per incident.

In February 2024, a finance employee at a multinational firm in Hong Kong transferred $25 million to fraudsters after joining a video conference where every other participant, including the CFO, was a deepfake. He had doubts before the call. The deepfake video erased them. That case made headlines. What didn't get as much attention is how quickly the tools behind it became accessible to anyone willing to pay for them.

What Classic BEC Looks Like, and Why It Still Works

Traditional BEC is straightforward. An attacker researches a company, identifies a finance employee and an executive whose name carries authority, then sends an email that appears to be from that executive. The email creates urgency: an acquisition needs funding, a vendor needs payment, a regulatory deadline is today. The request is for a wire transfer.

It works because it exploits two things that have nothing to do with technology: trust in authority, and pressure to act fast. Finance staff are trained to process legitimate urgent requests from executives. That's their job. BEC frames fraud as a legitimate urgent request and lets human psychology do the rest.

Classic BEC caused $2.77 billion in reported losses in the United States in 2024, across 21,442 confirmed complaints to the FBI. And that's only what gets reported. The actual number is higher.

But email-based BEC has limits. Email security tools, specifically DMARC, SPF, and DKIM (authentication protocols that verify whether a sender's domain is legitimate), started catching spoofed addresses. Security awareness training taught employees to scrutinize sender fields and hover over links before clicking. The attack became more recognizable. Attackers noticed, and moved to a channel that none of those tools could touch.
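For context on what those email protocols actually assert, here is a minimal sketch, in Python with a fabricated message and lookalike domain, of how a filter might read the Authentication-Results header (RFC 8601) where a receiving mail server records SPF, DKIM, and DMARC outcomes. The message and domain names are invented for illustration:

```python
from email import message_from_string

# Fabricated message for illustration; real values come from your own
# mail server's Authentication-Results header (RFC 8601).
raw = """\
Authentication-Results: mx.example.com;
 spf=pass smtp.mailfrom=ceo@example.com;
 dkim=pass header.d=example.com;
 dmarc=fail header.from=examp1e.com
Subject: Urgent wire transfer

Please process today.
"""

msg = message_from_string(raw)
results = msg["Authentication-Results"].replace("\n", " ")

# Flag the message if any of the three checks did not come back "pass".
checks = {proto: f"{proto}=pass" in results for proto in ("spf", "dkim", "dmarc")}
suspicious = not all(checks.values())
print(checks, "-> quarantine" if suspicious else "-> deliver")
```

The point of the sketch is the limitation it exposes: every check here inspects email headers. A phone call or video meeting never produces an Authentication-Results header, so none of this machinery ever sees the deepfake channel.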

What Deepfake BEC Actually Is

Deepfake BEC takes the same objective, getting a finance employee to authorize a transfer, and replaces the spoofed email with something far more convincing: the sound of a voice the employee already trusts.

The attack runs in four stages.

  1. Intelligence gathering. Attackers collect audio of the target executive from publicly available sources: earnings calls, investor webinars, conference recordings, YouTube interviews, LinkedIn videos. A CFO who has presented at a company all-hands has already provided the source material. Three seconds of clean audio is enough to start building a voice model. Thirty seconds produces a high-quality clone.

  2. Voice clone creation. Using widely available AI tools, many of them cheap or free, the attacker trains a voice model on the collected audio. The resulting model generates speech in the executive's voice on demand, with the right accent, cadence, and intonation.

  3. Multi-channel execution. The attack rarely arrives as a single call. A phishing email establishes the pretext first: an urgent acquisition, a confidential regulatory issue, a vendor payment that can't wait. Then a follow-up call arrives from the CFO's cloned voice, reinforcing the instruction. For higher-value targets, the sequence ends with a deepfake video meeting that appears to confirm everything in person.

  4. Fund extraction. The finance team member, deceived by multiple independent signals that all point to the same conclusion, authorizes the transfer.

Why the Multi-Channel Design Is What Makes It Work

Each channel in the sequence does a specific job. The email arrives first and frames the situation: a confidential acquisition, an urgent regulatory deadline. The phone call follows, and this is where the voice clone does its real work. The finance employee hears a familiar voice confirming everything. By the time a video meeting appears to close out the sequence, doubt has nowhere to go.

This is the part most phishing training misses entirely. Classic security awareness programs teach employees to scrutinize email cues: the sender domain, unexpected links, urgency language. None of that pattern recognition applies when the attack arrives as a live phone call from a voice that sounds exactly like the CFO, following up on an email the employee already received.

The authority gradient makes this worse. A steep authority gradient means the power gap between an executive and a junior finance employee is wide enough that the employee feels pressure to comply without questioning. Finance teams work under exactly this condition. When the voice on the call is the CFO's and the request is marked urgent and confidential, the organizational instinct is to act rather than pause and verify.

Traditional BEC vs. Deepfake BEC: What Changed

| Dimension | Traditional BEC | Deepfake BEC |
|---|---|---|
| Primary channel | Email only | Email + phone call + video conference |
| Identity signal | Spoofed domain or compromised inbox | AI-cloned voice, face, and mannerisms |
| Main detection cue | Odd sender domain, grammar errors, unexpected request | Near-zero: 68% of deepfakes are indistinguishable from authentic media |
| Audio or video required? | No | Yes: as little as 3 seconds of source audio |
| Blocked by email security tools? | Partially (DMARC, SPF, DKIM) | No: the attack runs over phone or video |
| Average loss per incident | Lower | $500,000+ for enterprises; ~$680,000 at larger organizations |
| Human detection under pressure | Moderate: visible cues in the email | Low: detection rate for high-quality synthetic voice is around 24.5% |
| Primary human defense | Train employees to question email signals | Train employees to verify every request, regardless of how authentic the voice sounds |

Voice phishing attacks surged 442% from the first half of 2024 to the second half, according to CrowdStrike's 2025 Global Threat Report. AI-enabled fraud is projected to reach $40 billion by 2027. Those numbers reflect a structural change in how attacks are built, not a seasonal spike.

Three Myths Getting Finance Teams Compromised

Myth 1: "We'd spot it. AI voices don't sound real enough to fool us."

This was accurate a few years ago. It isn't now. When a scammer cloned the Ferrari CEO's voice in July 2024, the attack wasn't stopped because the target heard something artificial. It was stopped because the attacker couldn't answer a personal question the target asked to verify identity. The voice was convincing. The verification protocol is what caught it, not the employee's ear.

Expecting employees to detect a voice clone by ear isn't a security control. Human detection rates for high-quality synthetic voice sit at around 24.5% even among security-aware individuals. That means in roughly three out of four attempts, a well-built clone gets through undetected.

Myth 2: "We have email security tools — we're covered."

DMARC, secure email gateways, and anti-phishing filters do valuable work. But deepfake BEC doesn't need your email infrastructure. The attack runs over phone or video, entirely outside the reach of any email-layer control. An organization can have flawless email security and still lose $25 million to a CFO deepfake call. These are parallel attack surfaces, and protecting one does nothing to protect the other.

Myth 3: "Our employees passed the phishing training. They know what to look for."

Phishing training builds one specific skill: pattern recognition for suspicious email signals. Odd sender domain? Check. Unexpected link? Hover before clicking. Urgent request? Pause and think. That skill set is genuinely useful against email-based attacks. It transfers to nothing when the threat is a live phone call from a voice that sounds like the person the employee reports to.

Recognizing a phishing email and responding correctly to a convincing deepfake call are completely different cognitive tasks. An employee with a perfect phishing quiz score has exactly zero trained response to an AI voice caller. They've never practiced that one.

The Business Cost Beyond the Wire Transfer

The immediate cost is the obvious one: a fraudulent wire transfer that's rarely recoverable in full. But the downstream costs are what most risk assessments undercount.

Finance directors who authorize transfers without documented verification protocols face personal fiduciary exposure in some jurisdictions. If an organization's internal controls don't require a callback or dual approval for large transfers, and a deepfake attack exploits that gap, the question of who bears responsibility becomes complicated — and sometimes legal.

Cyber insurance policies add another layer of exposure that many organizations haven't examined. Some policies that cover standard BEC fraud explicitly exclude deepfake-originated wire fraud. Organizations that believe they're insured against this category of loss may not be. That needs to be verified with a broker before an incident, not after.

Investigation creates its own problems. Phone-based deepfake attacks leave no audio artifact to analyze after the fact. The attacker called, the voice conversion happened in real time, the call ended, and nothing forensically identifiable remains. This creates a genuine attribution challenge: was it an external deepfake attack, or an insider using the plausibility of deepfake fraud as cover? Organizations face that question without the evidence to answer it cleanly, which complicates both the insurance claim and the internal review.

Finally, there's regulatory exposure for finance and banking-sector organizations. An undocumented deepfake fraud incident raises questions about the adequacy of internal financial controls, questions that auditors and regulators are increasingly prepared to ask.

Why Awareness Training Alone Can't Prepare Finance Teams

Security awareness training programs have a well-documented limitation: they improve knowledge about a threat without reliably changing behavior when the threat appears. An employee who can describe exactly how a deepfake BEC attack works, the voice clone, the multi-channel sequence, the authority pressure, can still comply with a realistic live call under pressure. Knowing what the attack is doesn't automatically produce the right response to it.

The verification reflex, the instinct to pause and call back on a known number before acting, requires practice under realistic conditions. Reading a policy document doesn't build that reflex. A persuasive email warning employees about deepfakes doesn't build it either.

Effective training for deepfake BEC starts by resetting what employees expect a fake voice to sound like. Most people's mental model of "AI audio" is robotic and obvious; they've never heard a high-quality clone. From there, training has to give finance staff practice invoking the out-of-band verification protocol while a convincing voice is applying pressure on the other end. And it needs to test whether the full verification chain actually works when someone uses it, not just whether the policy exists on paper.

Generic vishing simulations, where an employee receives a call from an unfamiliar preset AI voice with an invented name, provide some value. But they don't condition the specific response that matters. When the real attack arrives, it won't sound like an unknown AI voice. It'll sound like the CFO. If employees have only ever been tested against unfamiliar voices, they haven't been tested against the actual threat.

How Simulation Platforms Handle Deepfake BEC Training

The critical differentiator among security awareness training platforms for this specific threat is whether the platform can simulate the voice employees actually trust, or whether it can only play a generic AI voice the employee has never heard.

A preset AI voice with an unfamiliar name raises low-level suspicion. The CFO's cloned voice, deployed in a realistic scenario with proper social engineering tactics, does what a real attacker would do. It tests the exact behavior that needs to be conditioned.

Platforms are compared across five capabilities: live AI vishing, custom executive voice cloning, hybrid attack (voice + email), deepfake video simulation, and vishing-specific metrics. △ = partial or conditional capability.

  - Brightside: ✅ custom executive voice cloning (self-serve: admin uploads a 1–2 minute recording); ✅ hybrid voice + email attacks in a single unified workflow; ✅ vishing-specific metrics (answer rate, fail rate, call duration)
  - Jericho Security: △ vishing documented in secondary sources only (managed service, not self-serve); ❌ no single hybrid workflow; △ deepfake video as a managed service only
  - Hoxhunt: △ custom executive voice available, but not via self-serve admin upload; △ hybrid attack via email plus a mock video call page; △ mock video call rather than an autonomous simulation
  - Adaptive Security: △ live-call format not publicly confirmed; △ uses open-source executive data, self-serve cloning unconfirmed; △ hybrid attack via email + voicemail combination
  - KnowBe4: △ vishing available at Diamond tier only; △ deepfake content is educational only, not a simulation attack
  - Proofpoint: △ vishing is not a core documented feature

How Brightside's Executive Voice Cloning Works

Brightside's vishing simulator gives admins full control over the attack scenario, including the voice the simulation uses.

An admin navigates to the Voices Library and uploads a 1–2 minute recording of the executive they want to simulate: a clip from an earnings call, a conference presentation, or an internal all-hands recording. The platform's AI clones the voice and adds it to the custom voices library alongside preset options.

When building the simulation template, the admin works through five structured steps: define the attack goal (what the call is trying to extract), set the caller persona (auto-filled by AI from the attack goal), choose the social engineering tactics (authority impersonation, fear/urgency, commitment escalation, or a full recommended strategy the platform generates automatically), select the voice (Step 4, where the cloned voice is chosen), and configure the conversation tone.

For a hybrid attack, the same workflow generates a coordinated phishing email with a trackable link. Both the call and the email launch as a single campaign.

Before launching, the admin can test the simulation in-browser, hearing how the AI responds in real time, checking the voice quality, and adjusting the scenario if needed. When deployed, the AI agent conducts live phone calls using the cloned voice, adapts to what the target says, and maintains the configured persona throughout. Employees who invoke the verification protocol pass. Those who comply with the request before verifying fail and receive automatic follow-up training.

The vishing dashboard tracks answer rates, fail rates by time period, and median call duration, giving security teams the data to show board-level improvement over time and flag high-risk employees for targeted follow-up.
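As a rough illustration, the dashboard metrics described above reduce to simple aggregates over call records. The record fields and values below are assumptions made for the sketch, not Brightside's actual data model:

```python
from statistics import median

# Hypothetical call records; field names are invented for illustration.
calls = [
    {"answered": True,  "complied": True,  "duration_s": 310},
    {"answered": True,  "complied": False, "duration_s": 95},
    {"answered": False, "complied": False, "duration_s": 0},
    {"answered": True,  "complied": False, "duration_s": 140},
]

answered = [c for c in calls if c["answered"]]

# Answer rate: share of simulated calls the employee picked up.
answer_rate = len(answered) / len(calls)

# Fail rate: share of answered calls where the employee complied
# with the request before verifying.
fail_rate = sum(c["complied"] for c in answered) / len(answered)

# Median duration of answered calls, in seconds.
median_duration = median(c["duration_s"] for c in answered)
```

Tracking these three numbers per campaign over time is what turns individual drills into the board-level trend line the paragraph above describes.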

Try our vishing simulator

Experience the most advanced voice phishing simulator built for security teams. Create scenarios, test voice cloning, and explore automation features.

Five Controls Every Finance Team Should Have in Place

These five controls don't require new technology. They require process discipline and simulation-based training to make sure they hold under pressure.

  1. Out-of-band callback rule. Any financial request that arrives by phone must be verified by calling back on a number from the company's own directory, not a number provided during the call. This single control stops most deepfake BEC attacks cold. It only works if employees have practiced it and won't skip it under urgency pressure.

  2. Pre-shared challenge phrases. Rotating code words, known only to the executive and key finance staff, that can be used to verify identity during sensitive requests. The Ferrari case demonstrated exactly why this works: the attacker couldn't answer a personal question the target asked, and the fraud was stopped.

  3. Dual-approval for high-value transfers. Any wire transfer above a defined threshold requires sign-off from two authorized officers before execution. This removes the single point of failure that attackers target.

  4. Mandatory time buffer. A required 15–30 minute delay on all high-value transactions removes the urgency pressure that makes deepfake BEC effective. If the CFO's voice clone calls and says "this needs to happen in the next ten minutes," a mandatory waiting period means that pressure has nowhere to go.

  5. Dual-channel confirmation rule. Any sensitive action must be confirmed through at least two independent channels. A request that arrives on one channel isn't approved without verification on a separate one. An email followed by a call from the same attacker is still one attacker; dual-channel confirmation with a number the employee supplies, not the caller, breaks the attack chain.
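Taken together, the five controls above can be sketched as a single approval gate that a payment workflow consults before releasing funds. This is an illustrative model with assumed thresholds and field names, not a drop-in implementation:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Illustrative policy values; the threshold and delay are assumptions,
# not recommendations from any standard.
HIGH_VALUE_THRESHOLD = 50_000
TIME_BUFFER = timedelta(minutes=30)

@dataclass
class WireRequest:
    amount: float
    received_at: datetime
    callback_verified: bool = False               # control 1: out-of-band callback done
    approvals: set = field(default_factory=set)   # control 3: authorizing officers
    channels: set = field(default_factory=set)    # control 5: independent channels

def may_execute(req: WireRequest, now: datetime) -> bool:
    """Return True only if every applicable control is satisfied."""
    if not req.callback_verified:
        return False                              # callback on a directory number
    if len(req.channels) < 2:
        return False                              # dual-channel confirmation
    if req.amount >= HIGH_VALUE_THRESHOLD:
        if len(req.approvals) < 2:
            return False                          # dual approval for high value
        if now - req.received_at < TIME_BUFFER:
            return False                          # mandatory time buffer
    return True
```

The design choice worth noting: the gate is deliberately conjunctive. An attacker who defeats one control (say, by supplying a callback number) still fails the others, which is exactly the layered property the list above is describing. Challenge phrases (control 2) are omitted here because they live in the human conversation, not in the payment system.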

These controls protect only as much as the behavior they produce. Policy documents describe the controls. Simulation tests whether employees actually invoke them when a convincing AI voice is applying pressure on the other end of the line.

Three Steps to Assess Your Exposure Right Now

Step 1: Audit your current training program. Ask whether it includes any simulation of a live AI voice call targeting finance staff. If it tests only email phishing, it isn't testing the attack vector that carries the largest per-incident financial loss. That's a measurable gap in your security posture.

Step 2: Test the verification chain, not just awareness. Run a realistic deepfake drill against your accounts payable team using the CFO's actual cloned voice. Does the employee invoke the callback protocol? Do they know which number to call? Does the protocol hold when the "CFO" says the callback isn't necessary because the deal closes in twenty minutes? The only way to know is to run the drill.

Step 3: Review your cyber insurance policy before an incident. Ask your broker whether voice-cloning-originated wire fraud is covered under your current policy. Many policies that cover email-based BEC explicitly exclude deepfake fraud. Find out now, while there's time to act on the answer.

Brightside's vishing simulator lets you run a deepfake BEC drill using your CFO's actual cloned voice, so your finance team practices the exact response they'll need when a real call arrives.