Voice as proof - that era is over
Between 2024 and 2026, AI voice cloning went from research toy to mass-market tool. Services that once needed extensive training data can now copy a voice from as little as three seconds of publicly available audio - a LinkedIn video, a podcast clip, a YouTube talk. The clone can be good enough to pass on a phone call.
Familiar voice ≠ verified person
Voice alone has never been proof of identity, and it is weaker evidence now than ever. Verification comes from the channel you use, not from how the caller sounds.
Call back on a known number
On any unusual request, hang up and call back yourself on a number you already have saved, not the number that called you.
Set up a code word
Agree on a code word with family and key colleagues. On critical calls: 'Say the word.'
How a typical deepfake call unfolds
- Research: The attacker finds your role on LinkedIn and identifies your CEO/manager.
- Sample: 3-10 seconds of voice audio from the target, e.g. from a podcast clip.
- Clone: A cloud service builds a voice model in minutes, often under USD 30.
- Call: The attacker calls you, posing as the CEO with the cloned voice. Background noise (car, airport, meeting) masks small imperfections.
- Request: An urgent wire, a password reset, a data handoff, always under time pressure.
A real case, reported in early 2024: a finance employee in Hong Kong transferred about USD 25 million after a video conference with the "CFO" and several "colleagues". Every other participant was a deepfake - voices, faces, micro-reactions included. No second-channel verification was attempted before the money moved.
Tells that even good AI clones still give away
Even as voices improve, signals remain:
- Breathing errors: AI voices often breathe in unnatural spots, or not at all.
- Suspiciously clean audio: Real calls usually carry some ambient noise. A "studio-clean" call is a flag, though background noise on its own proves nothing either.
- Delayed reaction to interruption: Clones tend to respond half a beat late, or repeat themselves mechanically.
- Inability to improvise: Ask about a shared memory or an insider detail. Clones imitate the voice; they do not know what only the real person knows.
Three moves when a "known" call gets unusual
- Hang up politely: "Let me call you right back." No pretext needed.
- Verify on a second channel: Call the known number, ping Slack/Teams, walk over.
- Act only after verification: Even under pressure. Real bosses understand caution; scammers push.
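For teams that want to write this rule into a helpdesk or payment workflow, the three moves can be sketched roughly as below. This is a minimal illustration, not a finished tool: the `SAVED_NUMBERS` directory, the `Request` fields, and the channel names are all hypothetical and stand in for whatever contact list and approval process you already have.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical directory of numbers saved BEFORE any call came in.
# Caller ID is never used as a source for this table.
SAVED_NUMBERS = {
    "ceo": "+1-555-0100",
    "cfo": "+1-555-0101",
}

# Channels assumed to be independent of the incoming call.
INDEPENDENT_CHANNELS = {"callback", "chat", "in_person"}


@dataclass
class Request:
    claimed_identity: str               # who the caller says they are
    inbound_number: str                 # caller ID on the call (spoofable)
    action: str                         # e.g. "wire transfer"
    verified_via: Optional[str] = None  # second channel that confirmed it


def callback_number(request: Request) -> Optional[str]:
    """Return the saved number for the claimed identity, ignoring caller ID."""
    return SAVED_NUMBERS.get(request.claimed_identity)


def may_act(request: Request) -> bool:
    """Allow the action only after confirmation on an independent channel."""
    return request.verified_via in INDEPENDENT_CHANNELS
```

Note that `inbound_number` is recorded but never consulted: a caller ID that matches the saved number does not count as verification. Only a confirmation you initiated yourself sets `verified_via`.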
Request for discretion
'Don't tell anyone.' Classic manipulation - real instructions tolerate transparency.
Voice perfect, no personal detail
Clones imitate voice, not spontaneous shared memory.
Pressure to act on the call
'Do it now, I'll stay on the line.' Legitimate contacts accept callbacks.
Caller ID matches - voice slightly 'off'
Caller ID is easy to spoof. A matching number plus a familiar voice still isn't proof.
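The four warning signs above can also be kept as a simple checklist, for example in a call-handling script or a training quiz. The sketch below is only an illustration: the `RedFlag` names and the rule that a single flag is enough to stop are assumptions made for this example, not an established standard.

```python
from __future__ import annotations

from enum import Enum


class RedFlag(Enum):
    DISCRETION = "asks you to tell no one"
    NO_PERSONAL_DETAIL = "voice sounds right, but no shared detail"
    PRESSURE = "insists you act during the call"
    VOICE_OFF = "caller ID matches, but the voice is slightly off"


def assess(observed: set[RedFlag]) -> str:
    """Suggest a next step; one observed flag is treated as enough to stop."""
    if observed:
        return "hang up and verify on a second channel"
    return "no red flag noted; unusual requests still need verification"
```

An empty checklist does not clear the call. It only means none of these four signs was noticed.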
The simple rule
A voice on the phone is a hint of identity - not proof. Proof requires a second, independent channel.
This rule held before AI. It is essential now.