How Voice Cloning Is the Next Big AI Threat
The rapid development of artificial intelligence (AI) has brought both benefits and risks.
One concerning trend is the misuse of voice cloning. In seconds, scammers can clone a voice and trick people into thinking a friend or a family member urgently needs money.
News outlets, including CNN, warn these types of scams have the potential to impact millions of people.
As technology makes it easier for criminals to invade our personal spaces, staying cautious about its use is more important than ever.
What is voice cloning?
The rise of AI has created new possibilities for image, text and voice generation, alongside advances in machine learning.
While AI offers many benefits, it also provides fraudsters new methods to exploit individuals for money.
You may have heard of “deepfakes,” where AI is used to create fake images, videos and even audio, often involving celebrities or politicians.
Voice cloning, a type of deepfake technology, creates a digital replica of a person’s voice by capturing their speech patterns, accent and breathing from brief audio samples.
Once the speech pattern is captured, an AI voice generator can convert text input into highly realistic speech resembling the targeted person’s voice.
With advancing technology, voice cloning can be accomplished with just a three-second audio sample.
While a simple phrase like “hello, is anyone there?” can lead to a voice cloning scam, a longer conversation helps scammers capture more vocal details. It is therefore best to keep calls brief until you are sure of the caller’s identity.
Voice cloning has valuable applications in entertainment and health care – enabling remote voice work for artists (even posthumously) and assisting people with speech disabilities.
However, it raises serious privacy and security concerns, underscoring the need for safeguards.
How it’s being exploited by criminals
Cybercriminals exploit voice cloning technology to impersonate celebrities, authorities or ordinary people for fraud.
They create urgency, gain the victim’s trust and request money via gift cards, wire transfers or cryptocurrency.
The process begins by collecting audio samples from sources like YouTube and TikTok.
Next, the technology analyses the audio to generate new recordings.
Once the voice is cloned, it can be used in deceptive communications, often accompanied by spoofing Caller ID to appear trustworthy.
Many voice cloning scam cases have made headlines.
For example, criminals cloned the voice of a company director in the United Arab Emirates to orchestrate an A$51 million heist.
A businessman in Mumbai fell victim to a voice cloning scam involving a fake call from the Indian Embassy in Dubai.
In Australia recently, scammers used a voice clone of Queensland Premier Steven Miles in an attempt to trick people into investing in Bitcoin.
Teenagers and children are also targeted. In a kidnapping scam in the United States, a teenager’s voice was cloned and her parents were manipulated into complying with the scammers’ demands.
How widespread is it?
Recent research shows 28% of adults in the United Kingdom faced voice cloning scams last year, with 46% unaware of the existence of this type of scam.
This highlights a significant knowledge gap, leaving millions at risk of fraud.
In 2022, almost 240,000 Australians reported being victims of voice cloning scams, leading to a financial loss of A$568 million.
How people and organisations can safeguard against it
The risks posed by voice cloning require a multidisciplinary response.
People and organisations can implement several measures to safeguard against the misuse of voice cloning technology.
First, public awareness campaigns and education can help protect people and organisations and mitigate these types of fraud.
Public-private collaboration can provide clear information and consent options for voice cloning.
Second, people and organisations should look to use biometric security with liveness detection, which is new technology that can recognise and verify a live voice as opposed to a fake. And organisations using voice recognition should consider adopting multi-factor authentication.
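One low-tech complement to these measures, suggested by banks and echoed in the comments below, is a pre-agreed "safe phrase" used as a second factor alongside the voice itself. The sketch below is a loose, hypothetical illustration in Python (the function names and salt are invented for this example, not any real product's API): it stores only a salted hash of the phrase and compares candidates in constant time with `hmac.compare_digest`, so the check does not leak information through timing.

```python
import hashlib
import hmac

# Hypothetical sketch of safe-phrase verification as a second factor.
# Store only a salted hash of the agreed phrase, never the phrase itself.

def hash_phrase(phrase: str, salt: bytes) -> bytes:
    # Normalise case and whitespace so "Blue  Kangaroo " matches "blue kangaroo".
    normalised = " ".join(phrase.lower().split())
    return hashlib.pbkdf2_hmac("sha256", normalised.encode(), salt, 100_000)

def verify_phrase(candidate: str, stored_hash: bytes, salt: bytes) -> bool:
    # compare_digest runs in constant time, avoiding timing side channels.
    return hmac.compare_digest(hash_phrase(candidate, salt), stored_hash)

salt = b"example-salt"  # illustration only; use a random per-user salt in practice
stored = hash_phrase("blue kangaroo", salt)

print(verify_phrase("Blue  Kangaroo", stored, salt))  # True
print(verify_phrase("red wallaby", stored, salt))     # False
```

The point of the design is that even a perfect voice clone fails the check unless the caller also knows the shared secret, which never travels through any audio sample a scammer could harvest.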
Third, enhancing investigative capability against voice cloning is another crucial measure for law enforcement.
Finally, accurate and updated regulations for countries are needed for managing associated risks.
Australian law enforcement recognises the potential benefits of AI.
Yet, concerns about the “dark side” of this technology have prompted calls for research into the criminal use of “artificial intelligence for victim targeting.”
There are also calls for possible intervention strategies that law enforcement could use to combat this problem.
Such efforts should connect with the overall National Plan to Combat Cybercrime, which focuses on proactive, reactive and restorative strategies.
That national plan stipulates a duty of care for service providers, reflected in the Australian government’s new legislation to safeguard the public and small businesses.
The legislation would impose new obligations on regulated organisations such as telcos, banks and digital platform providers to prevent, detect, report and disrupt scams. The goal is to protect customers from cyber scams involving deception.
Reducing the risk
As cybercrime costs the Australian economy an estimated A$42 billion, public awareness and strong safeguards are essential.
Countries like Australia are recognising the growing risk. The effectiveness of measures against voice cloning and other frauds depends on their adaptability, cost, feasibility and regulatory compliance.
All stakeholders — government, citizens, and law enforcement — must stay vigilant and raise public awareness to reduce the risk of victimisation.
See more here: Science Alert
PRINCIPIA SCIENTIFIC INTERNATIONAL, legally registered in the UK as a company incorporated for charitable purposes. Head Office: 27 Old Gloucester Street, London WC1N 3AX.
Howdy
There are plenty of these things going online now for your delectation.
3 seconds? I seriously doubt that will be anything useful for a full-on conversation. Everywhere I’ve looked, even ‘fun’ situations require 30 seconds to minutes.
As with phone scams, if you’re worried about security or realism, steer the conversation to a subject the caller should know. Otherwise, just don’t give out personal information over the phone. Ask for the request on an official letterhead if applicable.
Don’t forget, the voice samples they use for training need to be recorded to a sufficient standard, so how they got your voice in the first place is a more pertinent question, if you ask me.
“If ‘mum’ calls asking for money – would you really question if it’s her? It sounds like her but something’s out of character. This is where a Safe Phrase would come in handy, as this could be an AI voice cloning scam.”
“A Safe Phrase is a previously agreed phrase that you and your inner circle can use to verify if you’re truly speaking to them.”
https://www.starlingbank.com/safe-phrases/
It’s a bit paranoid, don’t you think? But familiarity breeds insecurity. Insist on the written word at least. How did the scammer get your number? Divert the caller to another member of the family, etc. There’s always text messaging too.
In the end, how much do you trust technology over actual presence of a person…
Tom
What happens is that AI-cloned voices are able to break through security layers by pretending to be a real person. Some people have had their phone accounts stolen by this swindle. AI is going to become extremely destructive in the wrong hands.