Saturday, March 3, 2018

"Baidu’s voice cloning AI can swap genders and remove accents"

It looks as if I'm going to have to go through that whole "What is real?" thing again.

From The Next Web:
Chinese AI titan Baidu earlier this month announced its Deep Voice AI had learned some new tricks. Not only can it accurately clone an individual voice faster than ever, but now it knows how to make a British man sound like an American woman.

You can insert your own joke here.

The Baidu Deep Voice research team unveiled its novel AI capable of cloning a human voice with just 30 minutes of training material last year. And since then it’s gotten much better at it: Deep Voice can do the same job with just a few seconds’ worth of audio now.
Here’s some audio of a human:...
The team revealed two separate training methods in a recently published white paper. In one of the models a more believable output is generated, but it takes additional audio input. The second model can generate cloned audio much faster but at lower quality.
Both are nominally faster than Baidu’s previous attempts with Deep Voice and, according to the researchers, could be upgraded even further with tweaked algorithms and broader datasets. The researchers claim, in a company blog post:
In terms of naturalness of the speech and similarity to the original speaker, both demonstrate good performance, even with very few cloning audios.
The purpose of the research is to demonstrate that machines can learn complex tasks with limited datasets, just like people. Imitating voices may be a specific use-case, but it’s important for researchers to find ways to minimize footprints through fine-tuning or replacing unwieldy algorithms....MORE