Synthetic voices generated by Play.ht let you hear your favorite topics discussed in voices from the past. In an episode posted by Podcast.ai, an AI-generated podcast, listeners can hear a (fake) Steve Jobs being interviewed by a (fake) Joe Rogan.
Based in Dubai, Play.ht created a podcast series called “Podcast.ai” whose first episode features digitally cloned voices of Joe Rogan and Steve Jobs. The interview begins with Rogan introducing Jobs as a guest who is “difficult to describe,” “weird,” and yet has made some “great technological products of our age,” all in a very realistic voice.
The replicated voices are produced by voice cloning technology, in which deep learning models are trained on existing samples of a person’s speech. Rogan is a top candidate for this kind of AI voice training because his podcasts contain a lot of isolated recordings of his voice.
Later in the episode, listeners can hear the synthetic Jobs share his experience at Reed College, the courses he took, and what he learned about Hinduism and Buddhism. He goes on to describe how several odysseys and spiritual foundations originate in the Indian subcontinent.
In a later section, listeners can also hear him revisit the launch of the Macintosh and his criticism of Microsoft. The section closely resembles what the (real) Steve Jobs said in a 1995 interview. Following the critique, Jobs explains how Apple was formed with a vision of improving its products over the “long run” and making each new product better than the last.
After hearing the entire episode, it becomes clear that while the synthetic voices closely resemble the originals, they are not indistinguishable from them. Listeners who compare the two can tell the real voices from the fake ones.
It’s unclear whether Rogan’s or Jobs’s voice may be used in this way, especially to advertise a business product. As speech synthesis becomes more common and possibly undetectable, we may be heading toward a future in which media artifacts from any era are entirely fluid and flexible, ready to be reshaped to fit any narrative.