Lyrebird has built a voice imitation technology that uses artificial neural networks to produce fascinating and somewhat scary results. It relies on deep learning models developed at the MILA lab of the University of Montréal.
Lyrebird will offer an API to copy the voice of anyone. It will need as little as one minute of audio from a speaker to compute a unique key that defines their voice. This key can then be used to generate any speech in that voice. The API will be robust enough to learn from noisy recordings. Lyrebird will also offer a large catalog of different voices and let users design their own unique voices tailored to their needs.
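To make the "record once, reuse the key" workflow concrete, here is a minimal offline sketch in Python. Everything in it is an assumption for illustration: `VoiceKey`, `compute_voice_key`, and `synthesize` are hypothetical names, not Lyrebird's actual API, and the hash stands in for whatever a neural network would really compute.

```python
import hashlib
from dataclasses import dataclass

MIN_SECONDS = 60  # the post says about one minute of audio is enough


@dataclass(frozen=True)
class VoiceKey:
    """Hypothetical stand-in for the per-speaker key described above."""
    speaker: str
    fingerprint: str


def compute_voice_key(speaker: str, samples: bytes, sample_rate: int,
                      bytes_per_sample: int = 2) -> VoiceKey:
    """Pretend enrollment: check the recording length, then derive a key.

    A real system would run a neural network here; hashing the audio is
    only a placeholder so the sketch runs offline.
    """
    duration = len(samples) / (sample_rate * bytes_per_sample)
    if duration < MIN_SECONDS:
        raise ValueError(
            f"need at least {MIN_SECONDS}s of audio, got {duration:.1f}s")
    return VoiceKey(speaker, hashlib.sha256(samples).hexdigest()[:16])


def synthesize(key: VoiceKey, text: str) -> dict:
    """Pretend synthesis: return a request description instead of audio."""
    return {"voice": key.fingerprint, "speaker": key.speaker, "text": text}


# One minute of silent 16-bit mono audio at 16 kHz, used as dummy input.
recording = bytes(60 * 16000 * 2)
key = compute_voice_key("demo speaker", recording, sample_rate=16000)
request = synthesize(key, "Hello from a cloned voice.")
print(request["text"])
```

The point of the shape is that the expensive step happens once per speaker, and the resulting key can then be reused for any number of sentences.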
Users will be able to create entire dialogs with the new or mimicked voice. Inflection, emotion, and content can all be tailored as necessary through a developer API. The demos are fairly impressive but still distinctly robotic. Check out some Trump/Obama examples below.
[soundcloud url="https://api.soundcloud.com/playlists/318022504" params="auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=true&visual=true" width="100%" height="300" iframe="true" /]
I’m interested to find out how accurate this will be for non-English speech and for non-human vocal communication.