
Vocal Mimicry Using Lyrebird Technology

Lyrebird has created a voice imitation technology that uses deep neural networks to produce fascinating and somewhat scary results. It relies on deep learning models developed at the MILA lab at the University of Montréal.

Lyrebird will offer an API to copy the voice of anyone. It will need as little as one minute of recorded audio from a speaker to compute a unique key defining his or her voice. This key can then be used to generate any speech in the corresponding voice. The API will be robust enough to learn from noisy recordings. Lyrebird will offer a large catalog of different voices and let users design their own unique voices tailored to their needs.
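Lyrebird hasn't published how its "voice key" works, but the general idea of boiling a recording down to a small fixed-size vector can be sketched in a few lines. Everything below is a toy illustration with made-up features (frame energy and zero-crossing rate); a real system would use a learned speaker embedding.

```python
import math

def voice_key(samples, n_frames=4):
    """Summarize a waveform (list of floats) as a fixed-size feature vector.

    Each frame contributes two toy features: mean energy and zero-crossing
    rate. This is a stand-in for a learned speaker embedding, not Lyrebird's
    actual method.
    """
    frame_len = len(samples) // n_frames
    key = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / len(frame)
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        ) / len(frame)
        key.extend([energy, crossings])
    return key

# Two synthetic "speakers": a low-pitched and a high-pitched oscillation.
slow = [math.sin(t * 0.05) for t in range(8000)]
fast = [math.sin(t * 0.50) for t in range(8000)]

key_slow = voice_key(slow)
key_fast = voice_key(fast)
assert len(key_slow) == len(key_fast) == 8   # keys are always the same size
assert key_slow != key_fast                  # different voices, different keys
```

The point of a fixed-size key is that everything downstream (synthesis, storage, comparison) can treat every voice identically, no matter how much audio was used to compute it.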

Users will be able to create entire dialogs with the new or mimicked voice. Inflection, emotion, and content can all be tailored as necessary through a developer API. The demos are fairly impressive but still distinctly robotic. Check out some Trump/Obama examples below.

[soundcloud url="https://api.soundcloud.com/playlists/318022504" params="auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=true&visual=true" width="100%" height="300" iframe="true" /]

I’m interested to find out how accurate this will be for non-English and non-human verbal communication.

Zoom In. Now… Enhance! (For Real, Kinda)

The Zoom And Enhance trope has long been the ultimate criminal identification solution and a staple of crime drama television. Its use on screen is often cited as an example of how Hollywood doesn’t understand technology. The Enhance Button trope simply ignores that the blurry focus and big blocky pixels you get when you zoom in close on an image are the only information the picture actually contains; extracting more detail from the image alone is essentially impossible.
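You can see why plain zooming can't help in a few lines of toy code: enlarging an image by nearest-neighbor just repeats the pixels you already have.

```python
# Minimal sketch: "zooming" by nearest-neighbor only duplicates existing
# pixels, so the enlarged image contains no information the original didn't.

def zoom_2x(image):
    """Upscale a 2-D list of pixel values by duplicating each pixel 2x2."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

blurry = [[10, 20],
          [30, 40]]
zoomed = zoom_2x(blurry)

assert len(zoomed) == 4 and len(zoomed[0]) == 4
# The set of distinct pixel values is unchanged: no detail was added.
assert {p for row in zoomed for p in row} == {10, 20, 30, 40}
```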

Enhance Old Station

Enhance Bank Lobby

However, as a proof of concept, Alex J. Champandard’s Neural Enhance coding project uses deep learning to enhance the details of images. As seen in the GIFs above, when the neural networks are well trained, the enhancements are quite effective.

Thanks to deep learning and #NeuralEnhance, it’s now possible to train a neural network to zoom into your images at 2x or even 4x. You’ll get even better results by increasing the number of neurons or training with a dataset similar to your low-resolution image. The catch? The neural network is hallucinating details based on its training from example images. It’s not reconstructing your photo exactly as it would have been if it were HD. That’s only possible in Hollywood, but using deep learning as “Creative AI” works and it’s just as cool!
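The reason the network has to hallucinate is that downsampling is many-to-one: different high-resolution patches can collapse to the exact same low-resolution pixels, so no algorithm can tell which original it's looking at. A tiny sketch (2x average-pooling, standing in for whatever downsampling the camera or training pipeline did):

```python
# Sketch of why "enhance" must hallucinate: downsampling is many-to-one, so
# two different high-res patches can produce the identical low-res image.
# A super-resolution network can only guess which original it came from.

def downsample_2x(image):
    """Average each 2x2 block of a 2-D list into one pixel."""
    out = []
    for r in range(0, len(image), 2):
        out.append([
            (image[r][c] + image[r][c + 1] +
             image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, len(image[0]), 2)
        ])
    return out

patch_a = [[0, 100],
           [100, 0]]   # diagonal texture
patch_b = [[50, 50],
           [50, 50]]   # flat gray

# Both collapse to the same single low-res pixel: the detail is gone.
assert downsample_2x(patch_a) == downsample_2x(patch_b) == [[50.0]]
```

Training on example images is what breaks the tie: the network learns which of the many possible originals is most *plausible*, which is exactly what "hallucinating details" means.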

Now let’s vector in and enlarge the z-axis.

via prosthetic knowledge

CPU vs. GPU

Pete Warden has an informative write-up about the computing differences between the CPU and the GPU for the layman. It’s a simplified yet illuminating description. Go learn something!

Graphics Processing Units were created to draw images, text, and geometry onto the screen. This means they’re designed very differently than the CPUs that run applications. CPUs need to be good at following very complex recipes of instructions so they can deal with all sorts of user inputs and switch between tasks rapidly. GPUs are much more specialized. They only need to do a limited range of things, but each job they’re given can involve touching millions of memory locations in one go.
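The contrast can be sketched as two programming styles. This toy Python only mimics the models, not the performance: a real GPU would run the "kernel" on thousands of elements in parallel hardware, while the CPU loop's branching is what general-purpose cores are built to handle well.

```python
# Toy model of the contrast: a CPU-style loop with branching control flow vs.
# a GPU-style "kernel" that applies one fixed operation uniformly to every
# element (every memory location "in one go").

def cpu_style(tasks):
    """Complex recipe: different work depending on each task's type."""
    results = []
    for kind, value in tasks:
        if kind == "square":
            results.append(value * value)
        elif kind == "negate":
            results.append(-value)
        else:
            results.append(value)
    return results

def gpu_style(pixels, kernel):
    """One fixed operation applied identically to every element."""
    return list(map(kernel, pixels))

mixed = [("square", 3), ("negate", 5), ("other", 7)]
assert cpu_style(mixed) == [9, -5, 7]

frame = list(range(8))
assert gpu_style(frame, lambda p: p * 2) == [0, 2, 4, 6, 8, 10, 12, 14]
```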
