
Vocal Mimicry Using Lyrebird Technology

Lyrebird has created a voice imitation technology that uses deep learning and artificial neural networks to create fascinating and somewhat scary results. It relies on deep learning models developed at the MILA lab of the University of Montréal.

Lyrebird will offer an API to copy the voice of anyone. It will need as little as one minute of recorded audio from a speaker to compute a unique key defining her/his voice. This key will then let it generate any speech in the corresponding voice. The API will be robust enough to learn from noisy recordings. Lyrebird will offer a large catalog of different voices and let users design their own unique voices tailored to their needs.
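To make the described workflow concrete, here's a minimal Python sketch of the record-once, speak-forever idea. The function names, signatures, and the hash-as-key shortcut are all invented for illustration; Lyrebird's actual API hasn't been published, so nothing below reflects its real interface.

```python
import hashlib

# Hypothetical sketch of the workflow described above; these names and
# signatures are invented, not Lyrebird's real API.

def compute_voice_key(audio_samples: bytes) -> str:
    """Stand-in for the 'unique key defining a voice'. Here it's just a
    digest of the recording; the real service would fit a voice model."""
    return hashlib.sha256(audio_samples).hexdigest()[:16]

def generate_speech(voice_key: str, text: str) -> str:
    """Stand-in for synthesis: returns a label instead of audio bytes."""
    return f"<audio spoken in voice {voice_key}: {text!r}>"

# Roughly one minute of placeholder 16-bit mono samples at 48 kHz.
one_minute_recording = b"\x00\x01" * (48_000 * 60)

key = compute_voice_key(one_minute_recording)
print(generate_speech(key, "Hello from a cloned voice"))
```

The key point the sketch captures is the decoupling: the expensive step (learning the voice) happens once, and every later synthesis call only needs the compact key plus new text.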

Users will be able to create entire dialogs with the new or mimicked voice. Inflection, emotion, and content can all be tailored as necessary through a developer API. The demos are fairly impressive but still distinctly robotic. Check out some Trump/Obama examples below.

[soundcloud url="https://api.soundcloud.com/playlists/318022504" params="auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=true&visual=true" width="100%" height="300" iframe="true" /]

I’m interested to find out how accurate this will be for non-English and non-human verbal communication.

Zoom In. Now… Enhance! (For Real, Kinda)

The Zoom And Enhance trope has long been the ultimate criminal identification solution and a staple of crime drama television. Its use on screen is often cited as an example of how Hollywood doesn’t understand technology. The Enhance Button trope simply ignores that the blurry focus and big blocky pixels you see when you zoom in close on an image are all the information the picture actually contains, and extracting more detail from the image alone is essentially impossible.

Enhance Old Station

Enhance Bank Lobby

However, as a proof of concept, Alex J. Champandard’s Neural Enhance coding project uses deep learning to enhance the details of images. As seen from the gifs above, if the neural networks are well trained, the enhancements are quite effective.

Thanks to deep learning and #NeuralEnhance, it’s now possible to train a neural network to zoom into your images at 2x or even 4x. You’ll get even better results by increasing the number of neurons or training with a dataset similar to your low-resolution image. The catch? The neural network is hallucinating details based on its training from example images. It’s not reconstructing your photo exactly as it would have been if it were HD. That’s only possible in Hollywood — but using deep learning as “Creative AI” works and it’s just as cool!
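A tiny NumPy sketch makes the distinction obvious. A plain 2x zoom just repeats existing pixels, so the image gets bigger but carries no new information; that's the gap a trained super-resolution network fills with plausible, hallucinated detail. (The 2x2 "image" below is a toy example, not anything from the Neural Enhance project.)

```python
import numpy as np

# A toy 2x2 grayscale "image".
low_res = np.array([[10, 200],
                    [60, 120]], dtype=np.uint8)

# Nearest-neighbor 2x zoom: each pixel becomes a 2x2 block of itself.
# Four times the pixels, zero new information -- this is all the Enhance
# Button could honestly do.
zoomed = np.kron(low_res, np.ones((2, 2), dtype=np.uint8))
print(zoomed)

# A super-resolution network instead learns a low-res -> high-res mapping
# from example image pairs, so the extra detail it fills in is a plausible
# guess drawn from its training data, not a reconstruction of reality.
```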

Now let’s vector in and enlarge the z-axis.

via prosthetic knowledge

CPU Vs GPU

Pete Warden has an informative write-up about the computing differences between the CPU and the GPU for the layman. It’s a simplified yet illuminating description. Go learn something!

Graphics Processing Units were created to draw images, text, and geometry onto the screen. This means they’re designed very differently than the CPUs that run applications. CPUs need to be good at following very complex recipes of instructions so they can deal with all sorts of user inputs and switch between tasks rapidly. GPUs are much more specialized. They only need to do a limited range of things, but each job they’re given can involve touching millions of memory locations in one go.
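The contrast Warden describes can be sketched in a few lines of Python, using NumPy's vectorized operations as a stand-in for the GPU's data-parallel style. This is an analogy for the programming models, not a benchmark of real hardware.

```python
import numpy as np

# A million values to reduce.
data = np.arange(1_000_000, dtype=np.float64)

# CPU-style: one flexible instruction stream, one element at a time,
# with branching and bookkeeping on every step.
total_serial = 0.0
for x in data:
    total_serial += float(x)

# GPU-style: one simple operation applied across millions of values
# "in one go" -- the hardware touches many memory locations per step.
total_parallel = float(data.sum())

print(total_serial, total_parallel)
```

Both paths compute the same answer; the difference is that the serial loop spends nearly all its time on per-element overhead, while the bulk operation hands the whole array to optimized, parallel-friendly code at once.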

Some Say The World Will End In Fire, Some Say In Ice

This comic by Stuart McMillen is an adaptation of Neil Postman’s book Amusing Ourselves to Death: Public Discourse in the Age of Show Business. It compares Aldous Huxley’s “Brave New World” with George Orwell’s “1984”. With the recent revelations of NSA surveillance, the jury is still out on which vision is more correct. I think both Huxley and Orwell were right – the iron fist of government and the attention-sapping distractions of technology are both dangers to modern society. The whole thing resonates quite loudly in today’s internet landscape.

Amusing Ourselves To Death

Museum Of Endangered Sounds

I love the idea of saving sounds from extinction. Marybeth Ledesma, Phil Hadad and Greg Elwood (under the guise of Brendan Chilcutt) have created and curated the online Museum Of Endangered Sounds. It’s an audio archive of yesteryear’s gadgets and electronics. Without the museum, the sounds of analog cameras, dot matrix printers, dial-up modems, Speak & Spells, and floppy disks would have died a silent death. But now I have them archived for my own nostalgic musing. We would have failed as a generation if we didn’t try to preserve and then force our past on the youth of today. Long live the Museum Of Endangered Sounds.

(via Alan Cooper)
