Deep learning turns mono recordings into immersive sound
- by 7wData
Listen to a bird singing in a nearby tree, and you can relatively quickly identify its approximate location without looking. Listen to the roar of a car engine as you cross the road, and you can usually tell immediately whether it is behind you.
The human ability to locate a sound in three-dimensional space is extraordinary. The phenomenon is well understood—it is the result of the asymmetric shape of our ears and the distance between them.
But while researchers have learned how to create 3D images that easily fool our visual systems, nobody has found a satisfactory way to create synthetic 3D sounds that convincingly fool our aural systems.
Today, that looks set to change, at least in part, thanks to the work of Ruohan Gao at the University of Texas at Austin and Kristen Grauman at Facebook AI Research. They have used a trick that humans also exploit to teach an AI system to convert ordinary mono sound into pretty good 3D sound. The researchers call the result 2.5D sound.
First some background. The brain uses a variety of clues to work out where a sound is coming from in 3D space. One important clue is the difference between a sound’s arrival times at each ear—the interaural time difference.
A sound produced on your left will obviously arrive at your left ear before the right. And although you are not conscious of this difference, the brain uses it to determine where the sound has come from.
Another clue is the difference in volume. This same sound will be louder in the left ear than in the right, and the brain uses this information as well to make its reckoning. This is called the interaural level difference.
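Both cues can be measured directly from a two-channel recording. The sketch below (an illustration, not part of the researchers' system) estimates the interaural time difference from the peak of the cross-correlation between the channels, and the level difference from the ratio of their RMS energies; the noise burst and the 20-sample, 6 dB offsets are made-up test values.

```python
import numpy as np

def estimate_itd_ild(left, right, sample_rate):
    """Estimate the interaural time difference (seconds) and the
    interaural level difference (dB) from a two-channel signal.
    Positive ITD means the sound reached the left ear first."""
    # ITD: lag of the cross-correlation peak between the two channels.
    corr = np.correlate(right, left, mode="full")
    lag = np.argmax(corr) - (len(left) - 1)
    itd = lag / sample_rate
    # ILD: ratio of the channels' RMS energies, in decibels.
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    ild = 20 * np.log10(rms(left) / rms(right))
    return itd, ild

# Synthetic check: a noise burst that arrives 20 samples earlier
# and 6 dB louder at the left ear, as if the source were on the left.
rng = np.random.default_rng(0)
mono = rng.standard_normal(4096)
left = np.concatenate([mono, np.zeros(20)])
right = np.concatenate([np.zeros(20), mono]) * 10 ** (-6 / 20)
itd, ild = estimate_itd_ild(left, right, sample_rate=44100)
```

At 44.1 kHz, the recovered 20-sample lag corresponds to roughly 0.45 ms, comfortably inside the sub-millisecond range of real interaural time differences.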
These differences depend on the distance between the ears. Ordinary stereo recordings do not reproduce them faithfully, because the separation between stereo microphones does not match the separation between human ears.
The way sound interacts with the ear flaps is also important. The flaps distort the sound in ways that depend on the direction it arrives from. For example, a sound from the front reaches the ear canal before hitting the ear flap. By contrast, the same sound coming from behind the head is distorted by the ear flap before it reaches the ear canal.
The brain can sense these differences too. In fact, the asymmetric shape of the ear is the reason we can tell when a sound is coming from above, for example, or from many other directions.
The trick to reproducing 3D sound artificially is to reproduce the effect that all this geometry has on sound. And that’s a tough problem.
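The two timing and loudness cues alone already go a long way. Below is a crude panner (a sketch for illustration, not the researchers' method) that imposes an ITD and an ILD on a mono signal. The head radius, the Woodworth spherical-head approximation of the ITD, and the 6 dB level difference at 90 degrees are all assumptions, and pinna filtering is ignored entirely, so the result conveys left/right but not front/back or elevation.

```python
import numpy as np

HEAD_RADIUS = 0.0875    # metres, a typical adult head (assumption)
SPEED_OF_SOUND = 343.0  # m/s in air at room temperature

def pan_mono(mono, azimuth_deg, sample_rate, ild_db_at_90=6.0):
    """Place a mono source at the given azimuth (0 = straight ahead,
    +90 = hard left) by delaying and attenuating the far-ear channel."""
    theta = np.radians(azimuth_deg)
    # Woodworth spherical-head approximation of the ITD.
    itd = HEAD_RADIUS / SPEED_OF_SOUND * (np.sin(abs(theta)) + abs(theta))
    delay = int(round(itd * sample_rate))
    # Level difference grows as the source moves off-centre.
    gain_far = 10 ** (-ild_db_at_90 * abs(np.sin(theta)) / 20)
    near = np.pad(mono, (0, delay))
    far = np.pad(mono, (delay, 0)) * gain_far
    if azimuth_deg >= 0:  # source on the left: left ear is nearer
        return np.stack([near, far], axis=1)
    return np.stack([far, near], axis=1)

# Demo: an impulse placed hard left at 44.1 kHz.
impulse = np.zeros(64)
impulse[0] = 1.0
stereo = pan_mono(impulse, azimuth_deg=90.0, sample_rate=44100)
```

For the hard-left impulse, the left channel fires immediately while the right channel gets a copy delayed by about 0.66 ms and 6 dB quieter, which is exactly the cue pair described above.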
One way to measure the distortion is with binaural recording. This is a recording made by placing a microphone inside each ear, which can pick up these tiny variations.
By analyzing the variations, researchers can then reproduce them using a mathematical function known as a head-related transfer function. That turns an ordinary pair of headphones into an extraordinary 3D sound machine.
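In code, applying a head-related transfer function comes down to convolving the mono signal with a pair of head-related impulse responses (the time-domain form of the HRTF), one per ear. The sketch below assumes `hrir_left` and `hrir_right` hold measured responses for the desired source direction; the single-spike responses at the end are toy stand-ins for illustration, not real measurements.

```python
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Render a mono signal binaurally by convolving it with a pair
    of head-related impulse responses, one for each ear."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    # The two responses may have different lengths; pad to match.
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right], axis=1)

# Toy stand-in HRIRs: the right ear hears a delayed, quieter copy,
# as it would for a source off to the left.
impulse = np.zeros(100)
impulse[0] = 1.0
hrir_l = np.array([1.0])
hrir_r = np.concatenate([np.zeros(4), [0.5]])
rendered = binauralize(impulse, hrir_l, hrir_r)
```

Real HRIRs are a few hundred samples long and encode the pinna's direction-dependent filtering as well as the timing and level cues, which is what lets headphone playback place sounds above, below, or behind the listener.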
But because everybody’s ears are shaped differently, everybody hears sound in a slightly different way, and a transfer function derived from one person’s ears never sounds quite right to another.