What’s hot in AI: Deep reinforcement learning

Deep reinforcement learning (DRL) is an exciting area of AI research, with potential applicability to a variety of problem areas. Some see DRL as a path to artificial general intelligence, or AGI, because of how it mirrors human learning by exploring and receiving feedback from environments. Recent successes of DRL agents besting human video game players, the well-publicized defeat of a Go grandmaster at the hands of DeepMind’s AlphaGo, and demonstrations of bipedal agents learning to walk in simulation have all contributed to the general sense of enthusiasm about the field.

Unlike supervised machine learning, which trains models on known-correct answers, reinforcement learning trains a model by having an agent interact with an environment. When the agent’s actions produce desired results, it gets positive feedback. For example, the agent gets a reward for scoring a point or winning a game. Put simply, researchers reinforce the agent’s good behaviors.
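To make the agent-environment loop concrete, here is a minimal sketch in Python. The corridor environment, its reset/step interface, and the two policies are hypothetical stand-ins invented for illustration, not code from any particular DRL toolkit. The agent earns a reward only when it reaches the far end of the corridor.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach a goal."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Put the agent back at the left end and return the starting state."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = move left, 1 = move right)."""
        move = 1 if action == 1 else -1
        # Walls at both ends keep the agent inside the corridor.
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # positive feedback only at the goal
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total reward the agent collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # the agent acts
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward                  # feedback accumulates
        if done:
            break
    return total_reward


def random_policy(state):
    """An untrained agent: ignores the state and picks an action at random."""
    return random.choice([0, 1])


def always_right(state):
    """The behavior training should eventually reinforce in this toy world."""
    return 1
```

Calling `run_episode(CorridorEnv(), always_right)` walks straight to the goal and collects the reward, while `random_policy` only gets there by chance. Training amounts to shifting the policy from the second kind of behavior toward the first.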

One of the key challenges in applying DRL to non-trivial problems is constructing a reward function that encourages desired behaviors without undesirable side effects. When you get this wrong, all kinds of bad things can happen, including cheating behaviors. (Think of rewarding a robot maid on some visual measure of room cleanliness, only to teach the bot to sweep dirt under the furniture.)
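The robot maid problem can be sketched in a few lines of Python. Everything here is a made-up toy: the room dictionary, the three actions, and the effort costs are assumptions chosen only to show how a proxy reward and the intended reward can disagree about the best action.

```python
# Hypothetical effort cost of each action; hiding dirt is cheaper than removing it.
EFFORT = {"vacuum": 2.0, "sweep_under_sofa": 0.5, "do_nothing": 0.0}


def visible_cleanliness_reward(room):
    """Proxy reward: penalizes only the dirt a camera can see."""
    return -room["visible_dirt"]


def true_cleanliness_reward(room):
    """Intended reward: penalizes all dirt, visible or hidden."""
    return -(room["visible_dirt"] + room["hidden_dirt"])


def apply_action(room, action):
    """Return the room state after one action (toy dynamics)."""
    new_room = dict(room)
    if action == "vacuum":
        new_room["visible_dirt"] = 0  # dirt is actually removed
    elif action == "sweep_under_sofa":
        new_room["hidden_dirt"] += new_room["visible_dirt"]  # dirt is only moved
        new_room["visible_dirt"] = 0
    return new_room


def best_action(room, reward_fn):
    """Pick the action with the highest reward minus effort."""
    return max(EFFORT, key=lambda a: reward_fn(apply_action(room, a)) - EFFORT[a])


room = {"visible_dirt": 5, "hidden_dirt": 0}
print(best_action(room, visible_cleanliness_reward))
print(best_action(room, true_cleanliness_reward))
```

Under the visual proxy, sweeping dirt under the sofa scores as well as vacuuming and costs less effort, so it wins. Under the intended reward, vacuuming wins. A learning agent optimizes whichever function it is actually given.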

It might be worth noting here that while deep reinforcement learning (“deep” referring to the fact that the underlying model is a deep neural network) is still a relatively new field, reinforcement learning has been around since the 1970s or earlier, depending on how you count. As Andrej Karpathy points out in his 2016 blog post, pivotal DRL research such as the AlphaGo paper and the Atari Deep Q-Learning paper is based on algorithms that have been around for a while, but with deep learning swapped in for other ways to approximate functions. Their use of deep learning is of course enabled by the explosion in inexpensive compute power we’ve seen over the past 20+ years.
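As a rough illustration of what gets swapped, here is classic tabular Q-learning on a toy corridor. The value function is a plain lookup table. Deep Q-learning keeps essentially the same update rule but replaces the table with a neural network that estimates the values. The environment, hyperparameters, and function name below are assumptions made for this sketch and are not taken from the Atari or AlphaGo papers.

```python
import random


def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a corridor; reaching the right end pays a reward of 1."""
    rng = random.Random(seed)
    # The "function approximator" here is just a table: q[state][action].
    q = [[0.0, 0.0] for _ in range(n_states)]  # action 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])       # exploit, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = max(0, min(goal, state + (1 if action == 1 else -1)))
            done = next_state == goal
            reward = 1.0 if done else 0.0
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
for state, (left, right) in enumerate(q_table[:-1]):
    print(state, "prefers", "right" if right > left else "left")
```

A table works here because there are only five states. For Atari frames or Go positions the state space is far too large to enumerate, which is where a deep network comes in as the approximator.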

The promise of DRL, along with Google’s 2014 acquisition of DeepMind for $500 million, has led to a number of startups hoping to capitalize on this technology. I’ve interviewed Bonsai founder Mark Hammond for the This Week in Machine Learning & AI podcast (disclosure: Bonsai is a client of mine). That company offers a development platform for applying deep reinforcement learning to a variety of industrial use cases. I spoke with University of California at Berkeley’s Pieter Abbeel on the topic as well. He’s since founded Embodied Intelligence, a still-stealthy startup looking to apply VR and DRL to robotics.

Osaro, backed by Jerry Yang, Peter Thiel, Sean Parker, and other boldface-named investors, is also looking to apply DRL in the industrial space. Meanwhile, Pit.ai is seeking to best traditional hedge funds by applying it to algorithmic trading, and DeepVu is addressing the challenge of managing complex enterprise supply chains.

As a result of increased interest in DRL, we’ve also seen the creation of new open source toolkits and environments for training DRL agents. Most of these frameworks are essentially special-purpose simulation tools or interfaces thereto. Here are some of the ones I’m tracking.



