Why the future of AI is flexible, reusable foundation models

When learning a different language, the easiest way to get started is with fill-in-the-blank exercises. “It’s raining cats and …”

By making mistakes and correcting them, your brain (which linguists agree is hardwired for language learning) starts discovering patterns in grammar, vocabulary, and word order. Those patterns apply not only to filling in blanks, but also to conveying meaning to other humans (or computers, dogs, etc.).

That last bit is important when talking about so-called ‘foundation models,’ one of the hottest (but underreported) topics in artificial intelligence right now.

According to “On the Opportunities and Risks of Foundation Models,” the 2021 review paper that popularized the term, foundation models are models “trained on broad data (generally using self-supervision at scale) that can be adapted to a wide range of downstream tasks.”

In non-academic language: much like with fill-in-the-blank exercises, foundation models learn things in a way that can later be applied to other tasks, making them more flexible than conventional AI models.

The way foundation models are trained solves one of the biggest bottlenecks in AI: labeling data.

When a website asks you to select “all the pictures containing a boat” (to prove you’re not a robot), you’re essentially labeling data. Those labels can then be used to feed images of boats to an algorithm so it can, at some point, reliably recognize boats on its own. This is traditionally how AI models are trained: on data labeled by humans, a time-consuming process that requires a lot of manual work.

Foundation models don’t need this type of labeling. Instead of relying on human annotation, they use the fill-in-the-blank method: hide part of the input, predict the hidden part, and learn from the self-generated feedback to continuously improve performance, without the need for human supervision.
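To make that concrete, here is a minimal sketch of fill-in-the-blank training in Python, assuming PyTorch is installed. The corpus, vocabulary, and model below are toy inventions for illustration; real foundation models use transformer architectures trained on billions of examples, but the training signal is the same: the labels come from the text itself.

```python
# Toy fill-in-the-blank (masked word) training loop.
# Assumes PyTorch is installed; corpus and model sizes are illustrative.
import torch
import torch.nn as nn

corpus = [
    "it is raining cats and dogs",
    "the cat sat on the mat",
    "dogs and cats make good pets",
]

# Build a vocabulary; index 0 is reserved for the [MASK] token.
words = sorted({w for line in corpus for w in line.split()})
vocab = {"[MASK]": 0, **{w: i + 1 for i, w in enumerate(words)}}

def make_examples(line, window=2):
    """Turn a sentence into (masked context, hidden word) pairs.
    The labels come from the text itself -- no human annotation."""
    tokens = line.split()
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + ["[MASK]"] + tokens[i + 1:i + 1 + window]
        ids = torch.tensor([vocab[w] for w in context])
        pairs.append((ids, vocab[target]))
    return pairs

data = [pair for line in corpus for pair in make_examples(line)]

class FillInTheBlank(nn.Module):
    """Average the context embeddings, then score every vocabulary word."""
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, context_ids):
        return self.out(self.embed(context_ids).mean(dim=0))

model = FillInTheBlank(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(50):
    for context_ids, target in data:
        loss = loss_fn(model(context_ids).unsqueeze(0), torch.tensor([target]))
        opt.zero_grad()
        loss.backward()
        opt.step()

# Fill in a blank: "... cats and [MASK]" should recover "dogs".
query = torch.tensor([vocab[w] for w in ["cats", "and", "[MASK]"]])
predicted = model(query).argmax().item()
print([w for w, i in vocab.items() if i == predicted])
```

Note that averaging the context embeddings throws away word order; real models use architectures that preserve it. The point of the sketch is the supervision: no human labeled anything, yet the model learns to fill the blank.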

This makes foundation models more accessible for industries that don’t already have a wide range of data available. In fact, according to Dakshi Agrawal, IBM Fellow and CTO at IBM AI, depending on the domain you’re training a foundation model in, a few gigabytes of data can suffice.

These complex models might sound far removed from a user like you, but you’ve almost certainly seen a foundation model at work at some point online. Some of the more famous ones are the GPT-3 language model, which, after being fed works by famous writers, can produce remarkable imitations, and DALL-E, which produces stunning images based on users’ prompts.

Beyond creating new entertainment, the flexibility that foundation models bring could help accelerate groundbreaking medical research, scientific advances, engineering, architecture, and even programming.

Foundation models are characterized by two very interesting properties: emergence and homogenization.

Emergence refers to new, unexpected capabilities that a model displays which weren’t present in previous generations; it typically appears as model size grows. A language model doing basic arithmetic reasoning, despite never being explicitly trained to do so, is an example of an emergent (and somewhat unexpected) property.

Homogenization is a complicated term for a simple idea: one and the same model can be adapted to many different tasks. A model trained to understand and use the English language can summarize a piece of text, output a poem in the style of a famous writer, or interpret a command given by a human (the GPT-3 language model is a good example of this).
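The sketch below shows what that looks like in practice, assuming the Hugging Face transformers library is installed. The prompts are invented for illustration, and “gpt2” is used as a small, freely downloadable stand-in for GPT-3-class models, so expect much rougher output than the examples above.

```python
# One pretrained model, several tasks, steered only by the prompt.
# Assumes Hugging Face `transformers` (and a backend such as PyTorch)
# is installed; "gpt2" is a small stand-in for GPT-3-class models.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Invented prompts: the task lives entirely in the prompt text.
prompts = {
    "summarize": "The meeting covered budgets, hiring, and the Q3 roadmap. TL;DR:",
    "poem": "A short poem about the sea, in the style of a famous poet:\n",
    "command": "The user said: 'turn off the lights'. The assistant should:\n",
}

for task, prompt in prompts.items():
    result = generator(prompt, max_new_tokens=30, num_return_sequences=1)
    print(f"--- {task} ---")
    print(result[0]["generated_text"])
```

The key point is that no task-specific training happens between the three calls; the same weights are repurposed by nothing more than the wording of the prompt.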

But foundation models are not limited to human language. In essence, what we’re teaching a computer to do is find patterns in processes or phenomena, patterns it can then reproduce under given conditions.

Let’s unpack that with an example: molecules. Physics and chemistry dictate that molecules can exist only in certain configurations. The next step is to define a use for those molecules, such as medicines. A foundation model can then be trained, using reams of medical data, to understand how different molecules (i.e. drugs) interact with the human body when treating diseases.
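As a hedged illustration of how the fill-in-the-blank trick carries over, the sketch below masks characters in SMILES strings, the standard text encoding for molecules. The masking rate and helper names are invented for this example; a real chemistry foundation model would feed pairs like these to a large transformer rather than print them.

```python
# The same masking idea applied to molecules encoded as SMILES text.
# The SMILES strings are real encodings of well-known drugs; everything
# else (masking rate, "?" mask symbol) is invented for illustration.
import random

smiles = {
    "aspirin": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "caffeine": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
    "ibuprofen": "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
}

def mask_molecule(s, mask_rate=0.15, seed=0):
    """Hide ~15% of the characters; the hidden characters become the labels."""
    rng = random.Random(seed)
    masked, labels = [], {}
    for i, ch in enumerate(s):
        if rng.random() < mask_rate:
            masked.append("?")
            labels[i] = ch  # the model must learn to recover this
        else:
            masked.append(ch)
    return "".join(masked), labels

for name, s in smiles.items():
    masked_input, labels = mask_molecule(s)
    print(f"{name:10} input:  {masked_input}")
    print(f"{'':10} labels: {labels}")
```

Because valid molecules obey the rules of physics and chemistry, a model trained this way picks up those rules much as a language model picks up grammar, without anyone labeling a single molecule.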
