Using Google Vision AI’s Reverse Image Search To Richly Catalog Television News

Deep learning has revolutionized the machine understanding of imagery. Yet today’s image recognition models are still limited by the availability of large annotated training datasets upon which to build their libraries of recognized objects and activities. To address this, Google’s Vision AI API expands its native catalog of around 10,000 visually recognized objects and activities with the ability to perform the equivalent of a reverse Google Images search across the open Web and tally up the top topics used to caption the given image everywhere it has previously appeared, lending unprecedentedly rich context and understanding, even yielding unique labels for breaking news events. What might this process yield for a week of television news?

Google’s Vision AI API is a hybrid of two approaches. It combines traditional deep learning-based image labeling, drawn from a library of previously trained models, with the ability to leverage the open Web to annotate an image with the topics that most commonly appear in the captions of visually similar images.

Using its Web Entities feature, the Vision AI API performs what amounts to a reverse Google Images search over the open Web, identifying images across the entire Web that look most similar to the given image. The API identifies roughly similar images, images that precisely match portions of the input image and images that are almost identical to the input image. Most importantly, the Vision AI API then takes the most similar images and identifies the major topics most commonly found in the textual captions of those similar images, returning a histogram of the most common related topics.
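As a rough illustration of the pieces described above, the sketch below builds a request body for the Vision API's REST `images:annotate` method with the `WEB_DETECTION` feature and summarizes one entry of the `responses` array. The helper names (`build_web_detection_request`, `summarize_web_detection`) and the sample response values are made up for illustration; the field names follow the public REST schema, and this is not necessarily how the analysis in this article was run.

```python
import base64


def build_web_detection_request(image_bytes: bytes, max_results: int = 10) -> dict:
    """Build a JSON body for `images:annotate` that asks only for Web Detection."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "WEB_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def summarize_web_detection(response: dict) -> dict:
    """Summarize one element of the `responses` array returned by the API."""
    web = response.get("webDetection", {})
    # Keep only entities that carry a human-readable description.
    entities = [
        (e["description"], e.get("score", 0.0))
        for e in web.get("webEntities", [])
        if e.get("description")
    ]
    entities.sort(key=lambda pair: pair[1], reverse=True)
    return {
        "entities": entities,
        # Near-identical copies, crops/portions, and roughly similar images.
        "full_matches": len(web.get("fullMatchingImages", [])),
        "partial_matches": len(web.get("partialMatchingImages", [])),
        "visually_similar": len(web.get("visuallySimilarImages", [])),
    }


# Hypothetical response fragment; values are invented for illustration.
sample_response = {
    "webDetection": {
        "webEntities": [
            {"entityId": "/m/example1", "score": 0.61, "description": "Journalist"},
            {"entityId": "/m/example2", "score": 0.92, "description": "News"},
            {"entityId": "/m/example3", "score": 0.40},  # no description
        ],
        "fullMatchingImages": [{"url": "https://example.com/a.jpg"}],
        "visuallySimilarImages": [
            {"url": "https://example.com/b.jpg"},
            {"url": "https://example.com/c.jpg"},
        ],
    }
}

summary = summarize_web_detection(sample_response)
request_body = build_web_detection_request(b"not-a-real-image")
```

The three image lists correspond to the almost identical, partially matching and roughly similar images described above, while `webEntities` carries the topics derived from their captions.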

What makes this feature so powerful is that it essentially crowdsources the entire Web to describe a given image. Most importantly, this lets the API adapt in real time to emerging visual narratives, recognizing that an otherwise unremarkable image refers to a specific event simply by looking at how it has been captioned across the Web.

Applied to television news, Web Entities offers the potential to enrich news coverage with additional detail about the events depicted on screen.

To explore what this might look like, Google’s Vision AI image understanding API, with all of its features enabled, was used to analyze CNN, MSNBC and Fox News, along with the morning and evening broadcasts of San Francisco affiliates KGO (ABC), KPIX (CBS), KNTV (NBC) and KQED (PBS), from April 15 to April 22, 2019, totaling 812 hours of television news.

In all, the Vision AI API identified 167,937 distinct Web Entities. Top entities include “News” (31% of airtime), “Journalist” (26%), “Public Relations” (20%), “Donald Trump” (19%), “Photograph” (17%), “Video” (17%), “Fox News” (17%), “Public” (16%), “Television” (12%) and “Product” (11%). The entities “Democratic Party,” “Republican Party” and “Robert Mueller” each received around 8% of airtime, while “Cathedral Notre-Dame de Paris” garnered 2.4% of airtime.
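One plausible way to arrive at airtime percentages like these is to count, for each entity, the share of one-second frames in which it appears. The sketch below assumes each frame has already been reduced to a set of entity names; the function name `airtime_share` and the toy frames are hypothetical, and the article does not spell out the exact tallying method used.

```python
from collections import Counter


def airtime_share(frames):
    """Return the percentage of frames in which each entity appears.

    `frames` is a list with one collection of entity names per one-second
    frame (an assumed representation, not the article's own data format).
    """
    total = len(frames)
    if total == 0:
        return {}
    counts = Counter()
    for entities in frames:
        # Count each entity at most once per frame.
        counts.update(set(entities))
    return {name: 100.0 * n / total for name, n in counts.items()}


# Toy data: four one-second frames, one of them with no entities at all.
toy_frames = [
    {"News", "Donald Trump"},
    {"News"},
    {"Journalist"},
    set(),
]
shares = airtime_share(toy_frames)
```

Because a single frame can carry several entities, the percentages are not expected to sum to 100, which is consistent with the overlapping figures reported above.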

Despite relying entirely on finding visually similar images across the open Web for each one-second preview frame of novel television news, the Vision AI API still managed to identify at least one label for 99.6% of total airtime.
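A coverage figure of this kind can be read as the fraction of frames that received at least one label. A minimal sketch, again assuming one collection of labels per one-second frame (the name `labeled_coverage` and the sample data are illustrative only):

```python
def labeled_coverage(frames):
    """Percentage of frames that carry at least one label."""
    if not frames:
        return 0.0
    labeled = sum(1 for labels in frames if labels)
    return 100.0 * labeled / len(frames)


# Toy data: three of four frames have at least one label.
coverage = labeled_coverage([{"News"}, {"Video", "Public"}, set(), {"Product"}])
```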

Google’s image algorithms do not provide any form of facial recognition capability. Instead, the Vision AI API identified images of Donald Trump and Robert Mueller by seeing that specific video frames were highly similar to images across the Web that were most commonly captioned with those names. In fact, images of Robert Mueller were often additionally labeled as Donald Trump by the Vision AI API, reflecting the fact that imagery of the special counsel typically references his work investigating issues relating to the president.

Thus, Web Entities do not necessarily reflect the precise objects and activities depicted in an image, but rather how that image is captioned across the Web, meaning annotations may include strongly related subjects not visually depicted in the image itself.
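To see how often one entity rides along with another, as with the Mueller and Trump labels, one could measure the share of frames labeled with a subject that also carry a second label. The sketch below is an assumption about how such a check might be done, using a hypothetical `co_label_rate` helper and invented frames rather than the article's data.

```python
def co_label_rate(frames, subject, other):
    """Fraction of frames labeled `subject` that are also labeled `other`."""
    subject_frames = [labels for labels in frames if subject in labels]
    if not subject_frames:
        return 0.0
    both = sum(1 for labels in subject_frames if other in labels)
    return both / len(subject_frames)


# Invented frames for illustration only.
toy_frames = [
    {"Robert Mueller", "Donald Trump"},
    {"Robert Mueller"},
    {"Donald Trump"},
    {"Robert Mueller", "Donald Trump", "News"},
]
rate = co_label_rate(toy_frames, "Robert Mueller", "Donald Trump")
```

A high rate for a pair like this would signal a caption-driven association rather than both subjects actually appearing on screen.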
