Knowledge graphs beyond the hype: Getting knowledge in and out of graphs and databases

Knowledge graphs are hyped. We can officially say this now, since Gartner included knowledge graphs in its 2018 hype cycle for emerging technologies. We did not have to wait for Gartner, though: declaring this the "Year of the Graph" was our opener for 2018. Like anyone active in the field, we see both the opportunity and the threat in this: with hype comes confusion.

Knowledge graphs are real. They have been for the last 20 years at least. Knowledge graphs, in their original definition and incarnation, have been about knowledge representation and reasoning. Things such as controlled vocabularies, taxonomies, schemas, and ontologies have all been part of this, built on a Semantic Web foundation of standards and practices.

So, what's changed? How come the likes of Airbnb, Amazon, Google, LinkedIn, Uber, and Zalando sport knowledge graphs in their core business? How come Amazon and Microsoft joined the crowd of graph database vendors with their latest products? And how can you make this work?

Knowledge graphs sound cool and all. But what are they, exactly? It may sound like a naive question, but getting definitions right is actually how you build a knowledge graph. From taxonomies to ontologies (essentially, schemas and rules of varying complexity), that's how people have been doing it for years.

RDF, the standard used to encode these schemas, has a graph structure. So, calling knowledge encoded on top of a graph structure a "knowledge graph" sounds natural. And the people doing this, the data modelers, have been called knowledge engineers, or ontologists.
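To make the "graph structure" point concrete, here is a minimal sketch in plain Python, not an RDF library. It treats each statement as a (subject, predicate, object) triple and follows subclass edges to derive types that were never stated explicitly, which is a toy version of the reasoning mentioned above. The `ex:` names and the product data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical example data: each statement is a (subject, predicate, object) triple.
TRIPLES = [
    ("ex:Sneaker", "rdfs:subClassOf", "ex:Shoe"),
    ("ex:Shoe", "rdfs:subClassOf", "ex:Product"),
    ("ex:item42", "rdf:type", "ex:Sneaker"),
    ("ex:item42", "ex:brand", "ex:AcmeShoes"),
]

def build_graph(triples):
    """Index triples as a directed, labeled graph: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return graph

def superclasses(graph, cls):
    """Follow rdfs:subClassOf edges transitively, starting from cls."""
    found, stack = [], [cls]
    while stack:
        current = stack.pop()
        for p, o in graph.get(current, []):
            if p == "rdfs:subClassOf" and o not in found:
                found.append(o)
                stack.append(o)
    return found

def inferred_types(graph, node):
    """Asserted rdf:type values plus every superclass they imply."""
    types = []
    for p, o in graph.get(node, []):
        if p == "rdf:type":
            for t in [o] + superclasses(graph, o):
                if t not in types:
                    types.append(t)
    return types

graph = build_graph(TRIPLES)
print(inferred_types(graph, "ex:item42"))
# → ['ex:Sneaker', 'ex:Shoe', 'ex:Product']
```

Only one type is asserted for `ex:item42`; the other two fall out of the schema. A real triple store or reasoner covers far more than this single rule, but the shape of the data is the same.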

There can be many applications for these knowledge graphs -- from cataloguing items, to data integration and publishing on the web, to complex reasoning. For some of the most prominent ones, you can look at schema.org, Airbnb, Amazon, Diffbot, Google, LinkedIn, Uber, and Zalando. This is why people seasoned in knowledge graphs sneer at the hype.

Like any data modeling, this is hard and complicated work. It must take into account many stakeholders and views of the world, manage provenance and schema drift, and so on. Add reasoning and web scale to the mix, and things easily get out of hand, which may explain why, up until recently, this approach was not the most popular in the real world.

Going schema-less, on the other hand, has been and still is popular. Going schema-less can get you started quickly; it's simpler and more flexible, at least up to a certain point. The simplicity of not using a schema can be deceiving though. Because, in the end, whatever your domain, a schema will exist. Schema-on-read? Fine. But no schema at all?

You may not know your schema well enough a priori. It may be complex, and it may evolve. But it will exist. So, ignoring or downplaying schema does not solve any problem; it only makes things worse. Issues will lurk, and cost you time and money, as they hamper the developers and analysts who try to build applications and derive insights from a fuzzy blob of data.
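One way to see that a schema exists even when none was declared is to read it back out of the data. The sketch below uses made-up records and a hypothetical `infer_schema` helper. It collects, for every field, the value types observed and how many records carry it, which is schema-on-read in its simplest form.

```python
from collections import defaultdict

# Hypothetical schema-less records, as they might sit in a document store.
records = [
    {"id": 1, "name": "Trail sneaker", "price": 89.5},
    {"id": 2, "name": "City boot", "price": "120"},  # price stored as text
    {"id": 3, "title": "Sandal"},                     # 'title' instead of 'name'
]

def infer_schema(rows):
    """Collect, per field, the value types seen and how many rows carry the field."""
    schema = defaultdict(lambda: {"types": set(), "count": 0})
    for row in rows:
        for field, value in row.items():
            schema[field]["types"].add(type(value).__name__)
            schema[field]["count"] += 1
    return dict(schema)

schema = infer_schema(records)
for field, info in sorted(schema.items()):
    print(field, sorted(info["types"]), f'{info["count"]}/{len(records)}')
```

The mixed types on `price` and the `name` versus `title` split are exactly the kind of lurking issues that surface later, when someone tries to build on the data.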

The point, then, is not to throw schema away, but to make it functional, flexible, and interchangeable. RDF is pretty good at this, as it also underlies standardized formats for data exchange, such as JSON-LD. By the way, RDF can also be used for lightweight-schema and schema-less approaches, as well as for data integration.
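As an illustration of how JSON-LD carries a schema along with the data, the sketch below builds a small document whose `@context` maps short keys to schema.org terms, then flattens it into triples. The product record and the `to_triples` helper are hypothetical, and the flattening is deliberately naive: it assumes a flat object, string values, and an inline context, whereas a real JSON-LD processor handles much more.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical product record; the @context maps short keys to schema.org IRIs.
doc = {
    "@context": {
        "name": "https://schema.org/name",
        "brand": "https://schema.org/brand",
    },
    "@id": "https://example.org/item/42",
    "@type": "https://schema.org/Product",
    "name": "Trail sneaker",
    "brand": "Acme Shoes",
}

def to_triples(document):
    """Naively flatten a flat JSON-LD object into (subject, predicate, object) triples.

    Assumes an inline @context and plain string values only.
    """
    context = document.get("@context", {})
    subject = document["@id"]
    triples = []
    if "@type" in document:
        triples.append((subject, RDF_TYPE, document["@type"]))
    for key, value in document.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        predicate = context.get(key, key)  # fall back to the raw key if unmapped
        triples.append((subject, predicate, value))
    return triples

# Round-trip through JSON text to show that the exchange format is ordinary JSON.
serialized = json.dumps(doc, indent=2)
for triple in to_triples(json.loads(serialized)):
    print(triple)
```

To a consumer that ignores `@context`, this is just JSON. To one that honors it, every key resolves to a shared, published definition.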

So, what's with the hype? How can a 20-year-old technology be on the emerging slope of the infamous hype cycle? The hype is real, too, and so is the reason for it. It's the same story as the meteoric rise of the AI hype: it's not so much that the approach has changed; it's more that the data and compute power are now there to make it work at scale.

Plus, the AI itself helps.
