Compute to data: using blockchain to decentralize data science and AI with the Ocean Protocol

AI and its machine learning algorithms need data to work. By now, that's well established. It's not that algorithms don't matter; it's that, typically, getting more and better data improves results more than tweaking algorithms does: the unreasonable effectiveness of data.

More data, along with more compute capacity to train algorithms on it, is what has been fueling the rise of AI. Anyone who wants to train an algorithm for an AI application, to address any problem in any domain, must be able to get lots of relevant data in order to succeed.

That data can be public, private data generated and owned by the organization developing the application, or private data acquired from third parties. Public data is not an issue. Private data an organization generates and owns itself must be collected and processed in accordance with data protection laws such as GDPR and CCPA.

But what about private data owned by third parties? Normally, application developers don't have access to such data, and for good reason. Why would you trust anyone with your private data? Even if the party you hand it over to promises to take good care of it, once the data is out of your hands, anyone can do as they please with it.

This is the problem the non-profit Ocean Protocol Foundation (OPF) wants to solve. ZDNet connected with founder Trent McConaghy to discuss OPF's mission and its latest milestone: Compute-to-Data.

McConaghy has been working on the Ocean Protocol since 2017. With a background in AI and blockchain, having worked on projects such as ascribe and BigchainDB, he described how he realized that blockchain could help solve the problem of data escapes and privacy for the data used to train AI algorithms.

The OPF has been working on setting up the infrastructure to make data more accessible via data marketplaces. As McConaghy pointed out, there have been many attempts at data marketplaces in the past, but they have always been custodial, meaning the data marketplace is a middleman users have to trust. A recent case in point: Surgisphere.

But what if marketplaces could act as connectors without actually holding the data, so that users don't have to trust the marketplace at all? This is what the OPF is out to achieve: decentralized data marketplaces.

This is a tall order, and McConaghy is quick to admit that it will take years to get there. Last week, however, brought the OPF one step closer with the unveiling of what it calls Compute-to-Data. Compute-to-Data provides a means to exchange data while preserving privacy: the data stays on-premise with the data provider, and data consumers run compute jobs on it to train AI models.

Rather than having the data sent to where the algorithm runs, the algorithm runs where the data is. The idea is very similar to federated learning. The difference, McConaghy says, is that federated learning only decentralizes the last mile of the process, while Compute-to-Data goes all the way.
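The pattern described above can be sketched in a few lines. This is an illustrative toy, not the Ocean Protocol API: the `DataProvider` class, `run_job` method, and the trivial "model" are all hypothetical names invented for this example. The point is that the consumer's algorithm travels to the data, and only the trained artifact travels back.

```python
# Illustrative sketch of the compute-to-data idea (not Ocean Protocol code):
# the provider keeps raw data on-premise and returns only the result of a
# compute job, never the data itself.

class DataProvider:
    def __init__(self, private_data):
        self._private_data = private_data  # never leaves this object

    def run_job(self, train_fn):
        # The consumer's algorithm runs where the data lives; only the
        # trained artifact is returned to the consumer.
        return train_fn(self._private_data)

def consumer_algorithm(data):
    # A stand-in for model training: "fit" a trivial model (the mean).
    return sum(data) / len(data)

provider = DataProvider(private_data=[2.0, 4.0, 6.0])
model = provider.run_job(consumer_algorithm)
print(model)  # the consumer sees the model (4.0), not the raw data
```

In a real deployment the job would run in a sandboxed environment on the provider's infrastructure, with the marketplace brokering access control and payment rather than touching the data.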

TensorFlow Federated (TFF) and OpenMined are the most prominent federated learning projects. TFF does orchestration in a centralized fashion; OpenMined is decentralized. In TFF-style federated learning, a centralized entity (e.g. Google) must orchestrate the compute jobs across silos, and personally identifiable information can leak to this entity.
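To make the orchestration point concrete, here is a minimal sketch of centralized federated averaging. This is not TFF code; the function names and the one-dimensional least-squares model are invented for illustration. Note how a single coordinator fans jobs out to the silos and averages the returned weights, which is exactly the central role the text says can leak information.

```python
# Illustrative sketch of TFF-style federated averaging (not TFF code):
# silos never share raw data, but a central orchestrator coordinates
# (and can observe) every training round.

def local_update(data, global_weight, lr=0.1):
    # One gradient step of a 1-D least-squares model y = w * x
    # on this silo's local (x, y) pairs.
    grad = sum(2 * x * (global_weight * x - y) for x, y in data) / len(data)
    return global_weight - lr * grad

def orchestrate_round(silos, global_weight):
    # The central entity: fan the job out, average the returned weights.
    updates = [local_update(data, global_weight) for data in silos]
    return sum(updates) / len(updates)

# Two silos of private (x, y) data, all drawn from y = 2 * x.
silos = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(50):
    w = orchestrate_round(silos, w)
print(round(w, 2))  # converges to the true slope, 2.0
```

Compute-to-Data, by contrast, aims to remove that trusted coordinator and let each silo run jobs under its own control.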

OpenMined addresses this via decentralized orchestration. But its software infrastructure could use improvement to manage computation at each silo in a more secure fashion; this is where Compute-to-Data can help, says McConaghy. That's all well and good, but what about performance?

If algorithms run where the data is, then how fast they run depends on the resources available at the host.
