What you need to know about Cloudera vs. AWS for big data
- by 7wData
When it merged with fellow big data management vendor Hortonworks in January 2019, Cloudera Inc. gained a better chance to compete with cloud providers' Hadoop offerings -- setting up an AWS faceoff.
The upcoming Cloudera Data Platform (CDP) will be an open source, cloud-hosted big data offering meant to challenge Amazon Elastic MapReduce (EMR) -- AWS' Hadoop service -- and other cloud-oriented big data analytics applications also built on Hadoop. CDP does not have a release date yet.
Cloudera also partnered with IBM in June 2019 to collaborate on big data and AI offerings and resell each other's services: Cloudera Enterprise Data Hub and DataFlow as well as IBM Watson Studio and Big SQL.
Let's take a look at what this Cloudera and IBM partnership might mean for users with big data workloads on the cloud and how CDP changes the contest of Cloudera vs. Amazon EMR.
The Cloudera and IBM partnership is first of all a reaffirmation of the partnership Hortonworks had with IBM before the Cloudera merger, said Dave Mariani, founder and chief strategy officer at data warehouse virtualization provider AtScale. Before they merged, Cloudera and Hortonworks focused on the Hadoop file system and tools for large data lakes. With these capabilities, enterprises could save all their data in one place and repurpose it for various analytics and AI purposes.
In practice, though, enterprises have struggled with Hadoop performance problems, and as a result, many have turned to cloud providers to outfit their data management fabric. Post-merger, Cloudera's partnership with IBM could help enterprise customers address Hadoop performance problems through IBM's extensive service and support organization and partnerships. In contrast, AWS provides a comprehensive set of tools for automating many aspects of big data deployments and is an attractive choice for companies with AWS development and deployment skills.
The Cloudera and IBM partnership and the CDP offering should be most attractive to companies that are in the early stages of a big data analytics strategy and have data and applications spread across on-premises and cloud environments. It is not likely to draw companies with a substantial AWS presence and skill set.
In partnering with IBM, Cloudera has tied itself to IBM's hybrid and multi-cloud agenda. Therefore, Cloudera and IBM should be the best fit for enterprises with a hybrid cloud data strategy, Mariani said. IBM asserts that a hybrid or multi-cloud approach is more realistic than locking in to one provider, he said. IBM's approach to supporting modern app development is to use Kubernetes and containers so that workloads can run anywhere: on premises, in a private cloud or in a public cloud. AWS, on the other hand, wants all workloads to run only on its cloud.