Building a $4 billion company around open source software: The Cloudera story

Dr Amr Awadallah is the Chief Technology Officer of Cloudera, a data management and analytics platform based on Apache Hadoop. Before co-founding Cloudera in 2008, Awadallah served as Vice President of Product Intelligence Engineering at Yahoo!, running one of the very first organizations to use Hadoop for data analysis and business intelligence. Awadallah joined Yahoo! after the company acquired his first startup, VivaSmart, in July 2000.

With the fourth industrial revolution upon us, where the lines between the physical, digital and biological spheres are blurred by big data and the fusion of technologies, Cloudera finds itself among the companies leading this change. In this interview with Enterprise Innovation, the Cloudera co-founder shares his insights on the opportunities and challenges of the digital revolution and its implications for businesses today; how organizations can derive maximum value from their data while protecting it against risks; the pitfalls and mistakes companies make when using big data for business advantage; and what lies beyond big data analytics.

Take us through the beginning of Cloudera, your time with VivaSmart, and what it was like to set up these companies.

They were very different processes. When VivaSmart was acquired by Yahoo! in mid-2000 for $9 million, it was mainly an “acqui-hire” because there were only five of us in the company and we were among the few experts in comparison shopping, which Yahoo! really needed for its shopping service. In retrospect, it was the right thing to do: when the Internet bubble burst in 2000, almost all our competition shut down, and we were lucky to join Yahoo! when we did.

The lightbulb really went on for me at Yahoo!. I spent a total of eight years there: four working on the comparison shopping engine VivaSmart built, and four more on business intelligence and data analytics. There I had a number of challenges with scaling, in terms of both processing time and cost of storage. We were deleting data we wanted to keep, and the system was not advanced: it could only do SQL, and we wanted to do predictive modeling, pattern matching, clustering, and other techniques that were very hard to do in SQL. I was lucky that while I was at Yahoo!, Doug Cutting, who now also works at Cloudera, was working with the Yahoo! Search team to build the Hadoop technology for Search. I was complaining about all the problems I had, and he told me to try Hadoop and see if it worked for me. And it did! Within six months, all of my backend was switched to Hadoop, the processing time went down from nine hours to five minutes, the cost went down by almost 100x in some cases, and we gained the flexibility of being able to go beyond SQL and do more advanced stuff.

You were one of the first guys working on Hadoop…

We were the only Hadoop big data platform for two years. 

How did that business model evolve? 

That comes from Mike Olson, my co-founder and one of the very first open source CEOs. He had a company called Sleepycat, which made an open source in-memory database. He was instrumental in charting the course of Cloudera in terms of how to create a business model around open source.

We knew from day one that the benefits of open source are extremely rapid innovation and lots of word of mouth, but the downside is obviously that it’s very easy for someone to copy your products, and in many cases potential customers simply take the software and never become paying customers. Mike experienced that firsthand with his first startup, so when we were building out Cloudera, we always had it in our strategy to do a hybrid open source business model. We keep the core platform and capabilities open, but build value around it that makes it easier to use, enterprise-ready, and better performing. That’s how we created the differentiation against competition.



