Strata HadoopWorld Fall 2016 postmortem: Maybe AI’s the future, but can we make the data science work?

Given all the hype over artificial intelligence (AI) these days, at first glance it would seem surprising that it appeared as almost an afterthought at Strata last week.

There were a handful of product announcements: Maana added semantic search-like capabilities to the newest release of its knowledge management platform for resource-intensive industries like oil and gas, and Splunk grafted machine learning onto its offerings for identifying and resolving incidents from IT system log files.

And in a keynote talk entitled "Connected Eyes," Microsoft's Joseph Sirosh spoke of a project with India's leading eye institute that applied machine learning over large patient populations to improve outcomes for eye surgery.

But this obscures the bigger picture. Conference sponsor O'Reilly acknowledged this by breaking out AI into a separate pre-event track the day before. And anyway, this wasn't a Google Cloud event, where AI was front and center.

So, get used to it. There's plenty of hype going around about whether AI can, will, or should replace humans (spoiler alert: the answers are "not"). But even if present-day AI is no smarter than a bunch of idiot savants, there are plenty of practical and often unglamorous jobs that AI's core ingredient, machine learning (ML), is already performing.

Last year at Strata, we saw ML becoming almost ubiquitous in tooling for data management and governance of data lakes, with providers spanning A to Z.

The rationale for using ML, rather than static governance rules, lies in the nature of data lakes. Unlike with data warehouses, you won't know exactly what data will flow in, so it won't be practical to build rules ahead of time that dictate schema, enforce data quality, handle de-duping, or identify what data is likely to be sensitive (even weblogs could give PII data away).

Governance, whether it involves preparing data, building a catalog, or identifying master or reference data, may be a moving target requiring the system to "learn" how the norms are changing.

And there's ML elsewhere as well. Providers like Cloudera build ML into the trouble ticket tracking that backs the automated "phone home" function of subscriber client technical support.

As we noted with our take on DataRobot, there is a growing array of tools aimed at simplifying or accelerating different aspects of the lifecycle of building and deploying ML programs.

And ML is showing up in end user analytic tools that help humans parse the signals in data, wrangle it into shape, suggest which questions to ask, and help piece together the narrative.

In other words, when it comes to the packaged software tools that govern big data or analyze it, we're probably starting to take embedded machine learning for granted.

But what if your own data scientists want to get their hands dirty? As we noted a few weeks back, there's a lot of pent-up enthusiasm among R and Python programmers for ML, which many look at as the latest shiny new thing.

But for all the enthusiasm, at least among Spark users, SQL and streaming are more frequent workloads, according to the 2016 Spark Survey just released by Databricks.

 
