Machine Learning – Can We Please Just Agree What This Means
- by 7wData
Summary: As a profession we do a pretty poor job of agreeing on good naming conventions for really important parts of our professional lives. “Machine Learning” is just the most recent case in point. It’s had a perfectly good definition for a very long time, but now the deep learning folks are trying to hijack the term. Come on, folks. Let’s make up our minds.
As a profession we do a pretty poor job of agreeing on good naming conventions for really important parts of our professional lives. How about ‘Big Data’? Terrible. It’s not just about size, although if you asked most non-DS practitioners that’s what they’d say. Or how about ‘Data Scientist’? Nope. Can’t really agree on that one either.
Now we come to ‘Machine Learning’. If you asked 100 data scientists, 95 of them, specifically those who are not doing deep learning, would agree that this definition hasn’t changed over at least the last 15 years:
The application of any computer-enabled algorithm to a data set to find a pattern in the data. This encompasses basically all types of data science algorithms: supervised and unsupervised, segmentation, classification, and regression, including deep learning.
Increasingly, though, more and more articles hijack this term to mean only deep learning.
It’s natural that the most press is given to the newest and most exciting frontier developments, but this is an unnecessary source of confusion. Deep learning specialists have variously argued that machine learning means only unsupervised systems (not unique to deep learning), or systems which automatically discover all features (not unique to deep learning), or simply that it’s synonymous with deep neural nets, or more specifically with convolutional neural nets or recurrent neural nets (including LSTM).
Personally, I think the traditional, more inclusive definition is more descriptive and more valuable in describing what we do to non-practitioners.
This is a much-used graphic going around these days, and while I might pick some nits with the labeling, it’s broadly inclusive of traditional predictive analytics, data viz, and what we think of today as AI, which is broader than just deep learning. In other words, it’s the full scope of data science.
So what’s wrong with the narrower usage? Well, aside from the fact that it takes a much broader definition and makes it unnecessarily narrow, let’s look at some of the individual arguments.
Deep Learning is Different from Traditional Predictive Analytics
This is true in some respects, but let’s look at the origins of deep learning, which is deep neural nets. Neural nets have been part of the predictive analytics toolset for decades, and we’ve used them to solve complex regression and classification problems in supervised learning.
Not too many years ago our hardware simply couldn’t keep up with the computational complexity of NNs, especially when we started to add hidden layers. But eventually hardware did catch up, and we found we could answer both traditional supervised questions and some neat new unsupervised questions by adding more and more hidden layers.
In general, an NN architecture with more than two or three hidden layers is called a ‘deep neural net’, and this is the origin of the phrase ‘deep learning’.
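To make the "hidden layers" point concrete, here is a minimal sketch in plain Python of a fully connected net whose depth is just a list of layer sizes. The function names (`make_network`, `forward`, `is_deep`), the random untrained weights, the tanh activation, the omission of bias terms, and the cutoff of three hidden layers are all assumptions chosen for illustration, not anything prescribed by a particular library.

```python
import math
import random


def make_network(layer_sizes, seed=0):
    """Build random (untrained) weights for a fully connected net.

    layer_sizes = [n_inputs, hidden_1, ..., hidden_k, n_outputs].
    Returns one weight matrix per layer transition; biases are omitted
    to keep the sketch short.
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


def forward(network, inputs):
    """Pass the inputs through every layer, applying tanh at each one."""
    activations = list(inputs)
    for weights in network:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)))
            for row in weights
        ]
    return activations


def hidden_layer_count(network):
    """Every weight matrix except the last one feeds a hidden layer."""
    return len(network) - 1


def is_deep(network, threshold=3):
    """Rule of thumb from the text: more than two or three hidden layers.

    The exact threshold is an assumption; there is no formal cutoff.
    """
    return hidden_layer_count(network) > threshold


# A traditional net with one hidden layer, and a deeper one with four.
shallow = make_network([4, 8, 1])
deep = make_network([4, 8, 8, 8, 8, 1])

sample = [0.1, 0.2, 0.3, 0.4]
print(hidden_layer_count(shallow), is_deep(shallow), forward(shallow, sample))
print(hidden_layer_count(deep), is_deep(deep), forward(deep, sample))
```

Both nets are the same kind of object and run through the same `forward` function; the only difference is how many hidden layers sit between input and output, which is also why the deeper one costs more to compute.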
What about some of the other claims?
Take the claim that deep learning is unsupervised. Not true. In fact, the problem holding back many image recognition projects is the lack of labeled training datasets. The original success of the cat/not-a-cat CNN image system required millions of pictures of cats and not-cats, all of which had to be labeled. Entire businesses like CrowdFlower have grown up around providing human-in-the-loop labeling strategies for just such problems.
Speech processing, which is basically time series analysis using RNN/LSTM deep neural nets, also has to be trained on known ‘good speech’.