Machine learning demystified: the importance of data
- by 7wData
Machine learning (ML) may sound like a daunting concept to anyone unfamiliar with it; some even imagine it leading to outlandish scenarios of machines poised to enslave mankind. Fortunately, that is not what ML is: at heart, it is a major advance in the development of Information Technology (IT). For ML to benefit an organisation, the organisation first has to understand the full benefits and limitations it offers.
While the principles of ML are rather simple and intuitive to grasp, applying them requires specific statistical and IT skills that few people currently possess. To understand the idea, think of a common and rather mundane language translation service, such as Google Translate; it is this service that helped me realise the transformative potential of ML.
To simplify, language translation software has long been based on programmed dictionaries, grammatical rules and their numerous exceptions. This approach involves considerable effort.
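A toy sketch can illustrate why the rule-based approach is so labour-intensive. The dictionary, the reordering rule and the example below are entirely hypothetical; real systems encode thousands of such rules and exceptions by hand.

```python
# Hypothetical rule-based translation: every dictionary entry and every
# grammatical rule must be programmed explicitly.
DICTIONARY = {"the": "le", "black": "noir", "cat": "chat"}

def translate_rule_based(sentence: str) -> str:
    words = [DICTIONARY.get(w, w) for w in sentence.lower().split()]
    # Hand-coded rule: in French, most adjectives follow the noun.
    # Each exception to this rule would need yet another hand-coded case.
    for i in range(len(words) - 1):
        if words[i] == "noir":
            words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)

print(translate_rule_based("the black cat"))  # le chat noir
```

Every new word, rule or exception means more code to write and maintain, which is exactly the effort the data-driven approach avoids.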
From ‘rule-based’ to ‘data-driven’ processes
The new methodology stems from a simpler idea: rather than defining rules and lexical tables from scratch, let the software discover them. How?
In three steps:
1. Millions of pages already translated from one language to another are collected from international organisations. These include documentation available online from, for example, the UN or European institutions.
2. When a user submits text for translation, the software slices it into basic elements and searches the same-language side of the corpus for similar elements.
3. The most likely translation is then extracted from the bilingual corpus and suggested to the user.

Relevant statistical patterns found in the data therefore replace translation rules. Instead of having to be painstakingly programmed, they are simply "learned" by the software. This approach is highly cost-efficient, and the quality of the translation is often on par with the traditional approach.
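The three steps above can be sketched in a few lines. The corpus and phrases here are hypothetical stand-ins for the millions of aligned sentence pairs a real system would use; the point is that the "rule" is just a frequency count over the data.

```python
from collections import Counter

# Step 1 (hypothetical): a bilingual corpus of aligned segment pairs,
# standing in for documents collected from international organisations.
CORPUS = [
    ("good morning", "bonjour"),
    ("good morning", "bonjour"),
    ("good morning", "salut"),
    ("thank you", "merci"),
]

def translate_data_driven(segment: str) -> str:
    # Step 2: find corpus entries whose source side matches the segment.
    candidates = Counter(tgt for src, tgt in CORPUS if src == segment)
    if not candidates:
        return segment  # no match in the corpus; fall back to the input
    # Step 3: suggest the statistically most likely translation.
    return candidates.most_common(1)[0][0]

print(translate_data_driven("good morning"))  # bonjour
```

No translation rule is ever written: "bonjour" wins simply because it occurs most often in the data, and adding more translated documents improves the system without any new code.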
In areas less complex than translating human languages, the productivity gains are compounded by substantial quality improvement. Anyone who’s worked on software knows how complex it can be to anticipate all the potential problems once it’s entered production.
The software’s functional rules are based on assumptions drawn from a limited number of observations. Reality often proves far more complex than expected, so the automation ends up suboptimal or the software requires expensive corrections.
Machine learning, on the other hand, absorbs and learns from all available data, whatever the volume. The risk of a pattern or use case being left out of the picture is therefore limited.