Top 11 Tools For Distributed Machine Learning
- by 7wData
There are two fundamentally different and complementary ways of accelerating machine learning workloads:
1. By vertical scaling or scaling-up, where one adds more resources to a single node
2. By horizontal scaling or scaling-out, where one adds more nodes to the system
When it comes to the degree of distribution within a machine learning ecosystem, systems are classified as either centralised or distributed. Centralised systems employ a strictly hierarchical approach, with specific roles assigned to specific nodes. A distributed system, in contrast, consists of a network of independent nodes in which no specific roles are assigned to particular nodes.
A centralised solution is not the right choice when data is inherently distributed or too big to store on a single machine. For instance, think of astronomical data that is too large to move and centralise.
In a recent work, researchers at Delft University of Technology in the Netherlands described in detail the current state-of-the-art distributed ML models and how they affect computation latency and other attributes.
The advantages of using distributed ML models are plentiful, and a full treatment is beyond the scope of this article. Here we list popular toolkits and techniques that enable distributed machine learning:
MapReduce is a framework for processing data, developed by Google to process data in a distributed setting. First, all data is split into tuples during the map phase; in the subsequent reduce phase, these tuples are grouped to generate a single output value per key. MapReduce and Hadoop rely heavily on a distributed file system in every phase of execution.
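To make the two phases concrete, here is a minimal, single-process sketch of the MapReduce flow for a word count. A real framework such as Hadoop runs the map and reduce phases on many nodes and shuffles tuples over the network; the function names below are illustrative, not part of any framework's API.

```python
# Toy MapReduce word count: map emits (key, value) tuples,
# shuffle groups them by key, reduce produces one value per key.
from collections import defaultdict

def map_phase(document):
    # Map: emit one (word, 1) tuple per word in the document.
    return [(word, 1) for word in document.split()]

def shuffle(tuples):
    # Shuffle: group all values by key, as the framework would
    # do between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in tuples:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: collapse each key's values into a single output value.
    return {key: sum(values) for key, values in groups.items()}

documents = ["spark keeps data in memory", "hadoop relies on disk"]
tuples = [t for doc in documents for t in map_phase(doc)]
print(reduce_phase(shuffle(tuples)))
```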
Transformations in linear algebra, as they occur in many machine learning algorithms, are typically highly iterative in nature, and the map and reduce operations are not ideal for such iterative tasks. Apache Spark was developed to resolve this.
The key difference is that MapReduce tasks must write all (intermediate) data to disk between phases, whereas Spark can keep all the data in memory, which saves expensive reads from disk.
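The same word count expressed in PySpark looks as follows. This is a minimal sketch assuming a local pyspark installation; the input lines and application name are illustrative.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "wordcount")

lines = sc.parallelize(["spark keeps data in memory",
                        "hadoop relies on disk"])

counts = (lines.flatMap(lambda line: line.split())   # map: emit words
               .map(lambda word: (word, 1))          # map: emit tuples
               .reduceByKey(lambda a, b: a + b))     # reduce per key

# cache() keeps the RDD in memory, so iterative algorithms can reuse
# it without re-reading from disk -- the key difference from MapReduce.
counts.cache()
print(counts.collect())
sc.stop()
```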
AllReduce uses common high-performance computing technology to iteratively train models with stochastic gradient descent on separate mini-batches of the training data. Baidu claims a linear speedup when applying this technique to train deep learning networks.
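The following toy, single-process sketch shows the idea behind data-parallel SGD with an all-reduce step: each "worker" computes a gradient on its own mini-batch, the gradients are summed across workers (the all-reduce), and every replica applies the same averaged update. Real systems (e.g. Baidu's ring all-reduce) perform the sum over the network; the names and the linear-regression objective here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0])   # ground-truth weights (toy problem)
w = np.zeros(2)                  # model parameters, replicated on each worker
lr, n_workers = 0.1, 4

def local_gradient(w, batch_size=32):
    # Each worker draws its own mini-batch and computes the
    # squared-error gradient on it.
    X = rng.normal(size=(batch_size, 2))
    y = X @ true_w
    return 2 * X.T @ (X @ w - y) / batch_size

for step in range(100):
    grads = [local_gradient(w) for _ in range(n_workers)]
    # All-reduce: sum the gradients across workers, then average.
    avg_grad = np.sum(grads, axis=0) / n_workers
    w -= lr * avg_grad           # identical update on every replica

print(w)  # converges toward [2.0, -3.0]
```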