A framework for trusted pretrained neural networks
- by 7wData
For humanity to achieve its unstated goal of building intelligence far beyond our own, I think it might be necessary to have a central repository of pretrained neural networks. Instead of reinventing the wheel for each task, we could stand on the shoulders of giants. It’s not as straightforward as it seems, however. There is considerable precedent in the open source world for freely building on other people’s work, and there are even pretty solid mechanisms for including and distributing open source code. There are already many repositories that host weights for neural networks, so that is definitely a clear starting point.
Readers of Shallow Thoughts about Deep Learning are probably pretty knowledgeable about neural network basics. The diagram in Figure 1 shows a basic neural network, with the thickness of each edge representing its weight. The basic idea of transfer learning is to take a neural network trained to do one task and use it for another. The example Andrew Ng uses is actually a pretty great one, so let’s borrow it.
If you have a dataset of millions of cats (and non-cats), you can use it to train a neural network to detect whether a picture contains a cat. Pretty awesome, right? But if you wanted to extend the network to identify pictures of YOUR cat, the magic of transfer learning means you don’t need to create a whole new neural network. You probably don’t have millions of images of your cat, so training a deep network from scratch on a handful of pictures would be close to impossible. Instead, you can reuse the existing network and its weights: by adding a few layers at the end of the “is it a cat” network and training them on pictures of YOUR cat, you can almost immediately have a great classifier that identifies pictures of your cat with a high degree of accuracy.
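The mechanics can be sketched in a few lines of NumPy. This is a minimal, hypothetical illustration, not a real cat model: the frozen weight matrix stands in for a feature extractor already trained on the big cat dataset, the data is random placeholder noise, and only a small new logistic-regression head is trained on the handful of examples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pretrained" feature extractor: stands in for the layers of a
# network already trained on millions of cat / non-cat images.
W_frozen = rng.standard_normal((64, 16))  # 64 input pixels -> 16 features

def extract_features(x):
    return np.tanh(x @ W_frozen)  # frozen: never updated below

# A handful of labelled examples of YOUR cat (random placeholders here).
X = rng.standard_normal((20, 64))
y = rng.integers(0, 2, size=20).astype(float)

# The new trainable head: a single logistic-regression layer.
w_head = np.zeros(16)
b_head = 0.0

for _ in range(500):  # train only the head, with plain gradient descent
    feats = extract_features(X)
    p = 1.0 / (1.0 + np.exp(-(feats @ w_head + b_head)))
    grad = p - y  # gradient of binary cross-entropy w.r.t. the logits
    w_head -= 0.1 * feats.T @ grad / len(y)
    b_head -= 0.1 * grad.mean()

preds = (1.0 / (1.0 + np.exp(-(extract_features(X) @ w_head + b_head)))) > 0.5
print("training accuracy:", (preds == y.astype(bool)).mean())
```

Because only the 17 head parameters are trained, a handful of examples is enough, which is exactly the point of transfer learning.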
Since neural networks, especially deep ones, take a very long time to train with lots of data, it’s often impractical to start from scratch for every task.
The MNIST database, shown in Figure 3, is another example of a large dataset: over 60,000 images of handwritten digits, which classifiers have solved down to an error rate of only 0.21%. If you wanted to create a classifier that identifies digits written in Eastern Arabic numerals, you wouldn’t have to start from scratch! The neural networks that exist for MNIST classification already do a great job of breaking a digit down into its various components. By adding a layer or two and changing the output layer, you could easily have a highly accurate classifier for the new digit set.
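The “add a layer or two and change the output layer” surgery can be sketched with plain arrays. The layer names and sizes below are hypothetical, chosen just to show which parts of an MNIST-style network are kept frozen and which are newly added and trainable.

```python
import numpy as np

# Hypothetical layer names and sizes for an MNIST-style network: 784 input
# pixels, two hidden layers, and a 10-way output over the digits 0-9.
pretrained = {
    "hidden1": np.zeros((784, 128)),  # keeps its learned stroke features
    "hidden2": np.zeros((128, 64)),
    "output":  np.zeros((64, 10)),    # old output layer, to be discarded
}

# Transfer: keep the frozen feature layers, drop the old head, add new layers.
model = {name: w for name, w in pretrained.items() if name != "output"}
model["extra"] = np.zeros((64, 32))       # one added layer...
model["new_output"] = np.zeros((32, 10))  # ...and a fresh 10-way output layer
trainable = {"extra", "new_output"}       # only the new layers get gradients

print("frozen:", sorted(set(model) - trainable))
print("trainable:", sorted(trainable))
```

In a real framework this corresponds to freezing the pretrained layers (so no gradients flow to them) and optimizing only the newly attached ones.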
For all of you software developers out there, I’m sure you’ve used some sort of dependency manager: npm or bower for the JavaScript folks, gems for the Ruby crowd, or pip for the Pythonistas of the world. The purpose of these dependency / package managers is to quickly pull code that other people wrote into your applications. In the node.js world, if you want to create a web server, rather than doing all of the work from scratch, you can use a package called express, which has a ton of functionality built in (caching, templating, etc.). Over 200 contributors have committed code to make it a more complete and robust package, and in the process have fixed bugs, added functionality to keep it modern, and written more and more test suites to prove it does what it says it does. The same is true of most other packages out there. The thing about the express package is that it depends on other packages, and on the cycle goes.
But dependency management systems (or package management systems) are pretty mature now. They’re great at getting you the exact version of the code you need, reliably and complete with all of the dependencies every time. In fact, most package managers are so good that developers explicitly do not commit package source code to their repositories.
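That same reproducibility idea could carry over to pretrained weights: pin a cryptographic hash of the weight file, the way lockfiles pin exact package versions, and refuse to load anything that doesn’t match. Here is a minimal sketch using only the Python standard library; the file name and its contents are made up for illustration, and a real loader would deserialize the weights rather than return raw bytes.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Hash a weight file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def load_trusted(path, pinned_hash):
    """Refuse to load weights whose hash doesn't match the pinned value."""
    actual = sha256_of(path)
    if actual != pinned_hash:
        raise ValueError(f"hash mismatch: expected {pinned_hash}, got {actual}")
    return Path(path).read_bytes()  # in practice: deserialize the weights

# Demo with a stand-in "weight file".
weights_path = Path("model_weights.bin")
weights_path.write_bytes(b"pretend these bytes are trained weights")
pinned = sha256_of(weights_path)  # the hash a trusted registry would publish
data = load_trusted(weights_path, pinned)
print("verified", len(data), "bytes")
```

A central repository of pretrained networks would publish the pinned hashes, giving consumers the same “exact version, every time” guarantee that package lockfiles provide.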