A Very Short History of Artificial Neural Networks

“If you want to understand a really complicated device, like a brain, you should build one.” G Hinton, 2018.
Modern artificial neural networks developed through a series of dramatic, stepwise improvements, interspersed with periods of stasis (often called neural network winters). As early as 1943, before commercial computers existed, McCulloch and Pitts began to explore how small networks of artificial neurons could mimic brain-like processes. The cross-disciplinary nature of this early research is evident from the fact that McCulloch and Pitts also contributed to the classic neurophysiological paper ‘What the frog’s eye tells the frog’s brain’ (Lettvin et al., 1959); remarkably, this paper was published not in a biology journal but in the Proceedings of the Institute of Radio Engineers.
In subsequent years, the increasing availability of computers allowed neural networks to be tested on simple pattern recognition tasks. But progress was slow, partly because most research funding was allocated to more conventional approaches that did not attempt to mimic the neural architecture of the brain, and partly because early artificial neural network learning algorithms were limited. A taxonomy of artificial neural networks is shown in Figure 1.
A major landmark in the history of neural networks was Frank Rosenblatt’s perceptron in 1958, which was explicitly modelled on the neuronal structures in the brain. The perceptron learning algorithm allowed a perceptron to learn to associate inputs with outputs in a manner apparently similar to learning in humans. Specifically, the perceptron could learn an association between a simple input ‘image’ and a desired output, where this output indicated whether or not a particular object was present in the image. A simple perceptron is shown in Figure 2.
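The perceptron learning rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Rosenblatt's original formulation: the toy task (all 4-pixel binary ‘images’, with the object counted as present when at least three pixels are on), the function names, the learning rate and the epoch limit are all hypothetical choices made for this example.

```python
from itertools import product


def predict(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs exceeds zero, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, n_inputs, max_epochs=1000, rate=1.0):
    """Error-correction rule: change the weights only when the output is wrong."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in examples:
            delta = target - predict(weights, bias, inputs)  # -1, 0 or +1
            if delta != 0:
                errors += 1
                weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
                bias += rate * delta
        if errors == 0:  # every example is classified correctly
            break
    return weights, bias


# Toy data (an assumption for illustration): every 4-pixel binary image,
# labelled 1 when at least three pixels are on.
examples = [(pixels, 1 if sum(pixels) >= 3 else 0)
            for pixels in product((0, 1), repeat=4)]

weights, bias = train_perceptron(examples, n_inputs=4)
print(all(predict(weights, bias, x) == t for x, t in examples))
```

Because this toy task is linearly separable, the rule is guaranteed to stop making errors after a finite number of weight changes, so the loop ends well before the epoch limit.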
Perceptrons, and neural networks in general, share three key properties with human memory. First, unlike conventional computer memory, neural network memory is content addressable, which means that recall is triggered by the content of a cue, such as an image or a sound, as shown in Figure 3. In contrast, computer memory can be accessed only if the specific location (address) of the required information is known.
Second, a common side-effect of content addressable memory is that, given a learned association between an input image and an output, recall can be triggered by an input image that is similar (but not identical) to the original input of a particular input/output association. This ability to generalise beyond learned associations is a critical human-like property of artificial neural networks.
Third, if a single weight or unit is destroyed, this does not eliminate any particular association; instead, it usually degrades all associations to some extent. This graceful degradation is thought to resemble human memory.
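These three properties can be illustrated with a small linear associative memory. The sketch below is an assumption-laden toy rather than a model taken from the text: it stores three input/output pairs in a single weight matrix using a simple Hebbian (outer-product) rule, and the patterns, function names and the choice of which weight to destroy are all hypothetical.

```python
def store(pairs):
    """Build one weight matrix by summing the outer product of each (input, output) pair."""
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * n_in for _ in range(n_out)]
    for x, y in pairs:
        for r in range(n_out):
            for c in range(n_in):
                W[r][c] += y[r] * x[c]
    return W


def activations(W, x):
    """Weighted sum of the cue arriving at each output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def recall(W, x):
    """Threshold each output unit's activation to +1 or -1."""
    return [1 if a >= 0 else -1 for a in activations(W, x)]


# Three mutually orthogonal input patterns, each paired with an output pattern.
pairs = [
    ([1, 1, 1, 1, 1, 1, 1, 1], [1, -1, -1]),
    ([1, -1, 1, -1, 1, -1, 1, -1], [-1, 1, -1]),
    ([1, 1, -1, -1, 1, 1, -1, -1], [-1, -1, 1]),
]
W = store(pairs)

# 1. Content addressable: the cue itself retrieves the output; no address is used.
exact = [recall(W, x) == y for x, y in pairs]

# 2. Generalisation: a cue with one element flipped still retrieves the stored output.
noisy_cue = list(pairs[1][0])
noisy_cue[0] = -noisy_cue[0]
similar = recall(W, noisy_cue) == pairs[1][1]

# 3. Graceful degradation: destroying one weight shifts the activation of
#    every association a little, but none of them is lost.
damaged = [row[:] for row in W]
damaged[0][0] = 0
survived = [recall(damaged, x) == y for x, y in pairs]

print(exact, similar, survived)
```

Every association is spread across the whole weight matrix, which is why a single damaged weight perturbs all three recalls slightly instead of deleting one of them.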
Despite these human-like qualities, the perceptron was dealt a severe blow in 1969, when Minsky and Papert famously proved that it could not learn associations unless they were of a particularly simple kind (i.e. linearly separable, as described in Chapter 2 of Artificial Intelligence Engines). This marked the beginning of the first neural network winter, during which neural network research was undertaken by only a handful of scientists. During this winter, the capabilities of linear networks were explored in the forms of holophones by Longuet-Higgins (1968), correlographs by Longuet-Higgins, Willshaw, and Buneman (1970), and correlation matrix memories by Kohonen (1972).
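A short worked example shows what it means for an association not to be linearly separable. The exclusive-OR (XOR) function is the standard illustration; it is not named in the text above, and the symbols $w_1$, $w_2$ and $b$ (two weights and a bias) are introduced here only for this sketch. Suppose a perceptron outputs 1 when $w_1 x_1 + w_2 x_2 + b > 0$ and 0 otherwise. Computing XOR would require all four of the following conditions to hold:

```latex
\begin{align}
(x_1, x_2) = (0, 0) \mapsto 0 &: \quad b \le 0 \\
(x_1, x_2) = (1, 0) \mapsto 1 &: \quad w_1 + b > 0 \\
(x_1, x_2) = (0, 1) \mapsto 1 &: \quad w_2 + b > 0 \\
(x_1, x_2) = (1, 1) \mapsto 0 &: \quad w_1 + w_2 + b \le 0
\end{align}
% Adding the second and third conditions gives
%   w_1 + w_2 + 2b > 0,  so  w_1 + w_2 + b > -b \ge 0  (using the first condition),
% which contradicts the fourth condition.
```

Adding the second and third conditions gives $w_1 + w_2 + 2b > 0$, so $w_1 + w_2 + b > -b \ge 0$, which contradicts the fourth condition. No choice of weights and bias satisfies all four, so a single perceptron cannot learn this association.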
“The ability of large collections of neurons to perform ‘computational’ tasks may in part be a spontaneous collective consequence of having a large number of interacting simple neurons.” JJ Hopfield, 1982.
The modern era of neural networks began in 1982 with the Hopfield net, shown in Figure 4.


