Deep Learning, Chapter 1: Introduction

The Many Names and Changing Fortunes of Neural Networks

The earliest predecessors of modern deep learning were simple linear models motivated from a neuroscientific perspective. The first wave of neural network research, in the 1940s through the 1960s, was known as cybernetics.

The McCulloch-Pitts neuron was an early model of brain function. This linear model took a set of n inputs x1, ..., xn and computed the output f(x, w) = x1w1 + ... + xnwn from a set of weights w1, ..., wn; it could recognize two different categories of inputs by testing whether f(x, w) was positive or negative. Of course, the weights needed to be set correctly by a human operator for the model's output to correspond to the desired categories. In the 1950s, the perceptron became the first model that could learn the weights defining the categories given examples of inputs from each category. The adaptive linear element (ADALINE), which dates from about the same time, simply returned the value of f(x, w) itself to predict a real number, and could also learn to predict these numbers from data. The training algorithm used to adapt the weights of ADALINE was a special case of an algorithm called stochastic gradient descent, slightly modified versions of which remain the dominant training algorithms for deep learning models today.
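
To make the last point concrete, here is a minimal sketch of stochastic gradient descent on a linear model in the spirit of ADALINE. The data, learning rate, and epoch count are illustrative choices for this sketch, not values from the text.

    # Minimal SGD on a linear model f(x, w) = x . w, in the spirit of ADALINE.
    # All data and hyperparameters below are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                # 100 examples, 3 features
    true_w = np.array([2.0, -1.0, 0.5])
    y = X @ true_w + 0.1 * rng.normal(size=100)  # noisy real-valued targets

    w = np.zeros(3)   # weights to be learned
    lr = 0.01         # learning rate
    for epoch in range(50):
        for i in rng.permutation(len(X)):        # visit examples in random order
            pred = X[i] @ w                      # model output f(x, w)
            w -= lr * (pred - y[i]) * X[i]       # gradient step on squared error

    print(w)          # approaches true_w

Each update nudges the weights against the gradient of the squared error on a single example, which is what makes the procedure "stochastic".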

Models based on the f(x, w) used by the perceptron and ADALINE are called linear models. Linear models have many limitations; most famously, they cannot learn the XOR function, where f([0,1], w) = 1 and f([1,0], w) = 1 but f([1,1], w) = 0 and f([0,0], w) = 0. Critics who observed these flaws in linear models caused a backlash against biologically inspired learning in general. Today, neuroscience is regarded as an important source of inspiration for deep learning researchers, but it is no longer the predominant guide for the field.
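
As a quick illustration of the XOR limitation (a sketch, not an example from the text), fitting a linear model with a bias term to the four XOR points by least squares yields the constant prediction 0.5 for every input, so no choice of weights can separate the two categories:

    # Least-squares fit of a linear model (with bias) to the XOR points.
    # The best linear fit predicts 0.5 everywhere.
    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0.0, 1.0, 1.0, 0.0])      # XOR targets
    Xb = np.hstack([X, np.ones((4, 1))])    # append a bias column
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    print(Xb @ w)                           # [0.5 0.5 0.5 0.5]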

Neuroscience has given us a reason to hope that a single deep learning algorithm can solve many different tasks. For example, neuroscientists have found that ferrets can learn to "see" with the auditory processing region of their brain if their brains are rewired to send visual signals to that area, suggesting that much of the mammalian brain might use a single algorithm for many of the tasks it solves.

In the 1980s, the second wave of neural network research emerged in great part via a movement called connectionism or parallel distributed processing. Connectionism arose in the context of cognitive science. The central idea in connectionism is that a large number of simple computational units can achieve intelligent behavior when networked together.

Several key concepts arose during the connectionism movement of the 1980s that remain central to today's deep learning. One of these concepts is that of distributed representation: the idea that each input to a system should be represented by many features, and each feature should be involved in the representation of many possible inputs. For example, suppose a vision system can recognize cars, trucks, and birds, and each of these objects can be red, green, or blue. One way of representing these inputs would be a separate neuron for each of the nine possible combinations; with a distributed representation, three neurons describing the color and three describing the object identity suffice. Another major accomplishment of the connectionist movement was the successful use of back-propagation to train deep neural networks with internal representations, and the popularization of the back-propagation algorithm.
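
To give a flavor of back-propagation, here is a minimal sketch (with assumed architecture and hyperparameters, not code from the text) of a one-hidden-layer network learning the XOR function that defeated linear models above. The forward pass computes an internal representation; the backward pass propagates error gradients through each layer by the chain rule.

    # A one-hidden-layer network trained by back-propagation learns XOR.
    # Hidden size, learning rate, and step count are illustrative choices.
    import numpy as np

    rng = np.random.default_rng(1)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0.0], [1.0], [1.0], [0.0]])

    W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)   # hidden layer
    W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)   # output layer

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for step in range(5000):
        # forward pass: hidden representation h, then prediction p
        h = sigmoid(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)
        # backward pass: cross-entropy gradient, layer by layer
        dz2 = (p - y) / len(X)            # grad at output pre-activation
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0)
        dz1 = (dz2 @ W2.T) * h * (1 - h)  # chain rule through the sigmoid
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0)
        # gradient descent updates
        W2 -= lr * dW2; b2 -= lr * db2
        W1 -= lr * dW1; b1 -= lr * db1

    print(p.round(2).ravel())   # typically close to [0, 1, 1, 0]

The hidden units form the network's internal representation of the inputs, which is exactly what back-propagation makes it possible to train.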