What is a Neural Network?
In a nutshell, the human brain is little more than a collection of billions and billions of tiny neurons, connected by synapses, that collectively work to make decisions. A neuron will review electrical information from our senses and route data to another neuron, which then will review and route to another, and then another, and another, until eventually a chain of activated neurons can represent a decision. Each neuron processes data independently, yet is still dependent on the rest of the network to get any new data or do anything useful. This bundle of decision-making neurons, loosely connected, is what creates us.
An artificial neural network attempts to mimic our brain with machines.
Not perfectly, of course. Inside an artificial neural network (ANN), you will find little processors (originally called perceptrons by Frank Rosenblatt, all the way back in the 1950s) that take in data, perform simple calculations, and then produce one or more outputs. Those outputs are routed, along with outputs from other processors, to the next layer of processors, called the hidden layer, and onward until the network produces the final decision.
To better understand, consider a simplified example:
You need to make the following decision: should I go on a vacation to Paris?
In order to make this decision, you have to consider the answers to a number of important questions, each of which has a binary output value (think “yes” or “no”).
Can I fly in the morning?
Can I afford this trip without debt?
Is there a nice hotel in my price range?
Is it going to be warm when I arrive?
Before the neural network can begin to interpret these inputs, though, we need to assign a weight to each question. Weighting is how we determine the importance of each answer as a metric for making decisions. In other words, how much do you really care if you can fly in the morning? Is that very important, a total deal-breaker, or just a minor concern?
You can imagine that you would probably be willing to take the vacation even if the only flight in your price range departs in the afternoon, but maybe you really don’t want to travel in the winter. You can assign a bigger weight to whether or not it will be warm (because this is more important to you), and a smaller weight to whether or not the flight is in the morning (because this is less important). Once we have weights for all questions, we can calculate scores for different combinations of answers.
To demonstrate this, let’s imagine we assign the following weights:
Can I fly in the morning? 4
Can I afford this trip without debt? 9
Is there a nice hotel in my price range? 6
Is it going to be warm when I arrive? 8
Each “yes” answer contributes its weight to the score, and each “no” contributes nothing. If you find that you can fly in the morning and stay at a nice hotel in your price range, but will be arriving when it is not warm and will need some debt to pay for the vacation, then the weighted score for this scenario will be 4 + 6 = 10. By contrast, if you find that you can’t fly in the morning and can’t stay at a nice hotel in your price range, but will be arriving in warm weather and won’t need any debt to pay for the vacation, the score for that scenario will be 9 + 8 = 17.
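The scoring step can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the dictionary keys and the function name weighted_score are made up for this example.

```python
# Weights from the example above, one per question.
# The key names are illustrative assumptions, not part of any library.
WEIGHTS = {
    "fly_in_morning": 4,
    "afford_without_debt": 9,
    "nice_hotel": 6,
    "warm_on_arrival": 8,
}


def weighted_score(answers):
    """Add up the weight of every question answered "yes" (True)."""
    return sum(WEIGHTS[question] for question, yes in answers.items() if yes)


# Scenario 1: morning flight and nice hotel, but cold weather and some debt.
scenario_one = {
    "fly_in_morning": True,
    "afford_without_debt": False,
    "nice_hotel": True,
    "warm_on_arrival": False,
}

# Scenario 2: no morning flight and no nice hotel, but warm and debt-free.
scenario_two = {
    "fly_in_morning": False,
    "afford_without_debt": True,
    "nice_hotel": False,
    "warm_on_arrival": True,
}

print(weighted_score(scenario_one))  # 10
print(weighted_score(scenario_two))  # 17
```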
Now, we need to pick a decision threshold.
If we set a threshold of, say, 12, then you would go to Paris in the second scenario (17 is above the threshold) but not in the first (10 is below it). Does this mean that warmth and avoiding debt are more important than the time of departure and the rating of the hotel? What if I had another set of questions for the same scenario, covering whether I wanted to see the Eiffel Tower, or whether I could fly on United Airlines? What if I had hundreds of these questions? Can we predict the likelihood of a decision by simply comparing different combinations of answers to these questions?
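As a rough sketch, the threshold rule is just a comparison on top of the weighted sum. The code below assumes answers are encoded as 1 for “yes” and 0 for “no”, and that a score must be strictly greater than the threshold to count as a “go”; the text does not say how a tie at exactly 12 should be handled, so that choice is an assumption. The name should_go is likewise invented for illustration.

```python
from itertools import product

# Weights in question order: morning flight, no debt, nice hotel, warm weather.
WEIGHTS = [4, 9, 6, 8]
THRESHOLD = 12


def should_go(answers, weights=WEIGHTS, threshold=THRESHOLD):
    """Return True when the weighted score exceeds the threshold.

    `answers` is a sequence of 1 ("yes") and 0 ("no") values.
    Using a strict ">" is an assumption; ">=" would also be reasonable.
    """
    score = sum(w * a for w, a in zip(weights, answers))
    return score > threshold


print(should_go([1, 0, 1, 0]))  # scenario 1, score 10 -> False
print(should_go([0, 1, 0, 1]))  # scenario 2, score 17 -> True

# Compare every possible combination of answers to the four questions.
go_count = sum(1 for answers in product([0, 1], repeat=4) if should_go(answers))
print(go_count, "of 16 combinations lead to a trip")  # 9 of 16
```

With only four questions, all 16 combinations can be checked by hand. With hundreds of questions that is no longer practical, which is where learning the weights from examples comes in.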
To make predictions, we need to create a large number of example situations, called training data, that we feed into the neural network. Each processor performs a calculation on the data it receives and generates an answer based on what it thinks to be true. The next layer then reviews the outputs from the processors in the prior layer and makes its own determination. Once the data has passed through every layer, the network produces an answer as to whether or not we should go on vacation, based on the output of each individual processor.
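To make the layer-by-layer flow concrete, here is a minimal sketch of a tiny feedforward network with one hidden layer. Every number in it is an assumption chosen by hand for illustration: the two hidden processors (loosely, a "comfort" check and a "budget" check), their weights, and their biases are hypothetical. In a real network these values would be learned from the training data rather than written down in advance.

```python
def step(x):
    """Fire (1) if the input is positive, otherwise stay quiet (0)."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """Run one layer: each processor takes a weighted sum, adds a bias, and fires or not."""
    return [
        step(sum(w * i for w, i in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Inputs, in order: morning flight, no debt, nice hotel, warm weather (1 = yes, 0 = no).
# Hidden layer: two hypothetical processors with hand-picked weights.
HIDDEN_WEIGHTS = [
    [4, 0, 6, 8],  # "comfort": flight time, hotel, weather
    [0, 9, 6, 0],  # "budget": debt, hotel
]
HIDDEN_BIASES = [-9, -8]

# Output layer: one processor that fires only when both hidden processors fire.
OUTPUT_WEIGHTS = [[1, 1]]
OUTPUT_BIASES = [-1.5]


def forward(answers):
    """Pass the answers through the hidden layer, then the output layer."""
    hidden = layer(answers, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


print(forward([1, 1, 1, 1]))  # both hidden processors fire -> 1 (go)
print(forward([1, 0, 1, 0]))  # only "comfort" fires -> 0 (stay home)
```

A negative bias plays the same role as the decision threshold in the earlier example: a processor with bias -9 fires only when its weighted sum is above 9.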
It is important to note that this is a greatly simplified analogy that doesn’t give the full picture. Furthermore, this analogy describes only one type of neural network: feedforward. There are other types of neural networks, including the recurrent neural network (RNN), which involves feedback loops. RNNs are particularly good for language modeling. And we can’t forget the ever-popular paradigm of deep neural networks (often referred to as “deep learning”), which is a name for any neural network that has more than one hidden layer of processors.
Neural networks are nothing new. As mentioned before, scientists have been working on this technology since the middle of the last century. But they have recently become extremely popular due to the increasing affordability and accessibility of large amounts of computing hardware. Additionally, toolkits such as the now-famous TensorFlow have begun to proliferate, making it fast and easy to experiment with neural networks. This technology will soon dominate the machine learning landscape and bring with it impressive gains in model accuracy and depth. Neural networks will power the next generation of thinking machines, and help deliver autonomous vehicles, errorless linguistics, and flawless machine vision.
Perhaps one day, neural networks will even be able to deliver a system capable of general intelligence on par with that exhibited by the human brain.