Neural networks are a popular topic in the Machine Learning community, but to the rest of us they're relatively unknown and somewhat unapproachable. What is a neural network? And what are they used for? If you've had exposure to linear regression, the same reasoning behind why we need that methodology can be applied to why we need neural networks. Imagine we have a hypothesis: given a list of data points, such as the number of rooms and the size of a house, can we figure out the price of similar houses? The basic premise of that problem is that each variable (number of rooms, size of house) has a fixed, one-directional relationship with the output: the variable either pushes the output up or pushes it down. With neural networks, the variables can have a dynamic relationship with the output, so neural networks are used for problems where the relationship might not be so obvious. Perhaps we want to find out if a student's test scores can be predicted from hours spent studying and hours of sleep the night before the exam. It would be difficult to model that problem linearly, so we can start to see why we might need a more advanced technique.
In this post, we'll dive into implementing a very simple neural network to show that the foundations of a neural network are fairly simple to understand. As previously mentioned, I'm completing the Machine Learning course with Professor Andrew Ng, and so we'll write in the language of Octave.
To start, I'd like to look at the graph below to see if we can identify any patterns.
If you'll notice, when X₁ and X₂ are set to the same value (e.g. in the first and last rows), the output is simply that shared value:

[0 0 0] [1 1 1]

This is what's called an AND function: the bitwise operation performed on the two values produces 1 only when both inputs are 1. We'll expand on what this means and how it contributes to building a neural network.
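As a small preview of where this is heading, a single sigmoid unit can already compute AND. The sketch below uses the classic weights from Professor Ng's course (a bias of -30 and input weights of 20 each) — these particular values are an assumption on my part, not something derived in this post:

```octave
% Sigmoid activation: squashes any input into the range (0, 1)
sigmoid = @(z) 1 ./ (1 + exp(-z));

% A single unit computing AND: bias -30, input weights 20 and 20
% (assumed values from the classic course example)
and_unit = @(x1, x2) sigmoid(-30 + 20*x1 + 20*x2);

% Evaluate on all four input combinations
for x1 = [0 1]
  for x2 = [0 1]
    printf("%d AND %d -> %.4f\n", x1, x2, and_unit(x1, x2));
  end
end
```

With these weights, the unit's output is very close to 0 for every input pair except (1, 1), where it is very close to 1 — mirroring the truth table above.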