Have you ever wondered how Facebook recognizes your face in a group photo, or how Siri understands your commands and responds accordingly? Both are made possible by artificial neural networks (ANNs). ANNs are part of the exciting field of machine learning, and they power many fascinating applications. In this tutorial, we will walk through how to build a neural network in Python from scratch. It is aimed at beginners who want to learn about artificial neural networks or explore the inner workings of machine learning algorithms.
What is a Neural Network?
A neural network is a complex set of algorithms that are designed to recognize patterns. It’s inspired by the structure of the human brain, where neurons are connected to one another to analyze information. A neural network is composed of layers, and each layer is made up of neurons. These neurons process information and pass it on to the next layer. The first layer is called the input layer, while the last layer is called the output layer. In between, there can be one or more hidden layers.
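To make this concrete, here is what a single neuron does, as a minimal sketch: it multiplies each input by a weight, adds a bias, and passes the result through an activation function (here the sigmoid, which we will define properly later; the specific numbers are made up purely for illustration):

import numpy as np

inputs = np.array([0.5, 0.8])        # two input values (illustrative)
weights = np.array([0.4, -0.2])      # one weight per input (illustrative)
bias = 0.1

weighted_sum = np.dot(inputs, weights) + bias   # 0.5*0.4 + 0.8*(-0.2) + 0.1 = 0.14
activation = 1 / (1 + np.exp(-weighted_sum))    # sigmoid squashes the result into (0, 1)
print(activation)                               # roughly 0.53

A whole network is just many of these neurons arranged in layers, with the outputs of one layer feeding the next.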
Why Python?
Python is one of the most popular languages used for machine learning, and it’s known for its easy-to-learn syntax. Python has a large community of developers who are constantly contributing to machine learning libraries. These libraries make it possible to build complex machine learning models with minimal coding. Python makes it easier to debug and test your code, as it provides an interactive interpreter. Python also supports functional and object-oriented programming paradigms, which makes it very flexible.
Building a Neural Network in Python
To build a neural network in Python, we first need to define its architecture. We need to decide how many layers our network will have and how many neurons each layer will contain. We also need to choose an activation function. In this tutorial, we will build a simple neural network with one hidden layer.
We will start by importing the necessary libraries:
import numpy as np
import matplotlib.pyplot as plt
Next, we need to define our activation function. We will be using the sigmoid activation function. Because the training step later in the tutorial also needs the derivative of the sigmoid, we give the function an optional derivative flag:
def sigmoid(x, derivative=False):
    if derivative:
        # x is assumed to already be a sigmoid output here,
        # so the derivative simplifies to x * (1 - x)
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
The sigmoid function takes an input value and returns a value between 0 and 1, and it will be applied to the output of each neuron in our network. When derivative=True, the function instead returns the sigmoid's derivative, which we will need during training.
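Here is a quick sanity check of the function we just defined; every input, no matter how large or small, is squashed into the range (0, 1):

print(sigmoid(np.array([-5.0, 0.0, 5.0])))
# prints approximately [0.0067 0.5    0.9933]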
Now, we will define our neural network architecture. We will create three layers: an input layer, a hidden layer, and an output layer. Our input layer will have two neurons, our hidden layer will have four neurons, and our output layer will have one neuron.
class NeuralNetwork:
    def __init__(self):
        # Define the architecture of the network
        self.input_layer = 2
        self.hidden_layer = 4
        self.output_layer = 1
Still inside __init__, we initialize the weights and biases for each layer with random values:
        # Initialize weights and biases for each layer
        self.weights1 = np.random.randn(self.input_layer, self.hidden_layer)
        self.bias1 = np.random.randn(self.hidden_layer)
        self.weights2 = np.random.randn(self.hidden_layer, self.output_layer)
        self.bias2 = np.random.randn(self.output_layer)
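Because the weights and biases are random, every run starts from a different point. If you want reproducible results, you can seed NumPy's random number generator before creating the network later on (42 is an arbitrary choice):

np.random.seed(42)  # any fixed integer makes the random weights repeatable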
We will also define the forward propagation method, which takes an input and produces a prediction by passing it through each layer of the network:
    def forward_propagation(self, X):
        # Pass the input through the first layer
        self.layer1 = sigmoid(np.dot(X, self.weights1) + self.bias1)
        # Pass the output of the first layer through the second layer
        self.layer2 = sigmoid(np.dot(self.layer1, self.weights2) + self.bias2)
        return self.layer2
Finally, we will define the train method, which uses backpropagation and gradient descent to update the weights and biases so that the error between the predicted output and the actual output gets smaller:
    def train(self, X, y, epochs=1000, learning_rate=0.1):
        # Iterate over the number of epochs
        for epoch in range(epochs):
            # Perform forward propagation
            output = self.forward_propagation(X)
            # Calculate the error
            error = y - output
            # Calculate the gradients of the weights and biases
            grad_weights2 = np.dot(self.layer1.T, (2 * error * sigmoid(output, derivative=True)))
            grad_bias2 = 2 * error * sigmoid(output, derivative=True)
            grad_weights1 = np.dot(X.T, (np.dot(2 * error * sigmoid(output, derivative=True), self.weights2.T) * sigmoid(self.layer1, derivative=True)))
            grad_bias1 = np.dot(2 * error * sigmoid(output, derivative=True), self.weights2.T) * sigmoid(self.layer1, derivative=True)
            # Update the weights and biases
            self.weights2 += learning_rate * grad_weights2
            self.bias2 += learning_rate * grad_bias2.sum(axis=0)
            self.weights1 += learning_rate * grad_weights1
            self.bias1 += learning_rate * grad_bias1.sum(axis=0)
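The gradient expressions above are simply the chain rule applied to the squared error (y - output)**2. If the one-liners are hard to read, the same computation can be written step by step; delta2 and delta1 are names introduced here only for clarity:

# delta2: error signal at the output layer
delta2 = 2 * error * sigmoid(output, derivative=True)
# gradient for weights2: hidden activations times the output error signal
grad_weights2 = np.dot(self.layer1.T, delta2)
# propagate the error signal back through weights2 to the hidden layer
delta1 = np.dot(delta2, self.weights2.T) * sigmoid(self.layer1, derivative=True)
# gradient for weights1: inputs times the hidden error signal
grad_weights1 = np.dot(X.T, delta1)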
Now that we have defined our neural network, let’s test it out. We will be using the XOR gate as an example. The XOR gate is a logical gate that outputs 1 only when its two inputs are different.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
# y is a column vector so its shape matches the network's output
y = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork()
nn.train(X, y, epochs=10000)  # more epochs than the default gives the network time to converge

print(nn.forward_propagation(np.array([0, 0])))  # Output should be close to 0
print(nn.forward_propagation(np.array([0, 1])))  # Output should be close to 1
print(nn.forward_propagation(np.array([1, 0])))  # Output should be close to 1
print(nn.forward_propagation(np.array([1, 1])))  # Output should be close to 0
These four calls test our network on every possible input to the XOR gate. Because the weights are initialized randomly, the exact numbers will vary from run to run, but each output should end up close to 0 or 1, matching the XOR behaviour described above.
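Since we imported matplotlib at the start, we can also visualize how training progresses. The snippet below is a small optional sketch: it trains a fresh network in short bursts, records the mean squared error after each burst, and plots the curve (losses is a name introduced here for illustration):

nn = NeuralNetwork()
losses = []
for _ in range(100):
    nn.train(X, y, epochs=100)                 # train in bursts of 100 epochs
    output = nn.forward_propagation(X)
    losses.append(np.mean((y - output) ** 2))  # mean squared error over all four inputs

plt.plot(losses)
plt.xlabel("Training checkpoints (100 epochs each)")
plt.ylabel("Mean squared error")
plt.show()

If training is working, the curve should fall toward zero as the checkpoints go by.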
In this tutorial, we have explained what a neural network is and why Python is a great programming language for machine learning. We have also shown you how to build a simple neural network from scratch in Python, using the XOR gate as an example. Now that you have learned the basics of neural networks, you can go on to discover their many applications in the real world.
If you want to learn more about Python, check out the official Python documentation for more details.