Towards Data Science

Nov 4, 2020

## How Neural Networks Solve the XOR Problem

And why hidden layers are so important.


A perceptron has the following components:

## Input Nodes

## Weights and Biases

The output calculation is straightforward.

This can be expressed like so:
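The stripped formula amounts to a weighted sum of the inputs plus a bias, passed through a step function. A minimal sketch, with hypothetical weights:

```python
import numpy as np

def step(z):
    # Heaviside step activation: 1 if z is positive, else 0
    return int(z > 0)

def perceptron_output(x, w, b):
    # Weighted sum of inputs plus bias, passed through the step function
    return step(np.dot(w, x) + b)

# Hypothetical weights: this particular choice behaves like a logical OR
w = np.array([1.0, 1.0])
b = -0.5
print(perceptron_output(np.array([0, 1]), w, b))  # 1
```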

## Activation Function

## Classification

How does a perceptron assign a class to a datapoint?

## Training algorithm

We start the training algorithm by calculating the gradient, or Δw. It's the product of:

- the value of the input node corresponding to that weight
- the difference between the actual value and the computed value
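The update rule above can be sketched as follows; the learning rate `lr` is an assumption, since the post does not state one here:

```python
def update_weights(w, b, x, target, predicted, lr=0.1):
    # Delta rule: each weight moves by lr * error * its input value
    error = target - predicted
    w = [wi + lr * error * xi for wi, xi in zip(w, x)]
    # The bias is updated like a weight whose input is always 1
    b = b + lr * error
    return w, b

# One update step: input (1, 0), target 1, model predicted 0
w, b = update_weights([0.0, 0.0], 0.0, [1, 0], target=1, predicted=0)
print(w, b)  # [0.1, 0.0] 0.1
```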

## The 2D XOR problem

In the XOR problem, we are trying to train a model to mimic a 2D XOR function.

## The XOR function

The function is defined like so:
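Restored from the standard definition (the original table was lost in extraction), the XOR truth table is:

| x1 | x2 | XOR(x1, x2) |
|----|----|-------------|
| 0  | 0  | 0           |
| 0  | 1  | 1           |
| 1  | 0  | 1           |
| 1  | 1  | 0           |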

## Attempt #1: The Single Layer Perceptron

Let's model the problem using a single layer perceptron.

The data we’ll train our model on is the table we saw for the XOR function.

## Implementation

If a datapoint is classified correctly, we increment our counter; if not, we reset the counter, update our weights and continue the algorithm.

Let’s create a perceptron object and train it on the XOR data.
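The stripped implementation can be sketched as follows. The consecutive-correct-counter stopping rule follows the description in this post, but every detail below (names, learning rate, iteration cap) is an assumption:

```python
import numpy as np

class Perceptron:
    def __init__(self, n_inputs, lr=0.1):
        self.w = np.zeros(n_inputs)
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        # Step activation over the weighted sum
        return int(np.dot(self.w, x) + self.b > 0)

    def train(self, X, y, max_iters=1000):
        correct_counter = 0
        i, iters = 0, 0
        while correct_counter < len(X) and iters < max_iters:
            x, target = X[i], y[i]
            if self.predict(x) == target:
                correct_counter += 1
            else:
                # Misclassified: reset the counter and apply the delta rule
                correct_counter = 0
                error = target - self.predict(x)
                self.w += self.lr * error * np.asarray(x, dtype=float)
                self.b += self.lr * error
            i = (i + 1) % len(X)
            iters += 1
        return correct_counter == len(X)  # True only if it converged

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_xor = [0, 1, 1, 0]
p = Perceptron(2)
print(p.train(X, y_xor))  # never converges on XOR -> False
```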

A perceptron can only converge on linearly separable data. Therefore, it isn’t capable of imitating the XOR function.

## The Need for Non-Linearity

## The 2D XOR problem — Attempt #2

We know that imitating the XOR function requires a non-linear decision boundary.

But why do we have to stick with a single decision boundary?

## The Intuition

Let’s first break down the XOR function into its AND and OR counterparts.

The XOR function on two boolean variables A and B is defined as:

Let’s add A.~A and B.~B to the equation. Since they both equate to 0, the equation remains valid.

Let’s rearrange the terms so that we can pull out A from the first part and B from the second.

Simplifying it further, we get:

Let’s call the OR section of the formula part I and the NAND section part II.
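Written out in full, the derivation above goes:

$$
\begin{aligned}
A \oplus B &= A\bar{B} + \bar{A}B \\
&= A\bar{B} + \bar{A}B + A\bar{A} + B\bar{B} \\
&= A(\bar{A} + \bar{B}) + B(\bar{A} + \bar{B}) \\
&= (A + B)(\bar{A} + \bar{B}) \\
&= \underbrace{(A + B)}_{\text{part I: OR}} \cdot \underbrace{\overline{AB}}_{\text{part II: NAND}}
\end{aligned}
$$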

## Modelling the OR part

We’ll use the same Perceptron class as before, only that we’ll train it on OR training data.

correct_counter measures the number of consecutive datapoints correctly classified by our Perceptron.

The decision boundary plot looks like this:

## Modelling the NAND part

## Bringing everything together

Two things are clear from this:

- We are performing a logical AND on the outputs of two logic gates (the first an OR, the second a NAND).
- Both functions are passed the same inputs, x1 and x2.

Let’s model this into our network. First, let’s consider our two perceptrons as black boxes.

After adding our input nodes x_1 and x_2, we can finally implement this through a simple function.

Finally, we need an AND gate, which we’ll train just as we have been.

What we now have is a model that mimics the XOR function.

If we were to implement our XOR model, it would look something like this:
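One way to sketch it, with hand-picked weights for each gate (these exact values are assumptions, not the post's learned weights):

```python
def step(z):
    return int(z > 0)

# Each gate is a single perceptron; the weights below are one
# hand-picked solution, not values learned by training.
def OR(x1, x2):
    return step(x1 + x2 - 0.5)

def NAND(x1, x2):
    return step(-x1 - x2 + 1.5)

def AND(x1, x2):
    return step(x1 + x2 - 1.5)

def XOR(x1, x2):
    # Part I (OR) and part II (NAND) feed a final AND gate
    return AND(OR(x1, x2), NAND(x1, x2))

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, XOR(a, b))  # 0, 1, 1, 0
```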

Out of all the two-input logic gates, XOR and XNOR are the only ones that are not linearly separable.

A potential decision boundary could be something like this:

## The Multi-layered Perceptron

The biggest difference? An MLP can have hidden layers.

## Hidden layers

Hidden layers are the layers of nodes that sit between the input and output layers.

Though the term MLP is sometimes reserved for networks with a single hidden layer, an MLP can in general have any number of them.

Activation functions should be differentiable, so that a network’s parameters can be updated using backpropagation.

Backpropagation is an algorithm for updating the weights and biases of a model based on the gradients of the error function with respect to them, starting from the output layer and working all the way back to the first layer.

The method of updating weights directly follows from derivation and the chain rule.
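For a single weight $w$ feeding a node, that chain-rule expansion reads:

$$
\frac{\partial E}{\partial w} = \frac{\partial E}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial w},
\qquad
w \leftarrow w - \eta \, \frac{\partial E}{\partial w}
$$

where $z$ is the weighted sum into the node, $\hat{y}$ its activation, $E$ the error, and $\eta$ the learning rate.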

## Understanding Backpropagation Algorithm

Learn the nuts and bolts of a neural network’s most important ingredient.

## Attempt #3: the Multi-layered Perceptron

The architecture of a network refers to its general structure — the number of hidden layers, the number of nodes in each layer and how these nodes are inter-connected.

The libraries used here, like NumPy and pyplot, are the same as those used in the Perceptron class.

The sigmoid activation function

Its derivative is also implemented, through the _delsigmoid function.

Let’s train our MLP with a learning rate of 0.2 over 5000 epochs.
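A minimal sketch of that training run, assuming a 2-2-1 architecture with sigmoid activations and a squared-error loss (the post's exact implementation may differ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def _delsigmoid(s):
    # Derivative of the sigmoid, expressed in terms of its output
    return s * (1.0 - s)

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# 2-2-1 architecture: 2 inputs, 2 hidden nodes, 1 output (an assumption)
W1 = rng.normal(size=(2, 2)); b1 = np.zeros((1, 2))
W2 = rng.normal(size=(2, 1)); b2 = np.zeros((1, 1))
lr = 0.2

losses = []
for epoch in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(np.mean((y - out) ** 2))

    # Backward pass (error gradients; constant factors folded into lr)
    d_out = (out - y) * _delsigmoid(out)
    d_h = (d_out @ W2.T) * _delsigmoid(h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(losses[0], losses[-1])  # the loss should drop substantially
```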

A clear non-linear decision boundary is created here with our generalized neural network, or MLP.

## Note #1: Adding more layers or nodes

## Tensorflow - Neural Network Playground

## Note #2: Choosing a loss function

## How to Choose Loss Functions When Training Deep Learning Neural Networks - Machine Learning Mastery

You’ll find the entire code from this post here.

## Polaris000/BlogCode/xorperceptron.ipynb

The sample code from this post can be found here.


## Aniruddha Karajgi

## DEV Community

## Demystifying the XOR problem

Let's explore what this XOR problem is...

## The XOR Problem

## Perceptrons

## Multilayer Perceptrons

## Backpropagation


## XOR problem with neural networks: An explanation for beginners


Let us try to understand the XOR operating logic using a truth table.


The linearly separable data points appear as shown below.

## Need for linear separability in neural networks

Example: For X1 = 0 and X2 = 0 we should get an output of 0. Let us solve it.

Solution: Considering X1 = 0 and X2 = 0:
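Since the original weight values were lost in extraction, assume unit weights w1 = w2 = 1 and a threshold of 1 (an AND-style neuron):

Y = w1·X1 + w2·X2 = 1·0 + 1·0 = 0

0 is below the threshold of 1, so the neuron outputs 0, as required.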


© Analytics India Magazine Pvt Ltd & AIM Media House LLC 2023

## Solving the XOR problem

## Logical XOR

## Defining a Neural Network

## Hidden Unit 1

## Hidden Unit 2

## Output Unit

## Calculations

Let's check if we really get the outputs of the XOR-problem with these formulas.

The first case has the inputs x1 = 0 and x2 = 0, and the output should be y = 0.

The second case has the inputs x1 = 0 and x2 = 1, and the output should be y = 1.

The third case has the inputs x1 = 1 and x2 = 0, and the output should be y = 1.

The fourth case has the inputs x1 = 1 and x2 = 1, and the output should be y = 0.

As you can see, the Neural Network generates the desired outputs.
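Since the formulas for the hidden and output units were lost in extraction, here is one concrete (hypothetical) choice of weights that reproduces all four cases:

```python
def step(z):
    return int(z > 0)

def network(x1, x2):
    h1 = step(x1 + x2 - 0.5)        # hidden unit 1: fires like OR
    h2 = step(x1 + x2 - 1.5)        # hidden unit 2: fires like AND
    return step(h1 - 2 * h2 - 0.5)  # output unit: OR but not AND

print([network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```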

## Solving the XOR problem using MLP

Exclusive or is a logical operation that outputs true when the inputs differ.

For the XOR gate, the truth table is as follows:

To separate the two outputs using linear equations, we would need to draw two separate lines, like so:

## What is the XOR problem?

Let’s call the output Y, so

Y = A1X1 + A2X2 + A3X3 + … + B

Y can also be called the weighted sum.

## How is the XOR problem solved?

Each hidden unit applies an activation function to map its output to a value between 0 and 1.
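A common such activation is the sigmoid, which squashes any real input into the interval (0, 1); a minimal sketch:

```python
import math

def sigmoid(z):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))    # 0.5
print(sigmoid(10))   # close to 1
print(sigmoid(-10))  # close to 0
```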
