We've all dealt with activation functions while working with neural nets.
- Sigmoid
- Tanh
- ReLU & Leaky ReLU
- GELU
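For reference, here's a minimal sketch of these functions in plain Python, assuming the usual textbook definitions. The `negative_slope=0.01` default for Leaky ReLU is just a common choice, and the GELU shown is the exact erf-based form rather than the tanh approximation some libraries use.

```python
import math

def sigmoid(x: float) -> float:
    # Written in two branches so large |x| does not overflow math.exp.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def tanh(x: float) -> float:
    return math.tanh(x)

def relu(x: float) -> float:
    return max(0.0, x)

def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    # negative_slope is an assumed default, not a fixed constant.
    return x if x > 0 else negative_slope * x

def gelu(x: float) -> float:
    # Exact form: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu),
                 ("leaky_relu", leaky_relu), ("gelu", gelu)]:
    print(name, [round(fn(v), 4) for v in (-2.0, 0.0, 2.0)])
```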
Ever wondered why they are so important? 🤔
Let me explain it to you in this 🧵👇

Before we proceed I want you to understand something!
You can think of a layer in a neural net as a function, & multiple layers make the network a composite function.
Now, a composite function consisting of individual linear functions is also linear.
Check this 👇
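A quick way to convince yourself: stack two linear layers and compare them with a single layer built from their product. This is only a sketch with made-up weights (`W1`, `b1`, `W2`, `b2` are hypothetical values, not from any real network), but the algebra holds for any sizes: W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2).

```python
def matvec(w, x):
    # Matrix-vector product for plain Python lists.
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def matmul(a, b):
    # Matrix-matrix product: rows of a against columns of b.
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def add(u, v):
    return [p + q for p, q in zip(u, v)]

# Hypothetical weights for two linear layers (no activation in between).
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # 2 inputs -> 3 hidden units
b1 = [0.5, -1.0, 2.0]
W2 = [[2.0, 0.0, 1.0], [-1.0, 1.0, 0.5]]     # 3 hidden units -> 2 outputs
b2 = [1.0, 0.0]

def two_layers(x):
    hidden = add(matvec(W1, x), b1)
    return add(matvec(W2, hidden), b2)

# Collapse both layers into one linear layer.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)

def one_layer(x):
    return add(matvec(W, x), b)

x = [0.3, -1.2]
print(two_layers(x))
print(one_layer(x))  # same values, up to floating-point rounding
```

So without a non-linearity between them, extra layers add no expressive power.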

We have a simple neural net that does binary classification.
Scenario 1:
- Linear decision boundary
- Linear Activation function
Observe how quickly the neural net learns & the loss converges to zero.
Watch this 👇
VIDEO
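If you can't run the video, here's a rough stand-in for Scenario 1 in plain Python. The assumptions are mine, not the playground's: a tiny hand-made dataset separated by the line x1 + x2 = 0, and a single linear layer with a sigmoid on the output only to turn the score into a probability (a net with linear activations collapses to exactly this, and the decision boundary stays a straight line).

```python
import math

# Hypothetical toy data: label is 1 when x1 + x2 > 0 (linearly separable).
points = [(-2.0, -1.0), (-1.0, -2.0), (-1.5, -0.5), (-0.5, -1.5),
          (2.0, 1.0), (1.0, 2.0), (1.5, 0.5), (0.5, 1.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    # Linear score squashed to a probability; the boundary w.x + b = 0 is a line.
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

def log_loss(w, b, eps=1e-12):
    total = 0.0
    for x, y in zip(points, labels):
        p = predict(w, b, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(points)

def train(steps=200, lr=0.5):
    # Plain full-batch gradient descent on the log loss.
    w, b = [0.0, 0.0], 0.0
    n = len(points)
    for _ in range(steps):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(points, labels):
            err = predict(w, b, x) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b

initial_loss = log_loss([0.0, 0.0], 0.0)
w, b = train()
final_loss = log_loss(w, b)
accuracy = sum((predict(w, b, x) > 0.5) == bool(y)
               for x, y in zip(points, labels)) / len(points)
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}, accuracy: {accuracy:.2f}")
```

The loss keeps shrinking as training continues, because a straight line is all this data needs.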
Scenario 2:
- Non-linear decision boundary
- Linear Activation function
Observe how the neural net struggles to learn & the loss consistently remains high!
With linear activations it's unable to create a non-linear decision boundary.
Watch this 👇
VIDEO
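To see the same failure without the playground, here's a small sketch using XOR as the non-linear problem (XOR is my stand-in, not the dataset from the video). It tries a grid of straight-line boundaries and reports the best accuracy any of them reaches. A grid search is not a proof, but it matches the known result that XOR is not linearly separable.

```python
from itertools import product

# XOR: the label is 1 when exactly one input is 1.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def accuracy(w1, w2, b):
    # Classify with a single straight line: predict 1 when w1*x1 + w2*x2 + b > 0.
    correct = sum((w1 * x1 + w2 * x2 + b > 0) == bool(y) for (x1, x2), y in XOR)
    return correct / len(XOR)

# Candidate weights and biases from -5 to 5 in steps of 0.25 (an arbitrary grid).
grid = [i / 4 for i in range(-20, 21)]
best = max(accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print("best accuracy with a linear boundary:", best)
```

No line gets all four points right: at least one point always ends up on the wrong side.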
Scenario 3:
- Non-linear decision boundary
- Non-linear Activation function (Sigmoid)
Observe how the neural net performs well this time.
With a non-linear activation function we give the network the ability to create a non-linear decision boundary.
Watch this 👇
VIDEO
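Here's a sketch of why the sigmoid helps, again using XOR as a stand-in for a non-linear problem. The weights below are hand-picked for illustration (not trained, and not taken from the video): one hidden unit behaves like OR, the other like AND, and the output fires for "OR but not AND".

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def tiny_net(x1, x2):
    # Hidden layer: two sigmoid units with hypothetical, hand-set weights.
    h_or = sigmoid(20.0 * x1 + 20.0 * x2 - 10.0)   # close to 1 if either input is 1
    h_and = sigmoid(20.0 * x1 + 20.0 * x2 - 30.0)  # close to 1 only if both are 1
    # Output: high when h_or is on and h_and is off.
    return sigmoid(20.0 * h_or - 20.0 * h_and - 10.0)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
predictions = [int(tiny_net(x1, x2) > 0.5) for x1, x2 in inputs]
print(predictions)
```

Swap the hidden sigmoids for the identity function and the whole thing collapses back to a single line, which cannot separate these points.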
Now we understand why activation functions are important.
Next time we'll see why we need different flavours of these non-linear activation functions.
What are the advantages of one over another?
You can play around like I did in the videos here 👇
playground.tensorflow.org
That's a wrap!
If you're interested in:
- Python
- Data Science
- Machine Learning 🤖
- Maths for ML 🧮
- MLOps
- NLP 🗣
- Computer Vision 🎥
- LLMs 🧠
I'm sharing daily content over here, follow me → @akshay_pachaar if you haven't already!!
Cheers!!
