Hello everyone, welcome to The Semicolon. In this tutorial we're going to learn about activation functions. According to Wikipedia, an activation function maps a set of inputs to a particular output. Activation functions are used to contain the output between 0 and 1 (or some other range), they impart non-linearity, and they are one of the important factors affecting the results and accuracy of your model, so it is very important that we learn about them.

These are some famous activation functions: the identity function, the binary step function, the logistic or sigmoid function (which we saw in the last tutorial), tanh, arctan, ReLU, leaky ReLU, and softmax. We'll look at what each of these does.

The identity function is as simple as this: if x is your input, it gives you x. There's nothing big in it; whatever your input curve is, you get back the exact same curve.

The binary step function gives you 1 if your input is greater than 0 and 0 if your input is less than 0. So it takes all the positive inputs and makes them 1, and all the negative inputs and makes them 0. This is very useful in classifiers, when you want to classify between 1 and 0.

Then we have the logistic or sigmoid function, f(x) = 1 / (1 + e^(-x)). Whatever your input is, it maps it between 0 and 1, so even if your input is as large as thousands or millions, it gets mapped between 0 and 1. That is very useful in neural networks, because as the input moves away from 0 the activations can start to grow exponentially, which may be a problem; the sigmoid contains them between 0 and 1.

Next is the tanh activation function, tanh(x) = 2 / (1 + e^(-2x)) - 1, which contains the output between -1 and 1 and is useful in a similar way to the sigmoid. You can try out both tanh and sigmoid, check which gives the better accuracy, and choose accordingly.

The arctan function is just the inverse tangent of x, so it contains the output between -π/2 and +π/2. It too is a replaceable alternative to sigmoid or tanh; it does a similar task.

Now we have ReLU, which is very popular when it comes to deep learning and even ordinary neural networks. Whenever your input is less than 0 it gives you 0, and whenever it is greater than 0 it stays as it is, so it removes the negative part of your input. Leaky ReLU does a similar job, but instead of making the negative inputs 0 it just reduces their magnitude.

And then we have the softmax function, which is used to impart probabilities. When you have 4 or 5 outputs and you pass them through softmax, you get a probability distribution over them. This is useful for finding the most probable occurrence, that is, the classification where the probability of a class is maximum.

So these are some famous activation functions which are very useful and play an important role in deciding the accuracy of your model. Whenever you suspect that a particular activation function might be better, it's always best to try it and test the accuracy, because you never know which activation function will give you the best result. That's it for now, guys; if this tutorial helped you, hit the like button.
Click on the subscribe button if you want to keep watching. Thank you.
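
For reference, here is a minimal NumPy sketch of the functions covered above. The function names, the test vector, and the leaky-ReLU slope of 0.01 are illustrative choices, not something specified in the video.

```python
import numpy as np

def identity(x):
    # Identity: output equals input.
    return x

def binary_step(x):
    # 1 for positive inputs, 0 otherwise.
    return np.where(x > 0, 1.0, 0.0)

def sigmoid(x):
    # Squashes any real input into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes any real input into (-1, 1); equivalent to np.tanh(x).
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def arctan(x):
    # Squashes any real input into (-pi/2, pi/2).
    return np.arctan(x)

def relu(x):
    # Zeroes out negative inputs, keeps positive ones unchanged.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Shrinks negative inputs by a small slope instead of zeroing them.
    # alpha=0.01 is a common choice, used here as an assumption.
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    # Turns a vector of scores into a probability distribution.
    # Subtracting the max first keeps exp() from overflowing.
    e = np.exp(x - np.max(x))
    return e / e.sum()

if __name__ == "__main__":
    x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
    print("sigmoid:   ", sigmoid(x))      # all values in (0, 1)
    print("tanh:      ", tanh(x))         # all values in (-1, 1)
    print("relu:      ", relu(x))         # negatives become 0
    print("leaky_relu:", leaky_relu(x))   # negatives shrunk, not zeroed
    print("softmax:   ", softmax(x), "sum =", softmax(x).sum())  # sums to 1
```

Running the demo shows the softmax outputs summing to 1, which is what makes it suitable for picking the most probable class.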

Good, keep posting these videos.

Very helpful! Thanks a lot sir!

But I still don't get the different uses of each activation function. When is one function preferred over another? Does it depend on the application? And how did they choose these functions? What's the logic behind using these functions as activation functions? But great video!

sigmoid is only giving me numbers between 0.5 and 1.0

I'm doing f(x) = 1 / (1 + EulersNumber^-x)

am I doing something wrong?

Thank you so much for making this video. This is the most clear (and concise) video I have found on the topic!

Thank you!

Very helpful! But still, I am not able to get the exact mathematical intuition behind the fact that "activation functions introduce non-linearities into the network".

Short and clear coverage of the principal activation functions, thanks, very helpful!

These videos are so good, I don't want other people to see them!

Thank you so much brother!!

Can you provide an example of the softmax function?

Nice. Can you show us how to use it?

Hi, I am confused. Does ReLU kill the neuron only during the forward pass, or also during the backward pass?

Thank you very much for the fastest video along with all the useful formulae. Indeed helpful!

You are from India.

Thanks, Semicolon, for the succinct definitions, but it would have been great to explain which activation function to use when, and why we would use sigmoid instead of tanh, let's say…

Neat and clean, great (y)

Nice!

Thank you!

Thank you! Great video, easy to understand!

Paper tomorrow. Needed this!

You lost me at Wikipedia.

Awesome explanation! I have a small doubt: which activation functions are differentiable for all real values of the given input?

Very concise explanation.

Thank you!

Excellent Video!