All activation functions are used to introduce non-linearity into the network, but depending on the network, some functions can also mitigate other problems and make the learning process more stable.
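To see why the non-linearity matters, here is a minimal NumPy sketch (the layer sizes and random weights are purely illustrative): stacking two linear layers without an activation collapses to a single linear map, while putting a ReLU between them does not.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))    # batch of 4 inputs with 3 features
W1 = rng.normal(size=(3, 5))   # first linear layer
W2 = rng.normal(size=(5, 2))   # second linear layer

# Two stacked linear layers are equivalent to one linear layer with weights W1 @ W2.
linear_stack = (x @ W1) @ W2
collapsed = x @ (W1 @ W2)
print(np.allclose(linear_stack, collapsed))  # True: no extra expressive power

# With a ReLU in between, the composition is no longer a single linear map.
relu_stack = np.maximum(0, x @ W1) @ W2
print(np.allclose(relu_stack, collapsed))    # False (in general)
```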

There are many activation functions, but the most popular are the following:

- Sigmoid
- Tanh
- ReLU
- Leaky ReLU
- ELU
- Maxout
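As a quick reference, a minimal NumPy sketch of how these functions are usually defined (Maxout is shown with two linear pieces, which is one common choice):

```python
import numpy as np

def sigmoid(x):
    # Squashes inputs to (0, 1); saturates for large |x|.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centered squashing to (-1, 1).
    return np.tanh(x)

def relu(x):
    # Keeps positive values, zeroes out negatives.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but lets a small slope through for negative inputs.
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # Smooth negative branch that saturates at -alpha.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def maxout(x, W1, b1, W2, b2):
    # Element-wise max over two learned linear functions of the input.
    return np.maximum(x @ W1 + b1, x @ W2 + b2)
```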

Note

The rule of thumb is to start with ReLU and see how it goes, then try one of its variants (including Maxout) to squeeze out marginal gains (see the sketch below). Don't use sigmoid or tanh!
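A hedged PyTorch sketch of how that rule of thumb might look in practice (the layer sizes and the `activation` argument are illustrative assumptions, not a fixed recipe): build the same network with `nn.ReLU` first, then swap in a variant such as `nn.LeakyReLU` and compare validation metrics.

```python
import torch.nn as nn

def make_mlp(activation=nn.ReLU):
    # Same architecture; the activation is passed in so variants can be compared.
    return nn.Sequential(
        nn.Linear(784, 256),
        activation(),
        nn.Linear(256, 256),
        activation(),
        nn.Linear(256, 10),
    )

baseline = make_mlp(nn.ReLU)       # start here
variant = make_mlp(nn.LeakyReLU)   # then try a ReLU variant and compare results
```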


tags:#ai/deep-learning