In this tutorial, we'll learn about some of the most commonly used activation functions in neural networks, such as sigmoid, tanh, ReLU, and Leaky ReLU, and how to implement them with Keras in Python. The tutorial covers:
- Sigmoid function
- Tanh function
- ReLU (Rectified Linear Unit) function
- Leaky ReLU function
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Activation, Dense, LeakyReLU
To check the behavior of each activation function, we'll use a generated sequence of x values.
x = np.arange(-5, 5, 0.1)
print(x[1:10])
[-4.9 -4.8 -4.7 -4.6 -4.5 -4.4 -4.3 -4.2 -4.1]
Sigmoid function
The sigmoid function transforms the input value into an output in the range between 0 and 1. It is also called the logistic function, and its curve is S-shaped. It is commonly used for making the final decision in the output layer of a binary classification network.
Let's define the function in Python.
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
Next, we'll draw the function in a plot.
y = [sigmoid(i) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0.5, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()
Sigmoid can be implemented in a Keras model as shown below. There are two ways to add the activation to the model; you can use either of them.
model = Sequential()
...
model.add(Dense(1, activation="sigmoid"))
model.add(Activation('sigmoid'))
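As a fuller illustration (not part of the original post), below is a minimal sketch of a binary classifier that ends with a sigmoid output; the input dimension, hidden layer size, and optimizer are illustrative assumptions.

# A minimal binary classifier ending with a sigmoid output layer.
# The input dimension (5) and hidden size (10) are illustrative assumptions.
model = Sequential()
model.add(Dense(10, activation="relu", input_shape=(5,)))
model.add(Dense(1))
model.add(Activation("sigmoid"))   # same effect as Dense(1, activation="sigmoid")
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()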
Tanh function
The tanh (hyperbolic tangent) function scales data to the range from -1 to 1 and centers the output around 0. It is similar to the sigmoid, and its curve is also S-shaped. We'll define the function in Python.
def tanh(x):
    return np.tanh(x)
And draw the function in a plot.
y = [tanh(i) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()
Tanh can be implemented in a Keras model as shown below.
model = Sequential()
...
model.add(Dense(10, activation="tanh"))
model.add(Activation('tanh'))
ReLU function
ReLU stands for Rectified Linear Unit, and it is one of the most commonly used activation functions in neural networks today. The function outputs 0 if the input value is negative and keeps the value unchanged if it is positive. ReLU is cheaper and faster to compute than the sigmoid and tanh functions.
We'll define the function in Python.
def relu(x):
    return np.maximum(0, x)
And draw the function in a plot.
y = [relu(i) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()
ReLU can be implemented in a Keras model as shown below.
model = Sequential()
...
model.add(Dense(10, activation="relu"))
model.add(Activation('relu'))
Leaky ReLU function
The ReLU function converts all negative inputs to 0, which can cause neurons to "die" during training. The Leaky ReLU prevents this by allowing a small, non-zero output for negative inputs: values less than 0 are multiplied by a small slope, alpha. This can improve the training of the network. We'll define the function in Python; here, alpha is the factor used to multiply negative input values.
def leakyrelu(x, alpha=0.01):
    if x > 0:
        return x
    return x * alpha
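The scalar version above works because we apply it to one value at a time in a list comprehension. As an optional alternative (not from the original code), a vectorized version could use np.where so it also accepts NumPy arrays:

# Vectorized Leaky ReLU: works on scalars and whole NumPy arrays alike.
def leakyrelu_vec(x, alpha=0.01):
    return np.where(x > 0, x, x * alpha)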
And draw the function in a plot.
y = [leakyrelu(i, alpha=0.1) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()
The Leaky ReLU can be implemented in a Keras model as shown below.
model = Sequential()
...
model.add(LeakyReLU(0.2))
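Since LeakyReLU is used here as a separate layer, it is typically placed right after a Dense layer that has no activation of its own. Below is a minimal sketch (the layer sizes and input shape are illustrative assumptions):

# LeakyReLU is added as its own layer after a linear Dense layer.
model = Sequential()
model.add(Dense(10, input_shape=(5,)))     # no activation here
model.add(LeakyReLU(0.2))                  # Leaky ReLU with a slope of 0.2
model.add(Dense(1, activation="sigmoid"))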
Let's run a small simulation with the functions above. Here, we'll generate sample input, weight, and bias values and check the output of each activation function.
input = np.random.choice(range(-100, 100), 5)
weight = np.random.randn(5)/10
bias = np.random.randn(5)
actf = [sigmoid, tanh, relu, leakyrelu]

for f in actf:
    print(f.__name__, " function:")
    for i, w, b in zip(input, weight, bias):
        print("%.1f" % i, " => ", f((i*w) + b))
sigmoid function:
-11.0 => 0.8182085255905602
31.0 => 0.1681421926415923
44.0 => 0.8427583522921182
-58.0 => 0.60535247393778
-34.0 => 0.6713054653476185
tanh function:
-11.0 => 0.905914553687992
31.0 => -0.9214954932658087
44.0 => 0.9327182035841355
-58.0 => 0.40349605295020774
-34.0 => 0.6132385416839997
relu function:
-11.0 => 1.504256940169569
31.0 => 0.0
44.0 => 1.6788964853893629
-58.0 => 0.4278178624878862
-34.0 => 0.7140954189784394
leakyrelu function:
-11.0 => 1.504256940169569
31.0 => -0.015988515154050215
44.0 => 1.6788964853893629
-58.0 => 0.4278178624878862
-34.0 => 0.7140954189784394
From the output above, we can compare how each function scales the same weighted inputs into its own output range.
With multiple inputs, the inputs and weights are multiplied and summed (a dot product), and then the bias is added. Below, we apply each activation function to this weighted sum, using bias[1] as the bias term.
for f in actf:
    print(f.__name__, f(np.dot(input, weight) + bias[1]))
sigmoid 0.564087924191638
tanh 0.2522081568246547
relu 0.2577695712819361
leakyrelu 0.2577695712819361
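To connect this NumPy simulation back to Keras (an optional sketch, not part of the original post), the same weighted sum followed by an activation can be reproduced with a single Dense unit whose weights we set by hand; the reshaping and dtype choices here are assumptions.

# One Dense unit computes f(np.dot(input, weight) + bias), just like above.
x_in = np.array(input, dtype="float32").reshape(1, 5)   # one sample, 5 features
w = weight.reshape(5, 1).astype("float32")
b = np.array([bias[1]], dtype="float32")

dense_model = Sequential()
dense_model.add(Dense(1, activation="sigmoid"))
dense_model.build(input_shape=(None, 5))
dense_model.set_weights([w, b])

print(dense_model.predict(x_in, verbose=0))         # Keras result
print(sigmoid(np.dot(input, weight) + bias[1]))     # NumPy result (should match)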
In this tutorial, we've briefly learned about common activation functions and how to implement them in a Keras model.
Source code listing
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0, x)

def leakyrelu(x, alpha=0.01):
    if x > 0:
        return x
    return x * alpha

x = np.arange(-5, 5, 0.1)
print(x[1:10])

y = [sigmoid(i) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0.5, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()

y = [tanh(i) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()

y = [relu(i) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()

y = [leakyrelu(i, alpha=0.1) for i in x]
plt.axvline(x=0, color="red", linewidth=.5)
plt.axhline(y=0, color="red", linewidth=.5)
plt.plot(x, y)
plt.show()

weight = np.random.randn(5)/10
bias = np.random.randn(5)
input = np.random.choice(range(-100, 100), 5)

actf = [sigmoid, tanh, relu, leakyrelu]

for f in actf:
    print(f.__name__, " function:")
    for i, w, b in zip(input, weight, bias):
        print("%.1f" % i, " => ", f((i*w) + b))

for f in actf:
    print(f.__name__, f(np.dot(input, weight) + bias[1]))