The encoder compresses the input data into a latent vector in the middle layer, and the decoder reconstructs the data from that latent vector. The latent vector is therefore a compressed representation of the original data from which the decoder can restore new data. The autoencoder ties the encoder and decoder together so that both are trained jointly. In this tutorial, we'll briefly learn how to build a simple autoencoder with the Keras API in R. The tutorial covers:
- Preparing the data
- Defining Encoder and Decoder
- Defining Autoencoder
- Generating with Autoencoder
- Source code listing
library(keras)
Preparing the data
We'll use the MNIST handwritten digit dataset to train the autoencoder. First, we'll load the data and prepare it with a few changes. An autoencoder requires only input data, so we focus on the x part of the dataset and scale it into the range of [0, 1].
c(c(xtrain, ytrain), c(xtest, ytest)) %<-% dataset_mnist()

xtrain = xtrain/255
xtest = xtest/255
Next, we'll define the input size, which comes from the image dimensions, and the latent vector size; both will be used in the model later.
input_size = dim(xtrain)[2]*dim(xtrain)[3]
latent_size = 10

print(input_size)
[1] 784
We'll reshape the input data using the input size defined above.
x_train = array_reshape(xtrain, dim=c(dim(xtrain)[1], input_size))
x_test = array_reshape(xtest, dim=c(dim(xtest)[1], input_size))
print(dim(x_train)) [1] 60000 784
print(dim(x_test)) [1] 10000 784
Defining Encoder and Decoder
We'll define the encoder, starting from the input layer. The encoder contains dense layers and leaky ReLU activations. The last layer defines the latent vector size.
enc_input = layer_input(shape = input_size)
enc_output = enc_input %>% 
  layer_dense(units=256, activation = "relu") %>% 
  layer_activation_leaky_relu() %>% 
  layer_dense(units=latent_size) %>% 
  layer_activation_leaky_relu()

encoder = keras_model(enc_input, enc_output)
summary(encoder)
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_11 (InputLayer)               (None, 784)                     0           
________________________________________________________________________________
dense_15 (Dense)                    (None, 256)                     200960      
________________________________________________________________________________
leaky_re_lu_15 (LeakyReLU)          (None, 256)                     0           
________________________________________________________________________________
dense_16 (Dense)                    (None, 10)                      2570        
________________________________________________________________________________
leaky_re_lu_16 (LeakyReLU)          (None, 10)                      0           
================================================================================
Total params: 203,530
Trainable params: 203,530
Non-trainable params: 0
________________________________________________________________________________
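As a quick sanity check (not part of the original tutorial), you can run a few training images through the encoder and confirm that it returns 10-dimensional latent vectors:

# Sketch (assumption): inspect the latent representation of a few training images
latent_sample = encoder %>% predict(x_train[1:5, ])
print(dim(latent_sample))   # expected: 5 10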
Next, we'll define the decoder, starting from its input layer. The decoder also contains a similar dense layer and leaky ReLU activations. The last layer outputs a vector of the input size with a sigmoid activation.
dec_input = layer_input(shape = latent_size)
dec_output = dec_input %>% 
  layer_dense(units=256, activation = "relu") %>% 
  layer_activation_leaky_relu() %>% 
  layer_dense(units = input_size, activation = "sigmoid") %>% 
  layer_activation_leaky_relu()

decoder = keras_model(dec_input, dec_output)
summary(decoder)
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_10 (InputLayer)               (None, 10)                      0           
________________________________________________________________________________
dense_13 (Dense)                    (None, 256)                     2816        
________________________________________________________________________________
leaky_re_lu_13 (LeakyReLU)          (None, 256)                     0           
________________________________________________________________________________
dense_14 (Dense)                    (None, 784)                     201488      
________________________________________________________________________________
leaky_re_lu_14 (LeakyReLU)          (None, 784)                     0           
================================================================================
Total params: 204,304
Trainable params: 204,304
Non-trainable params: 0
________________________________________________________________________________
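The decoder maps a 10-dimensional vector back to a 784-dimensional output. As a small sketch (not in the original tutorial), you can feed it a random latent vector to confirm the shape of the mapping; before training this will only produce noise:

# Sketch (assumption): decode a random latent vector to check the 10 -> 784 mapping
z = matrix(runif(latent_size), nrow = 1)
generated = decoder %>% predict(z)
print(dim(generated))   # expected: 1 784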
Defining Autoencoder
Finally, we'll define the autoencoder. It combines the encoder and decoder with an additional input layer.
aen_input = layer_input(shape = input_size)
aen_output = aen_input %>% 
  encoder() %>% 
  decoder()

aen = keras_model(aen_input, aen_output)
summary(aen)
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_12 (InputLayer)               (None, 784)                     0           
________________________________________________________________________________
model_11 (Model)                    (None, 10)                      203530      
________________________________________________________________________________
model_10 (Model)                    (None, 784)                     204304      
================================================================================
Total params: 407,834
Trainable params: 407,834
Non-trainable params: 0
________________________________________________________________________________
Here you can see that a vector of size 784 is compressed to 10 and then expanded back to 784 again. We'll compile the model with the RMSprop optimizer and binary cross-entropy loss, then train it on the x training data.
aen %>% compile(optimizer="rmsprop", loss="binary_crossentropy")

aen %>% fit(x_train, x_train, epochs=50, batch_size=256)
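If you also want to track reconstruction loss on unseen data while training, a possible variation (not used in the original run) is to pass the test set as validation data to fit():

# Possible variation (assumption): monitor reconstruction loss on the test set
aen %>% fit(x_train, x_train,
            epochs = 50, batch_size = 256,
            validation_data = list(x_test, x_test))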
Generating with Autoencoder
After the training, we can reconstruct the x test data with the encoder and decoder. First, we'll encode the data, then decode it.
encoded_imgs = encoder %>% predict(x_test)
decoded_imgs = decoder %>% predict(encoded_imgs)
We need to reshape the decoded data back to image dimensions.
pred_images = array_reshape(decoded_imgs, dim=c(dim(decoded_imgs)[1], 28, 28))
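To quantify reconstruction quality beyond visual inspection, a simple sketch (not part of the original tutorial) is to compute the per-image mean squared error between the inputs and their reconstructions:

# Sketch (assumption): per-image reconstruction error on the test set
recon_mse = rowMeans((x_test - decoded_imgs)^2)
print(summary(recon_mse))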
Finally, we'll visualize the original x test images and the reconstructed images in a plot.
n = 10
op = par(mfrow=c(12,2), mar=c(1,0,0,0))
for (i in 1:n) 
{
  plot(as.raster(pred_images[i,,]))
  plot(as.raster(xtest[i,,]))
}
In this plot, the digits in the left column are the reconstructed images and the digits in the right column are the original images. You can change the arrangement by adjusting the plot settings.
In this tutorial, we've briefly learned how to build a simple autoencoder with Keras in R. The full source code is listed below.
Source code listing
library(keras)
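# Load MNIST and scale pixel values into [0, 1]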
c(c(xtrain, ytrain), c(xtest, ytest)) %<-% dataset_mnist() xtrain = xtrain/255 xtest = xtest/255
input_size = dim(xtrain)[2]*dim(xtrain)[3]
latent_size = 10
print(input_size)
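# Flatten the 28x28 images into vectors of length input_size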
x_train = array_reshape(xtrain, dim=c(dim(xtrain)[1], input_size))
x_test = array_reshape(xtest, dim=c(dim(xtest)[1], input_size))
print(dim(x_train))
print(dim(x_test))
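# Encoder: 784 -> 256 -> 10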
enc_input = layer_input(shape = input_size)
enc_output = enc_input %>%
layer_dense(units=256, activation = "relu") %>%
layer_activation_leaky_relu() %>%
layer_dense(units=latent_size) %>%
layer_activation_leaky_relu()
encoder = keras_model(enc_input, enc_output)
summary(encoder)
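# Decoder: 10 -> 256 -> 784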
dec_input = layer_input(shape = latent_size)
dec_output = dec_input %>%
layer_dense(units=256, activation = "relu") %>%
layer_activation_leaky_relu() %>%
layer_dense(units = input_size, activation = "sigmoid") %>%
layer_activation_leaky_relu()
decoder = keras_model(dec_input, dec_output)
summary(decoder)
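# Autoencoder: encoder and decoder chained end to end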
aen_input = layer_input(shape = input_size)
aen_output = aen_input %>%
encoder() %>%
decoder()
aen = keras_model(aen_input, aen_output)
summary(aen)
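# Compile and train the autoencoder to reconstruct its input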
aen %>% compile(optimizer="rmsprop", loss="binary_crossentropy")
aen %>% fit(x_train,x_train, epochs=50, batch_size=256)
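# Encode and decode the test images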
encoded_imgs = encoder %>% predict(x_test)
decoded_imgs = decoder %>% predict(encoded_imgs)
pred_images = array_reshape(decoded_imgs, dim=c(dim(decoded_imgs)[1], 28, 28))
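# Plot reconstructed (left column) and original (right column) digits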
n = 10
op = par(mfrow=c(12,2), mar=c(1,0,0,0))
for (i in 1:n)
{
plot(as.raster(pred_images[i,,]))
plot(as.raster(xtest[i,,]))
}