How to Build Simple Autoencoder with Keras in R

   An autoencoder learns to compress the given data and to reconstruct it from that compressed form. The reconstruction is data-specific and lossy: it works well only on data similar to what the model was trained on. The model is built from three parts: an encoder, a decoder, and the combined autoencoder that joins them. All of these parts are based on neural network layers.
   The encoder compresses the input into a middle layer called the latent vector. The decoder decompresses the data from the latent vector; in other words, the latent vector is a compressed representation of the original data from which the decoder can restore an approximation of it. The combined autoencoder model trains the encoder and decoder together. In this tutorial, we'll briefly learn how to build a simple autoencoder with the Keras API in R. The tutorial covers:
  1. Preparing the data
  2. Defining Encoder and Decoder
  3. Defining Autoencoder
  4. Generating with Autoencoder
  5. Source code listing
   We'll start by loading the required keras package in R. Note that this tutorial uses the R interface to the Keras API; RStudio is a convenient, though not required, environment for running it.

library(keras)
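
If the keras package is not available yet, a one-time setup is needed first; this is a minimal sketch assuming default settings, where install_keras() also provisions the underlying TensorFlow backend:

install.packages("keras")
keras::install_keras()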


Preparing the data

   We'll use the MNIST handwritten digit dataset to train the autoencoder. First, we'll load it and prepare it with a few changes. An autoencoder requires only the input data, so we'll focus on the x part of the dataset and ignore the labels. We'll scale the pixel values into the range of [0, 1].

c(c(xtrain, ytrain), c(xtest, ytest)) %<-% dataset_mnist()
xtrain = xtrain/255
xtest = xtest/255
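
As a quick sanity check (an addition, not part of the original output), we can confirm the shapes and the scaled value range; MNIST contains 60,000 training and 10,000 test images of 28 x 28 pixels:

dim(xtrain)    # 60000 28 28
range(xtrain)  # 0 1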

Next, we'll define the input size, which comes from the image dimensions, and the latent vector size to use in the model later.

input_size = dim(xtrain)[2]*dim(xtrain)[3]
latent_size = 10
print(input_size)
[1] 784

We'll reshape the input data into flat vectors by using the input size above.

x_train = array_reshape(xtrain, dim=c(dim(xtrain)[1], input_size))
x_test = array_reshape(xtest, dim=c(dim(xtest)[1], input_size))
 
print(dim(x_train))
[1] 60000   784
 
print(dim(x_test))
[1] 10000   784


Defining Encoder and Decoder

   We'll define the encoder starting from the input layer. The encoder contains dense layers followed by leaky ReLU activations. We leave the dense layers' built-in activation at its linear default, because applying a plain ReLU before a leaky ReLU would zero out the negative values that the leaky ReLU is meant to pass through. The last layer outputs the latent vector.

enc_input = layer_input(shape = input_size)
enc_output = enc_input %>% 
  layer_dense(units = 256) %>% 
  layer_activation_leaky_relu() %>% 
  layer_dense(units = latent_size) %>% 
  layer_activation_leaky_relu()

encoder = keras_model(enc_input, enc_output)
summary(encoder)
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_11 (InputLayer)               (None, 784)                     0           
________________________________________________________________________________
dense_15 (Dense)                    (None, 256)                     200960      
________________________________________________________________________________
leaky_re_lu_15 (LeakyReLU)          (None, 256)                     0           
________________________________________________________________________________
dense_16 (Dense)                    (None, 10)                      2570        
________________________________________________________________________________
leaky_re_lu_16 (LeakyReLU)          (None, 10)                      0           
================================================================================
Total params: 203,530
Trainable params: 203,530
Non-trainable params: 0
________________________________________________________________________________


   Next, we'll define the decoder starting from its own input layer, which takes the latent vector. The decoder mirrors the encoder: a dense layer followed by a leaky ReLU activation. The output layer maps back to the input size with a sigmoid activation, which keeps the reconstructed pixel values in the [0, 1] range; no further activation is needed after it.

dec_input = layer_input(shape = latent_size)
dec_output = dec_input %>% 
  layer_dense(units = 256) %>% 
  layer_activation_leaky_relu() %>% 
  layer_dense(units = input_size, activation = "sigmoid")

decoder = keras_model(dec_input, dec_output)
 
summary(decoder)
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_10 (InputLayer)               (None, 10)                      0           
________________________________________________________________________________
dense_13 (Dense)                    (None, 256)                     2816        
________________________________________________________________________________
leaky_re_lu_13 (LeakyReLU)          (None, 256)                     0           
________________________________________________________________________________
dense_14 (Dense)                    (None, 784)                     201488      
================================================================================
Total params: 204,304
Trainable params: 204,304
Non-trainable params: 0
________________________________________________________________________________


Defining Autoencoder

   Finally, we'll define the autoencoder. It chains the encoder and the decoder together behind an additional input layer.

aen_input = layer_input(shape = input_size)
aen_output = aen_input %>% 
  encoder() %>% 
  decoder()
   
aen = keras_model(aen_input, aen_output)
summary(aen)
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_12 (InputLayer)               (None, 784)                     0           
________________________________________________________________________________
model_11 (Model)                    (None, 10)                      203530      
________________________________________________________________________________
model_10 (Model)                    (None, 784)                     204304      
================================================================================
Total params: 407,834
Trainable params: 407,834
Non-trainable params: 0
________________________________________________________________________________

Here you can see that a vector of size 784 is compressed down to 10 and expanded back to 784 again. We'll compile the model with the RMSprop optimizer and binary cross-entropy loss, then train it on the x training data, using the inputs themselves as the targets.

aen %>% compile(optimizer="rmsprop", loss="binary_crossentropy")

aen %>% fit(x_train,x_train, epochs=50, batch_size=256)
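
As an optional check (an addition to the original steps), we can evaluate the reconstruction loss on the held-out test data:

aen %>% evaluate(x_test, x_test)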
 

Generating with Autoencoder

   After the training, we can reconstruct the x test data with the encoder and decoder. First, we'll encode the images into latent vectors, then decode them back into images.

encoded_imgs = encoder %>% predict(x_test)
decoded_imgs = decoder %>% predict(encoded_imgs)
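
Running the encoder and decoder separately lets us inspect the latent vectors as well. If only the reconstructions are needed, the combined model produces the same result in a single call (decoded_imgs_aen is just an illustrative name):

decoded_imgs_aen = aen %>% predict(x_test)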

We need to reshape the decoded data back into 28 x 28 images.

pred_images = array_reshape(decoded_imgs, dim=c(dim(decoded_imgs)[1], 28, 28))

Finally, we'll visualize the original x test images and the reconstructed images in a plot. 

n = 10
op = par(mfrow = c(n, 2), mar = c(1, 0, 0, 0))
for (i in 1:n) 
{
  plot(as.raster(pred_images[i,,]))  # reconstructed image (left column)
  plot(as.raster(xtest[i,,]))        # original image (right column)
}
par(op)  # restore the previous plotting parameters


In this plot, the digits on the left side are the reconstructed images and the digits on the right side are the original digits. You can change this order by swapping the two plot() calls inside the loop.

   In this tutorial, we've briefly learned how to build a simple autoencoder with Keras in R. The full source code is listed below.
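
If you want to reuse the trained autoencoder later, it can be saved to disk and loaded back. This step is an addition to the original tutorial, and the file name is only an example:

save_model_hdf5(aen, "autoencoder.h5")
# aen = load_model_hdf5("autoencoder.h5")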


Source code listing

library(keras)


c(c(xtrain, ytrain), c(xtest, ytest)) %<-% dataset_mnist()
xtrain = xtrain/255
xtest = xtest/255 
 
input_size = dim(xtrain)[2]*dim(xtrain)[3]
latent_size = 10
print(input_size) 
 
x_train = array_reshape(xtrain, dim=c(dim(xtrain)[1], input_size))
x_test = array_reshape(xtest, dim=c(dim(xtest)[1], input_size))
 
print(dim(x_train))
print(dim(x_test)) 
 
enc_input = layer_input(shape = input_size)
enc_output = enc_input %>% 
  layer_dense(units = 256) %>% 
  layer_activation_leaky_relu() %>% 
  layer_dense(units=latent_size) %>% 
  layer_activation_leaky_relu()

encoder = keras_model(enc_input, enc_output)
summary(encoder) 
 
dec_input = layer_input(shape = latent_size)
dec_output = dec_input %>% 
  layer_dense(units = 256) %>% 
  layer_activation_leaky_relu() %>% 
  layer_dense(units = input_size, activation = "sigmoid")

decoder = keras_model(dec_input, dec_output)
summary(decoder)
 
aen_input = layer_input(shape = input_size)
aen_output = aen_input %>% 
  encoder() %>% 
  decoder()
   
aen = keras_model(aen_input, aen_output)
summary(aen)
  
aen %>% compile(optimizer="rmsprop", loss="binary_crossentropy")

aen %>% fit(x_train,x_train, epochs=50, batch_size=256) 
 
encoded_imgs = encoder %>% predict(x_test)
decoded_imgs = decoder %>% predict(encoded_imgs)
 
pred_images = array_reshape(decoded_imgs, dim=c(dim(decoded_imgs)[1], 28, 28)) 
 
n = 10
op = par(mfrow = c(n, 2), mar = c(1, 0, 0, 0))
for (i in 1:n) 
{
  plot(as.raster(pred_images[i,,]))  # reconstructed image (left column)
  plot(as.raster(xtest[i,,]))        # original image (right column)
}
par(op)  # restore the previous plotting parameters


Reference:

Building autoencoders in Keras
