- Preparing the data
- Defining the autoencoder model
- Restoring the image
- Source code listing
from keras.layers import Dense
from keras.layers import Input, LeakyReLU
from keras.models import Model
from keras.datasets.mnist import load_data
from numpy import reshape
import matplotlib.pyplot as plt
Preparing the data
We'll use the MNIST handwritten digits dataset to train the autoencoder. First, we'll load it and prepare it with a few changes. An autoencoder requires only input data, so we focus on the x part of the dataset. We'll scale the pixel values into the range of [0, 1].
(xtrain, _), (xtest, _) = load_data()
xtrain = xtrain.astype('float32') / 255
xtest = xtest.astype('float32') / 255
print(xtrain.shape, xtest.shape)
(60000, 28, 28) (10000, 28, 28)
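If you want to confirm the scaling worked, a quick check (not part of the original code) is:

# Sanity check: after dividing by 255, pixel values should lie in [0, 1].
print(xtrain.min(), xtrain.max())  # expected output: 0.0 1.0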
Next, we'll define the input size and the latent vector size. We'll see why we need the latent vector below. Then we'll reshape the image data into vectors.
input_size = xtrain.shape[1] * xtrain.shape[2]
latent_size = 16
x_train = xtrain.reshape((len(xtrain), input_size))
x_test = xtest.reshape((len(xtest), input_size))
print(x_train.shape)
(60000, 784)
print(x_test.shape)
(10000, 784)
Defining the autoencoder model
Next, we'll define the encoder, decoder, and autoencoder models. The encoder compresses the data into the middle layer, which is the latent vector. The decoder decompresses the data from the latent vector. The latent vector is a compressed representation of the original data, from which the decoder can reconstruct new data. The autoencoder combines the encoder and decoder so that both can be trained together.
We'll define the encoder starting from the input layer. The encoder contains Dense layers followed by LeakyReLU activations. The last layer defines the latent vector size.
enc_input = Input(shape=(input_size,))
enc_dense1 = Dense(units=256)(enc_input)
enc_activ1 = LeakyReLU()(enc_dense1)
enc_dense2 = Dense(units=latent_size)(enc_activ1)
enc_output = LeakyReLU()(enc_dense2)
encoder = Model(enc_input, enc_output)
encoder.summary()
We'll define the decoder starting from the input layer. The decoder contains a similar Dense layer with a LeakyReLU activation. The output layer maps back to the input size with a sigmoid activation.
dec_input = Input(shape=(latent_size,))
dec_dense1 = Dense(units=256)(dec_input)
dec_activ1 = LeakyReLU()(dec_dense1)
dec_output = Dense(units=input_size, activation='sigmoid')(dec_activ1)
decoder = Model(dec_input, dec_output)
decoder.summary()
Finally, we'll define the autoencoder. It is a combination of the encoder and decoder with an additional input layer.
aen_input = Input(shape=(input_size,))
aen_enc_output = encoder(aen_input)
aen_dec_output = decoder(aen_enc_output)
aen = Model(aen_input, aen_dec_output)
aen.summary()
Layer (type)                 Output Shape              Param #
=================================================================
input_5 (InputLayer)         (None, 784)               0
_________________________________________________________________
model_5 (Model)              (None, 16)                205072
_________________________________________________________________
model_6 (Model)              (None, 784)               205840
=================================================================
Total params: 410,912
Trainable params: 410,912
Non-trainable params: 0
_________________________________________________________________
Here you can see that the 784-dimensional input vector is compressed down to 16 dimensions and then expanded back to 784 again, a compression factor of 784 / 16 = 49. We'll use the RMSprop optimizer with binary cross-entropy loss to train the model. Now, we can compile and fit the model on the training data.
aen.compile(optimizer="rmsprop", loss="binary_crossentropy")
aen.fit(x_train, x_train, epochs=20, batch_size=256, shuffle=True)
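If you want to keep an eye on how well the model generalizes while it trains, you can optionally pass the test set as validation data. A minimal sketch, not part of the original tutorial:

# Optional: report reconstruction loss on held-out data after each epoch.
aen.fit(x_train, x_train,
        epochs=20,
        batch_size=256,
        shuffle=True,
        validation_data=(x_test, x_test))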
Restoring the image
Finally, we'll restore the test data and visualize it in a plot. First, we'll encode the images with the encoder, then decode them with the decoder model. We'll check the result by visualizing it in a plot.
encoded_images = encoder.predict(x_test)
decoded_images = decoder.predict(encoded_images)
pred_images = reshape(decoded_images, newshape=(decoded_images.shape[0], 28, 28))
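Beyond visual inspection, a simple numeric check is the per-image reconstruction error. A short sketch, not in the original:

import numpy as np

# Mean squared error between each original test vector and its reconstruction.
mse = np.mean((x_test - decoded_images) ** 2, axis=1)
print("average reconstruction MSE:", mse.mean())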
n = 10
plt.figure(figsize=(10, 2))
for i in range(n):
    # original images
    ax = plt.subplot(2, n, i + 1)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    plt.imshow(xtest[i].reshape(28, 28))
    plt.gray()

    # restored images
    ax = plt.subplot(2, n, i + 1 + n)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    plt.imshow(pred_images[i].reshape(28, 28))
plt.show()

The first row of the plot shows the original images in the test data. The second row contains the images restored by the autoencoder model.
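As a sanity check (not in the original tutorial), the combined autoencoder model should produce the same reconstructions as running the encoder and decoder manually:

import numpy as np

# aen chains the encoder and decoder, so its output should match the
# two-step prediction above (up to floating-point tolerance).
direct_images = aen.predict(x_test)
print(np.allclose(direct_images, decoded_images))  # expected: True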
In this tutorial, we've briefly learned how to build a simple autoencoder with Keras in Python. The full source code is listed below.
Source code listing
from keras.layers import Dense
from keras.layers import Input, LeakyReLU
from keras.models import Model
from keras.datasets.mnist import load_data
from numpy import reshape
import matplotlib.pyplot as plt

(xtrain, _), (xtest, _) = load_data()
xtrain = xtrain.astype('float32') / 255.
xtest = xtest.astype('float32') / 255.
print(xtrain.shape, xtest.shape)

input_size = xtrain.shape[1] * xtrain.shape[2]
latent_size = 16

x_train = xtrain.reshape((len(xtrain), input_size))
x_test = xtest.reshape((len(xtest), input_size))
print(x_train.shape)
print(x_test.shape)

# Encoder
enc_input = Input(shape=(input_size,))
enc_dense1 = Dense(units=256)(enc_input)
enc_activ1 = LeakyReLU()(enc_dense1)
enc_dense2 = Dense(units=latent_size)(enc_activ1)
enc_output = LeakyReLU()(enc_dense2)
encoder = Model(enc_input, enc_output)
encoder.summary()

# Decoder
dec_input = Input(shape=(latent_size,))
dec_dense1 = Dense(units=256)(dec_input)
dec_activ1 = LeakyReLU()(dec_dense1)
dec_output = Dense(units=input_size, activation='sigmoid')(dec_activ1)
decoder = Model(dec_input, dec_output)
decoder.summary()

# Autoencoder
aen_input = Input(shape=(input_size,))
aen_enc_output = encoder(aen_input)
aen_dec_output = decoder(aen_enc_output)
aen = Model(aen_input, aen_dec_output)
aen.summary()

aen.compile(optimizer="rmsprop", loss="binary_crossentropy")
aen.fit(x_train, x_train, epochs=20, batch_size=256, shuffle=True)

encoded_images = encoder.predict(x_test)
decoded_images = decoder.predict(encoded_images)
pred_images = reshape(decoded_images, newshape=(decoded_images.shape[0], 28, 28))

n = 10
plt.figure(figsize=(10, 2))
for i in range(n):
    # original images
    ax = plt.subplot(2, n, i + 1)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    plt.imshow(xtest[i].reshape(28, 28))
    plt.gray()

    # restored images
    ax = plt.subplot(2, n, i + 1 + n)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    plt.imshow(pred_images[i].reshape(28, 28))
plt.show()