Convolutional neural networks are the usual choice when building deep learning models for image data. In the previous post, we learned how to build simple autoencoders with dense layers. In this tutorial, we'll learn how to build an autoencoder from convolutional layers with Keras in Python. The tutorial covers:
- Preparing the data
- Defining the convolutional autoencoder
- Generating the images
- Source code listing
from keras.layers import Conv2D
from keras.layers import Input
from keras.layers import MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets.mnist import load_data
from numpy import reshape
import matplotlib.pyplot as plt
Preparing the data
We'll use the MNIST handwritten digits dataset to train the autoencoder. First, we'll load it and prepare it. An autoencoder learns to reproduce its own input, so we only need the x part of the dataset and can discard the labels. We'll scale the pixel values from [0, 255] into the range [0, 1].
(xtrain, _), (xtest, _) = load_data()
xtrain = xtrain.astype('float32') / 255
xtest = xtest.astype('float32') / 255
print(xtrain.shape, xtest.shape)
(60000, 28, 28) (10000, 28, 28)
A two-dimensional convolutional layer expects every image to carry a channel dimension, so we need to add one more dimension to the dataset. We can do it by using the reshape function.
x_train = reshape(xtrain, (len(xtrain), 28, 28, 1))
x_test = reshape(xtest, (len(xtest), 28, 28, 1))
print(x_train.shape, x_test.shape)
(60000, 28, 28, 1) (10000, 28, 28, 1)
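As a side note, the channel axis can be added in several equivalent ways. The sketch below uses a small zero-filled array as a stand-in for the MNIST images (an assumption made so the example runs without downloading data) and shows that reshape, expand_dims and indexing with newaxis all produce the same shape.

```python
import numpy as np

# Stand-in for a small batch of MNIST-sized grayscale images (not real data).
batch = np.zeros((5, 28, 28), dtype="float32")

# Three equivalent ways to append a channel axis of size 1.
a = np.reshape(batch, (len(batch), 28, 28, 1))
b = np.expand_dims(batch, axis=-1)
c = batch[..., np.newaxis]

print(a.shape, b.shape, c.shape)
```

Which form to use is mostly a matter of taste; expand_dims and newaxis avoid repeating the image size by hand.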
Defining the convolutional autoencoder
We'll define the autoencoder starting from the input layer. The input shape matches the shape of a single image, (28, 28, 1).
input_img = Input(shape=(28, 28, 1))
The encoding part of the autoencoder contains convolutional and max-pooling layers to encode the image. A max-pooling layer reduces the spatial size of the image by keeping only the maximum value in each pooling window.
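To make the pooling step concrete, here is a minimal NumPy sketch of non-overlapping 2x2 max pooling on a single 2-D array. It is an illustration only, not how Keras implements MaxPooling2D; the helper name max_pool_2x2 is made up for this example, and it assumes the height and width are even.

```python
import numpy as np

def max_pool_2x2(img):
    """Non-overlapping 2x2 max pooling on a 2-D array with even height and width."""
    h, w = img.shape
    # Split the image into (h/2) x (w/2) blocks of size 2x2.
    blocks = img.reshape(h // 2, 2, w // 2, 2)
    # Keep the largest value inside each block.
    return blocks.max(axis=(1, 3))

img = np.arange(16, dtype="float32").reshape(4, 4)
pooled = max_pool_2x2(img)
print(pooled)  # 2x2 result: [[5, 7], [13, 15]]
```

The 4x4 input shrinks to 2x2, and three of every four values are discarded.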
The decoding part of the autoencoder contains convolutional and up-sampling layers. An up-sampling layer enlarges the image again by repeating its rows and columns, so it reverses the size reduction of pooling, although it cannot recover the values that pooling discarded. The last convolutional layer uses a sigmoid activation, which keeps the output in the range [0, 1]. Then, we'll combine both parts into the final autoencoder model and compile it with the RMSProp optimizer and binary cross-entropy loss function.
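Before the model code, a small NumPy sketch of what up-sampling does to a single 2-D array may help. It repeats every row and column, which is the nearest-neighbour behaviour UpSampling2D uses by default. The helper name upsample_nearest is made up for this example and is not part of Keras.

```python
import numpy as np

def upsample_nearest(img, factor):
    """Repeat every row and every column `factor` times (nearest-neighbour)."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

small = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
big = upsample_nearest(small, 2)
print(big)
# [[1. 1. 2. 2.]
#  [1. 1. 2. 2.]
#  [3. 3. 4. 4.]
#  [3. 3. 4. 4.]]
```

The result has the larger size but no new detail, which is why each up-sampling layer in the decoder sits next to convolutional layers that learn to fill the detail back in.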
# Encoder: 28x28x1 -> 14x14x12 -> 4x4x8
enc_conv1 = Conv2D(12, (3, 3), activation='relu', padding='same')(input_img)
enc_pool1 = MaxPooling2D((2, 2), padding='same')(enc_conv1)
enc_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_pool1)
enc_ouput = MaxPooling2D((4, 4), padding='same')(enc_conv2)
# Decoder: 4x4x8 -> 16x16x8 -> 14x14x12 -> 28x28x12 -> 28x28x1
dec_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_ouput)
dec_upsample2 = UpSampling2D((4, 4))(dec_conv2)
# No padding here, so the 3x3 kernel trims 16x16 down to 14x14
dec_conv3 = Conv2D(12, (3, 3), activation='relu')(dec_upsample2)
dec_upsample3 = UpSampling2D((2, 2))(dec_conv3)
dec_output = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(dec_upsample3)
autoencoder = Model(input_img, dec_output)
autoencoder.compile(optimizer='rmsprop', loss='binary_crossentropy')
autoencoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 28, 28, 1)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 12)        120
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 12)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 8)         1544
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 8)           0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 8)           1032
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 16, 16, 8)         0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 14, 14, 12)        876
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 28, 28, 12)        0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 28, 28, 1)         109
=================================================================
Total params: 3,681
Trainable params: 3,681
Non-trainable params: 0
_________________________________________________________________

The model is ready, so now we can fit it on the training data.
autoencoder.fit(x_train, x_train, epochs=20, batch_size=128, shuffle=True)
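The output sizes and parameter counts in the summary printed earlier can be checked by hand. The sketch below is plain Python and does not call Keras. It assumes stride-1 convolutions, pooling with a stride equal to the pool size, and the usual rules that 'same' padding keeps the size for a convolution and gives ceil(n / p) for pooling, while 'valid' padding (the Conv2D default) gives n - k + 1. The helper names are made up for this example.

```python
import math

def conv_out(n, k, padding):
    """Spatial size after a stride-1 convolution with a k x k kernel."""
    return n if padding == "same" else n - k + 1

def pool_out(n, p):
    """Spatial size after p x p pooling with stride p and 'same' padding."""
    return math.ceil(n / p)

def conv_params(k, c_in, c_out):
    """k*k*c_in weights per filter, plus one bias per filter."""
    return k * k * c_in * c_out + c_out

n = 28
n = conv_out(n, 3, "same")   # 28
n = pool_out(n, 2)           # 14
n = conv_out(n, 4, "same")   # 14
n = pool_out(n, 4)           # 4, since ceil(14 / 4) = 4
n = conv_out(n, 4, "same")   # 4
n = n * 4                    # 16, up-sampling by 4
n = conv_out(n, 3, "valid")  # 14, the unpadded convolution trims the border
n = n * 2                    # 28, up-sampling by 2
n = conv_out(n, 3, "same")   # 28

params = [
    conv_params(3, 1, 12),   # conv2d_1
    conv_params(4, 12, 8),   # conv2d_2
    conv_params(4, 8, 8),    # conv2d_3
    conv_params(3, 8, 12),   # conv2d_4
    conv_params(3, 12, 1),   # conv2d_5
]
print(n, params, sum(params))  # 28 [120, 1544, 1032, 876, 109] 3681
```

This also shows why one decoder convolution is left without padding='same': after up-sampling by 4 the feature map is 16x16, and the unpadded 3x3 convolution brings it to 14x14 so that the final up-sampling by 2 lands exactly on 28x28.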
Generating the images
Finally, we'll reconstruct the test images with the trained model and visualize the results in a plot.
decoded_imgs = autoencoder.predict(x_test)
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    plt.gray()
    ax = plt.subplot(2, n, i+1)
    plt.imshow(x_test[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    ax = plt.subplot(2, n, i +1+n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
The first row of the plot shows the original images from the test data. The second row shows the same images as reconstructed by the autoencoder model.
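A visual check can be complemented with a number per image. The sketch below computes a per-image mean squared error with NumPy. It is not part of the original tutorial code: the helper name per_image_mse is made up, and the two arrays are synthetic stand-ins with the same layout as x_test and decoded_imgs, so the example runs without a trained model.

```python
import numpy as np

def per_image_mse(originals, reconstructions):
    """Mean squared error per image, averaged over all non-batch axes."""
    diff = originals - reconstructions
    axes = tuple(range(1, diff.ndim))
    return np.mean(diff ** 2, axis=axes)

# Synthetic stand-ins (assumption): same layout as x_test and decoded_imgs.
rng = np.random.default_rng(0)
fake_test = rng.random((4, 28, 28, 1)).astype("float32")
fake_decoded = np.clip(fake_test + 0.1, 0.0, 1.0)

errors = per_image_mse(fake_test, fake_decoded)
print(errors.shape)  # (4,), one error value per image
```

With the real arrays, the call would presumably be per_image_mse(x_test, decoded_imgs), and images with unusually large errors are the ones the autoencoder reconstructs worst.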
In this tutorial, we've briefly learned how to build a convolutional autoencoder with Keras in Python. The full source code is listed below.
Source code listing
from keras.layers import Conv2D
from keras.layers import Input
from keras.layers import MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets.mnist import load_data
from numpy import reshape
import matplotlib.pyplot as plt

(xtrain, _), (xtest, _) = load_data()
xtrain = xtrain.astype('float32') / 255
xtest = xtest.astype('float32') / 255
print(xtrain.shape, xtest.shape)

x_train = reshape(xtrain, (len(xtrain), 28, 28, 1))
x_test = reshape(xtest, (len(xtest), 28, 28, 1))
print(x_train.shape, x_test.shape)

input_img = Input(shape=(28, 28, 1))
enc_conv1 = Conv2D(12, (3, 3), activation='relu', padding='same')(input_img)
enc_pool1 = MaxPooling2D((2, 2), padding='same')(enc_conv1)
enc_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_pool1)
enc_ouput = MaxPooling2D((4, 4), padding='same')(enc_conv2)
dec_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_ouput)
dec_upsample2 = UpSampling2D((4, 4))(dec_conv2)
dec_conv3 = Conv2D(12, (3, 3), activation='relu')(dec_upsample2)
dec_upsample3 = UpSampling2D((2, 2))(dec_conv3)
dec_output = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(dec_upsample3)

autoencoder = Model(input_img, dec_output)
autoencoder.compile(optimizer='rmsprop', loss='binary_crossentropy')
autoencoder.summary()

autoencoder.fit(x_train, x_train, epochs=20, batch_size=128, shuffle=True)

decoded_imgs = autoencoder.predict(x_test)

n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    plt.gray()
    ax = plt.subplot(2, n, i+1)
    plt.imshow(x_test[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    ax = plt.subplot(2, n, i +1+n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()