DataTechNotes: Understanding Max-Pooling of Image Data with R

A Convolutional Neural Network (CNN) model often contains max-pooling layers in image classification and object detection applications. Max pooling reduces the image dimensions by keeping only the highest value in each region covered by the pooling filter. The purpose of the operation is to shrink the data passed on to later layers while keeping the essential features of the image. Smaller feature maps mean fewer parameters in the layers that follow, which lowers the complexity of the model and its computing time. Pooling is performed according to a given filter size (such as 2x2, 3x3, or 5x5) and stride value (such as 1, 2, or 3).
Let's see an example. We take a 4x4 image matrix as input and perform max pooling with a 2x2 filter and a stride of 2, written as (2x2, 2). A 2x2 window is placed over the matrix and the maximum value inside it is extracted. The window is then shifted by the stride, 2 positions at a time, and the process repeats. In the end, we get a 2x2 matrix out of the 4x4 matrix.
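
The 4x4 example can be traced in a few lines of base R. This is only a minimal sketch: the matrix values are made up for illustration, and the variable names (x, f, s, pooled) are assumptions that do not come from the tutorial's own code.

```r
# A 4x4 input matrix (values chosen only for illustration)
x <- matrix(c(1, 3, 2, 4,
              5, 6, 1, 2,
              7, 2, 9, 0,
              1, 8, 3, 4), nrow = 4, byrow = TRUE)

f <- 2  # filter (window) size
s <- 2  # stride

out_n <- (nrow(x) - f) / s + 1        # (4 - 2) / 2 + 1 = 2
pooled <- matrix(0, nrow = out_n, ncol = out_n)

for (i in seq_len(out_n)) {
  for (j in seq_len(out_n)) {
    r0 <- (i - 1) * s + 1             # first row of the current window
    c0 <- (j - 1) * s + 1             # first column of the current window
    pooled[i, j] <- max(x[r0:(r0 + f - 1), c0:(c0 + f - 1)])
  }
}

print(pooled)
#      [,1] [,2]
# [1,]    6    4
# [2,]    8    9
```

Each output cell is the largest value of one non-overlapping 2x2 block, so the 16 input values are reduced to 4.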

In this tutorial, we apply the pooling operation to a given image and check the result in R. The 'EBImage' library and a test image file are required for this tutorial. If the library is not available on your PC, please install it first (EBImage is distributed through Bioconductor).
First, we load an image file and resize it to 100 by 100 pixels (the pooling function below is written with square images in mind, e.g. 10x10 or 50x50).

library(EBImage)

orig <- readImage("bus20.jpg")  # load image 
img <- resize(orig, 100, 100)  # make it smaller and equal dimension

Next, we'll write a simple pooling function that performs a max or a mean pooling operation on a given image. The filter size and stride are passed as arguments.

pooling <- function(type = "max", image, filter, stride)
{
    f <- filter; s <- stride 
    col <- dim(image[,,1])[2]        # size of the second image dimension
    row <- dim(image[,,1])[1]        # size of the first image dimension
    c <- floor((col-f)/s)+1          # pooled size; floor() drops any partial window
    r <- floor((row-f)/s)+1  
    
    newImage <- array(0, c(r, c, 3)) # pooled image, same r x c layout as m3 below
    for(rgb in 1:3)                  # loop over the RGB layers 
    {
        m <- image[,,rgb]
        m3 <- matrix(0, ncol = c, nrow = r)
        i <- 1
        if(type == "mean")
            for(ii in 1:r)
            {
                j <- 1
                for(jj in 1:c)
                {
                    m3[ii,jj]<-mean(as.numeric(m[i:(i+(f-1)), j:(j+(f-1))]))
                    j <- j+s
                } 
                i <- i+s
            }
        else 
            for(ii in 1:r)
            {
                j <- 1
                for(jj in 1:c)
                {
                    m3[ii,jj]<-max(as.numeric(m[i:(i+(f-1)), j:(j+(f-1))]))
                    j <- j+s
                } 
                i <- i+s
            }
        newImage[,,rgb] <- m3
    }
    return(newImage)
}
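
The two loops in the function differ only in the call to max() or mean(). As a hedged alternative sketch, the aggregation function can be passed in as an argument instead. The helper name pool_matrix and its fun parameter are hypothetical and not part of the original function; it works on a single 2-D matrix (one channel) and assumes the output size is floor((n - f) / s) + 1 along each dimension.

```r
# Hypothetical helper (not from the original post): pool one 2-D matrix.
# `fun` is the aggregation function applied to each window, e.g. max or mean.
pool_matrix <- function(m, f, s, fun = max) {
  out_r <- floor((nrow(m) - f) / s) + 1   # number of output rows
  out_c <- floor((ncol(m) - f) / s) + 1   # number of output columns
  out <- matrix(0, nrow = out_r, ncol = out_c)
  for (i in seq_len(out_r)) {
    for (j in seq_len(out_c)) {
      r0 <- (i - 1) * s + 1               # top-left corner of the window
      c0 <- (j - 1) * s + 1
      out[i, j] <- fun(m[r0:(r0 + f - 1), c0:(c0 + f - 1)])
    }
  }
  out
}

m <- matrix(1:24, nrow = 4)                     # 4 x 6 test matrix, filled column-wise
max_pooled  <- pool_matrix(m, f = 2, s = 2)             # 2 x 3 result
mean_pooled <- pool_matrix(m, f = 2, s = 2, fun = mean) # 2 x 3 result

print(max_pooled)
#      [,1] [,2] [,3]
# [1,]    6   14   22
# [2,]    8   16   24
print(mean_pooled)
#      [,1] [,2] [,3]
# [1,]  3.5 11.5 19.5
# [2,]  5.5 13.5 21.5
```

Applied to an RGB image, such a helper would be called once per channel. Keeping it single-channel makes it easy to check by hand on a small matrix like the one above.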

Using the above function, we can perform max or mean pooling. First, we'll pool the image with a (2,2) filter and a stride of 2, check the image dimensions, and visualize the result in a plot.

pmax <- pooling(image = img, filter = 2, stride = 2)
pmax <- aperm(pmax, c(2,1,3))

pmean <- pooling(type = "mean", image = img, filter = 2, stride = 2)
pmean <- aperm(pmean, c(2,1,3))
 
print(dim(img))
[1] 100 100   3
print(dim(pmax))
[1] 50 50  3
print(dim(pmean))
[1] 50 50  3

par(mfrow = c(2, 2))
plot(img)
plot(img)
plot(as.raster(pmax))
title("Max-Pooled (c(2,2),2)")
plot(as.raster(pmean))
title("Mean-Pooled (c(2,2),2)")

Next, we'll pool the image with a (3,3) filter and a stride of 2, check the image dimensions, and visualize the result.

pmax <- pooling(image = img, filter = 3, stride = 2)
pmax <- aperm(pmax, c(2,1,3))
 
pmean <- pooling(type = "mean", image = img, filter = 3, stride = 2)
pmean <- aperm(pmean, c(2,1,3))
 
print(dim(img))
[1] 100 100   3
print(dim(pmax))
[1] 49 49  3
print(dim(pmean))
[1] 49 49  3
 
par(mfrow = c(2, 2))
plot(img)
plot(img)
plot(as.raster(pmax))
title("Max-Pooled (c(3,3),2)")
plot(as.raster(pmean))
title("Mean-Pooled (c(3,3),2)")

You may try changing the filter and stride values and observe the outcomes.
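
Before rerunning the pooling on the image, the expected output size for each combination can be previewed. The sketch below assumes the size along one dimension is floor((n - f) / s) + 1, which agrees with the 50 and 49 printed above for the 100x100 image. The helper name pooled_size is hypothetical.

```r
# Hypothetical helper: pooled size along one dimension of length n
# for filter size f and stride s (no padding assumed).
pooled_size <- function(n, f, s) floor((n - f) / s) + 1

# All combinations of the filter sizes and strides mentioned in the introduction
grid <- expand.grid(filter = c(2, 3, 5), stride = c(1, 2, 3))
grid$output <- pooled_size(100, grid$filter, grid$stride)

print(grid)
#   filter stride output
# 1      2      1     99
# 2      3      1     98
# 3      5      1     96
# 4      2      2     50
# 5      3      2     49
# 6      5      2     48
# 7      2      3     33
# 8      3      3     33
# 9      5      3     32
```

A larger stride shrinks the output much faster than a larger filter does, because the stride divides the image size while the filter size is only subtracted from it.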
Thank you for reading, I hope you've found it useful!

The full source code is listed below.

library(EBImage)

orig <- readImage("bus20.jpg")  # load image 
img <- resize(orig, 100, 100)  # make it smaller and equal dimension
pooling <- function(type = "max", image, filter, stride)
{
    f <- filter; s <- stride 
    col <- dim(image[,,1])[2]        # size of the second image dimension
    row <- dim(image[,,1])[1]        # size of the first image dimension
    c <- floor((col-f)/s)+1          # pooled size; floor() drops any partial window
    r <- floor((row-f)/s)+1  
    
    newImage <- array(0, c(r, c, 3)) # pooled image, same r x c layout as m3 below
    for(rgb in 1:3)                  # loop over the RGB layers 
    {
        m <- image[,,rgb]
        m3 <- matrix(0, ncol = c, nrow = r)
        i <- 1
        if(type == "mean")
            for(ii in 1:r)
            {
                j <- 1
                for(jj in 1:c)
                {
                    m3[ii,jj]<-mean(as.numeric(m[i:(i+(f-1)), j:(j+(f-1))]))
                    j <- j+s
                } 
                i <- i+s
            }
        else 
            for(ii in 1:r)
            {
                j <- 1
                for(jj in 1:c)
                {
                    m3[ii,jj]<-max(as.numeric(m[i:(i+(f-1)), j:(j+(f-1))]))
                    j <- j+s
                } 
                i <- i+s
            }
        newImage[,,rgb] <- m3
    }
    return(newImage)
}
 
pmax <- pooling(image = img, filter = 2, stride = 2)
pmax <- aperm(pmax, c(2,1,3))
pmean <- pooling(type = "mean", image = img, filter = 2, stride = 2)
pmean <- aperm(pmean, c(2,1,3))

print(dim(img))
print(dim(pmax))
print(dim(pmean))
 
par(mfrow = c(2, 2))
plot(img)
plot(img)
plot(as.raster(pmax))
title("Max-Pooled (c(2,2),2)")
plot(as.raster(pmean))
title("Mean-Pooled (c(2,2),2)")
