TF03 : Go further on Convolutional Neural Networks

  • Convolutional neural networks can also help classify colored images
  • There are two challenges when dealing with colored images:
    • Colored images come in different sizes.
    • Colored images have multiple color channels.
  • Unlike the Fashion MNIST dataset, where every image is the same size (28x28) and has a single channel (greyscale), colored images vary in both size and color.
How are we handling different sizes of images?
  • The neural network needs a fixed-size input, so we resize all colored images to the same size.

Here we resize the images to 150x150 before feeding them to the neural network.
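As an example, a minimal sketch of how the resizing could be done with Keras' ImageDataGenerator (the directory path here is hypothetical); flow_from_directory resizes every image to the requested target_size as it loads it:

import tensorflow as tf

# Rescale pixel values to [0, 1] and resize every image to 150x150 on load.
train_image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)

train_data_gen = train_image_generator.flow_from_directory(
    directory='cats_and_dogs/train',   # hypothetical path
    target_size=(150, 150),            # every image is resized to 150x150
    batch_size=32,
    class_mode='binary')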



How are we handling the different colors of the image?

  • A 6x6 greyscale image is represented as a single 6x6 matrix, but a colored image has a third dimension: one 2-D array per color channel.

Color images are represented by three color channels: Red, Green, and Blue.


Each of these color channels is represented by a 2-dimensional array, and the three channels are combined to form the colored image.


So the image is a stack of these two-dimensional arrays, and its depth is the number of channels (3 for RGB).
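A tiny NumPy illustration of this representation (the array here is just a blank placeholder image):

import numpy as np

# A 150x150 color image is a stack of three 150x150 channel arrays.
image = np.zeros((150, 150, 3), dtype=np.uint8)

red_channel   = image[:, :, 0]   # 2-D array, shape (150, 150)
green_channel = image[:, :, 1]
blue_channel  = image[:, :, 2]

print(image.shape)   # (150, 150, 3) -> height, width, depth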


How does convolution work on colored images?

From TF02 - Convolution Neural network on Fashion MNIST, we learned about kernels.

We know the color image is split into three color channels, so each filter has three kernels, one for each color channel.


Up to the padding step, the process remains the same as for greyscale images.


The convolved output is calculated by convolving each color channel with its own kernel, summing the three results, and adding a bias value; this gives the final convolved output.
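As a rough NumPy sketch of that calculation for a single 3x3 filter at one position (all the numbers here are made up purely for illustration):

import numpy as np

# One 3x3 window of the input image across its 3 channels, and one 3x3x3 filter.
window = np.random.randint(0, 256, size=(3, 3, 3)).astype(np.float32)
kernel = np.random.randn(3, 3, 3).astype(np.float32)
bias = 1.0

# Multiply each channel element-wise with its own kernel slice,
# sum the three results, then add the bias -> one value of the convolved output.
per_channel = [np.sum(window[:, :, c] * kernel[:, :, c]) for c in range(3)]
output_value = sum(per_channel) + bias
print(output_value)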


For one filter we get one convolved output; if there are multiple filters, we get one convolved output per filter.

Now max pooling is applied to each convolved output.




For a 3-D convolved output we also get a 3-D max-pooled output.
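A minimal Keras sketch of convolution plus max pooling on 150x150x3 inputs (the layer sizes are illustrative, not the exact architecture from the Colab below):

import tensorflow as tf

model = tf.keras.Sequential([
    # 32 filters, each spanning all 3 color channels, so the convolved
    # output has 32 feature maps.
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),
    # Max pooling is applied to every feature map independently.
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
])

model.summary()   # the spatial size shrinks while the depth grows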


Colab code link : https://colab.research.google.com/github/tensorflow/examples/blob/master/courses/udacity_intro_to_tensorflow_for_deep_learning/l05c01_dogs_vs_cats_without_augmentation.ipynb

Sigmoid

For binary classification, sigmoid works well, so for the last layer we will make the following change:

tf.keras.layers.Dense(1, activation='sigmoid')

If we choose the sigmoid activation function, we also have to change the loss function to binary cross-entropy:

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
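Putting the two snippets together, a hedged sketch of how the end of the model and the compile step could look (layer sizes are illustrative):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),   # single probability output
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',   # pairs with the sigmoid output
              metrics=['accuracy'])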

Overfitting issue

The CNN model's accuracy on the training data is 96%, whereas on the test data it is only 76%. This shows the CNN model is overfitting and memorising the training set.

Validation Dataset

We use the validation set to check, after every epoch, how well the model performs on data it was not trained on; this helps us detect overfitting.


Here we can see the training accuracy of the CNN is 99% but the validation accuracy is only 70%; this is a clear sign of overfitting.
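A sketch of how a validation generator could be passed to training so that validation accuracy is reported after every epoch (the path is hypothetical, and the compiled model and training generator from the earlier sketches are assumed):

import tensorflow as tf

val_image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
val_data_gen = val_image_generator.flow_from_directory(
    directory='cats_and_dogs/validation',   # hypothetical path
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

history = model.fit(train_data_gen,
                    epochs=15,
                    validation_data=val_data_gen)

# Comparing the two curves per epoch exposes overfitting.
print(history.history['accuracy'][-1], history.history['val_accuracy'][-1])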

Image Augmentation

We want our CNN model to recognise objects irrespective of their size and position in the image. For that we would need a huge dataset covering all kinds of variations, which is not practical. We can solve this problem using image augmentation: we apply random transformations (flips, rotations, shifts, zooms) to the existing training images, which helps the model generalise and improves prediction accuracy.
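A sketch of image augmentation with ImageDataGenerator: each epoch then sees randomly flipped, rotated, shifted and zoomed variants of the training images (the path is hypothetical):

import tensorflow as tf

augmented_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

train_data_gen = augmented_generator.flow_from_directory(
    directory='cats_and_dogs/train',   # hypothetical path
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')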


Dropout

In neural networks, the weights determine how much each neuron contributes to the prediction. In the diagram below, some neurons play a bigger role than others because their weights are larger than those of other neurons.


This means some neurons barely contribute to the predictions. This problem can be addressed using dropout: turning off a random subset of neurons during training, which forces the remaining neurons to take a more active part in learning.


We randomly turn off neurons on each training pass, so a different subset is dropped every time.
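A sketch of where a Dropout layer could sit in the model (the 0.5 rate and layer sizes are illustrative): during training, the layer randomly zeroes that fraction of its inputs on every forward pass.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),                      # drop 50% of activations while training
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])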




  • Resizing: When working with images of different sizes, you must resize all the images to the same size so that they can be fed into a CNN.

  • Color Images: Computers interpret color images as 3D arrays.

  • RGB Image: Color image composed of 3 color channels: Red, Green, and Blue.

  • Convolutions: When working with RGB images we convolve each color channel with its own convolutional filter. Convolutions on each color channel are performed in the same way as with grayscale images, i.e. by performing element-wise multiplication of the convolutional filter (kernel) and a section of the input array. The results of the per-channel convolutions are then summed together with a bias value to get the convolved output.

  • Max Pooling: When working with RGB images we perform max pooling on each color channel using the same window size and stride. Max pooling on each color channel is performed in the same way as with grayscale images, i.e. by selecting the max value in each window.

  • Validation Set: We use a validation set to check how the model is doing during the training phase. Validation sets can be used to perform Early Stopping to prevent overfitting and can also be used to help us compare different models and choose the best one.

Methods to Prevent Overfitting:

  • Early Stopping: In this method, we track the loss on the validation set during the training phase and use it to determine when to stop training such that the model is accurate but not overfitting.

  • Image Augmentation: Artificially boosting the number of images in our training set by applying random image transformations to the existing images in the training set.

  • Dropout: Removing a random selection of a fixed number of neurons in a neural network during training.

