TF03: Going Further with Convolutional Neural Networks
- Convolutional neural networks can also be used to classify colored images.
- There are two challenges when dealing with colored images:
- Colored images come in different sizes.
- Colored images have multiple color channels.
- This is unlike the Fashion MNIST dataset, where every image is the same size (28x28) with a single greyscale channel.
- A neural network needs a fixed-size input, so we resize all colored images to a fixed size.
Here we resize every image to 150x150 before feeding it to the network.
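As a minimal illustration of the resizing step (a NumPy nearest-neighbour sketch, not the TensorFlow pipeline the notebook uses), images of any size can be mapped to the fixed 150x150 input:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize an (H, W, C) image with nearest-neighbour sampling."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows][:, cols]

# Images of different sizes all end up as 150x150x3.
small = np.random.rand(96, 128, 3)
large = np.random.rand(600, 400, 3)
print(resize_nearest(small, 150, 150).shape)  # (150, 150, 3)
print(resize_nearest(large, 150, 150).shape)  # (150, 150, 3)
```

In practice the notebook lets Keras do this resizing while loading the images from disk.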
How do we handle the different colors of an image?
- A 6x6 greyscale image is represented by a single 6x6 matrix, but a colored image has a third dimension.
Color images are represented by three color channels: Red, Green, and Blue.
Each color channel is a 2-dimensional array, and the three channels combine to form the colored image.
So the depth of the image is the stack of these three two-dimensional arrays.
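This channel structure is easy to see in NumPy (a small hypothetical 6x6 image as an example):

```python
import numpy as np

# Each color channel is its own 2-D array (hypothetical 6x6 values).
red   = np.random.rand(6, 6)
green = np.random.rand(6, 6)
blue  = np.random.rand(6, 6)

# Stacking the channels along the last axis gives the 3-D color image:
# height x width x depth, where depth == 3 for RGB.
image = np.stack([red, green, blue], axis=-1)
print(image.shape)  # (6, 6, 3)
```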
How does convolution work on colored images?
From TF02 (Convolutional Neural Network on Fashion MNIST), we learned about kernels.
Since a color image is split into three color channels, we use one kernel per color channel.
Up to the padding step, the process is the same as with greyscale images.
The convoluted output is calculated per channel: each channel is convolved with its own kernel, and the three results are summed together with a bias to produce the final convoluted output.
Each filter produces one convoluted output; with multiple filters we get multiple convoluted outputs, one per filter.
Max pooling is then applied to each convoluted output.
For a 3D convoluted output (a stack of feature maps), we also get a 3D max-pooling output.
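The whole pipeline above, per-channel convolution, summing with a bias, then max pooling, can be sketched in plain NumPy (a toy example with made-up values, not the TensorFlow implementation):

```python
import numpy as np

def conv2d_rgb(img, kernels, bias=0.0):
    """Convolve an (H, W, 3) image with one 3-channel filter (3, kh, kw).
    Each channel is convolved with its own kernel; the three results
    are summed together with the bias to give one 2-D output."""
    kh, kw = kernels.shape[1:]
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i:i + kh, j:j + kw, :]  # (kh, kw, 3) window
            out[i, j] = sum(
                (patch[:, :, c] * kernels[c]).sum() for c in range(3)
            ) + bias
    return out

def max_pool(x, size=2):
    """2-D max pooling with a size x size window and matching stride."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.ones((5, 5, 3))         # toy all-ones color image
kernels = np.ones((3, 2, 2))     # one 2x2 kernel per color channel
conv = conv2d_rgb(img, kernels)  # each window sums 3 channels * 4 ones = 12
pooled = max_pool(conv)
print(conv.shape, pooled.shape)  # (4, 4) (2, 2)
```

With several filters, this per-filter 2-D output is computed once per filter, and the outputs are stacked into the 3D result mentioned above.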
Colab code link : https://colab.research.google.com/github/tensorflow/examples/blob/master/courses/udacity_intro_to_tensorflow_for_deep_learning/l05c01_dogs_vs_cats_without_augmentation.ipynb
Sigmoid
For binary classification, sigmoid works well, so for the last layer we will make the following change:
tf.keras.layers.Dense(1, activation='sigmoid')
If you choose the sigmoid activation function, you also have to change the loss function to binary cross-entropy:
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
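Why these two go together: sigmoid squashes the single output into a probability between 0 and 1, and binary cross-entropy scores that probability against the 0/1 label. A small NumPy sketch of the math (an illustration, not the Keras internals):

```python
import numpy as np

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, p):
    """Loss for a single predicted probability p against a 0/1 label."""
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

p = sigmoid(0.0)                     # 0.5: the model is unsure
print(p)
print(binary_crossentropy(1, p))     # log(2) ~ 0.693
print(binary_crossentropy(1, 0.99))  # confident and correct: small loss
```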
Overfitting issue
The CNN model reaches 96% accuracy on the training data but only 76% on the test data.
This gap shows the model is overfitting: it is memorizing the training set.
Validation DataSet
We use a validation set during training to monitor how well the model generalizes to data it was not trained on, which helps us detect and avoid overfitting.
Here the CNN reaches 99% accuracy on the training set but only 70% on the validation set:
this is a clear sign of overfitting.
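A minimal sketch of carving out a validation set (a simple shuffled 80/20 split in NumPy with hypothetical data; the notebook instead loads separate training and validation directories):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1000, 32, 32, 3))  # hypothetical small image batch
y = rng.integers(0, 2, size=1000)  # hypothetical 0/1 labels

# Shuffle, then hold out 20% of the examples for validation.
idx = rng.permutation(len(x))
split = int(0.8 * len(x))
x_train, x_val = x[idx[:split]], x[idx[split:]]
y_train, y_val = y[idx[:split]], y[idx[split:]]

print(len(x_train), len(x_val))  # 800 200
```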
Image Augmentation
We want our CNN model to predict images correctly regardless of the subject's size and position.
For that we would need a huge dataset containing every kind of variation, which is not practical. We can solve this
problem using image augmentation: we apply random transformations to the existing images, enlarging the training set and improving the model's ability to generalize.
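A sketch of the idea using basic NumPy transforms (the notebook uses Keras's `ImageDataGenerator`, which applies random transformations on the fly; here we just show flips and a rotation):

```python
import numpy as np

def augment(img):
    """Return the original image plus a few simple transformed copies."""
    return [
        img,
        np.fliplr(img),  # horizontal flip
        np.flipud(img),  # vertical flip
        np.rot90(img),   # 90-degree rotation (first two axes)
    ]

image = np.random.rand(150, 150, 3)
augmented = augment(image)
print(len(augmented))  # 4 training images from a single original
```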
Dropout
In a neural network, the weights drive the prediction of the output.
In the diagram below, some neurons play a larger role than others
because their weights are larger than the weights of other neurons.
This means some neurons barely contribute to the predictions, and this problem
can be resolved using dropout. Dropout turns off some neurons during training,
which forces the remaining neurons to take a more active part in the training.
We randomly turn off neurons on each epoch.
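The mechanism can be sketched in NumPy ("inverted" dropout, the scheme Keras uses at training time: surviving activations are scaled up so the expected output stays the same):

```python
import numpy as np

def dropout(activations, rate, rng):
    """Randomly zero out roughly a fraction `rate` of the neurons and
    rescale the survivors (inverted dropout, training time only)."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(42)
layer_out = np.ones(10)
print(dropout(layer_out, rate=0.5, rng=rng))  # roughly half become zero
print(dropout(layer_out, rate=0.0, rng=rng))  # rate 0: unchanged
```

In Keras this is simply `tf.keras.layers.Dropout(0.5)` placed between layers; at inference time it does nothing.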
Resizing: When working with images of different sizes, you must resize all the images to the same size so that they can be fed into a CNN.
Color Images: Computers interpret color images as 3D arrays.
RGB Image: Color image composed of 3 color channels: Red, Green, and Blue.
Convolutions: When working with RGB images we convolve each color channel with its own convolutional filter. Convolutions on each color channel are performed in the same way as with grayscale images, i.e. by performing element-wise multiplication of the convolutional filter (kernel) and a section of the input array. The result of each convolution is added up together with a bias value to get the convoluted output.
Max Pooling: When working with RGB images we perform max pooling on each color channel using the same window size and stride. Max pooling on each color channel is performed in the same way as with grayscale images, i.e. by selecting the max value in each window.
Validation Set: We use a validation set to check how the model is doing during the training phase. Validation sets can be used to perform Early Stopping to prevent overfitting and can also be used to help us compare different models and choose the best one.
Methods to Prevent Overfitting:
Early Stopping: In this method, we track the loss on the validation set during the training phase and use it to determine when to stop training such that the model is accurate but not overfitting.
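In Keras this is the `EarlyStopping` callback; the underlying rule is simple enough to sketch directly (stop once the validation loss has not improved for `patience` epochs):

```python
def early_stop_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop,
    or None if the validation loss kept improving to the end."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, since_best = loss, 0  # new best: reset the counter
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return None

# Validation loss improves, then rises: stop 2 epochs after the minimum.
print(early_stop_epoch([0.9, 0.7, 0.6, 0.65, 0.7, 0.8], patience=2))  # 5
```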
Image Augmentation: Artificially boosting the number of images in our training set by applying random image transformations to the existing images in the training set.
Dropout: Removing a random selection of a fixed number of neurons in a neural network during training.