Fine-tuning Convolutional Neural Networks for Image Recognition of 25 Bird Species
The bird species dataset contains diverse images across 25 different bird species with the following characteristics:
Total Images
Training Images
Testing Images
Each image was resized to 224×224 pixels with 3 color channels. The preparation pipeline converts the raw bird images to tensors and then normalizes each channel with the ImageNet statistics, so that the inputs match the distribution the pre-trained networks were trained on.
# Data transformation pipeline
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    # ImageNet normalization: per-channel mean first, then standard deviation
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
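To make the normalization step concrete, the sketch below reproduces the per-channel arithmetic that `transforms.Normalize` applies once `ToTensor` has scaled the pixels into [0, 1]: each channel value becomes (x - mean) / std. It is plain Python with no PyTorch dependency. The helper name `normalize_pixel` and the sample pixel are illustrative assumptions, not part of the project code.

```python
# ImageNet per-channel statistics in RGB order, as commonly used with torchvision models.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [value / 255.0 for value in rgb]  # the scaling ToTensor applies to uint8 images
    return [(x - m) / s for x, m, s in zip(scaled, mean, std)]


# Hypothetical sample pixel (illustrative only).
sample = normalize_pixel((200, 120, 40))

# A pixel equal to the channel means maps to approximately zero in every channel.
mean_pixel = normalize_pixel(tuple(m * 255.0 for m in IMAGENET_MEAN))
```

Passing the mean as the first argument matters: if the two tuples are swapped, the inputs are shifted and scaled by the wrong amounts and no longer match what the pre-trained weights expect.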
I implemented and compared three popular CNN architectures, each with different depths and architectural characteristics:
AlexNet: the winner of the 2012 ImageNet challenge, with 8 layers. Used as our baseline model for comparison.
VGGNet: runner-up in the 2014 ImageNet classification challenge (and winner of the localization task), with 16 layers and small 3×3 filters throughout.
ResNet: the 2015 ImageNet challenge winner, with 152 layers and skip (residual) connections.
A key aspect of this project was evaluating different optimization algorithms to find the most effective approach for each model. I tested three optimizers:
Adam: Adaptive Moment Estimation, which combines the benefits of AdaGrad and RMSProp.
SGD: Stochastic Gradient Descent with momentum.
RMSprop: Root Mean Square Propagation, which adapts the learning rate of each parameter based on recent gradients.
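The difference between the three optimizers is easiest to see in their update rules. The following is a minimal sketch, in plain Python, of the textbook updates applied to a one-parameter toy loss f(w) = (w - 3)^2. The toy loss, the step counts, and the hyperparameter values are assumptions chosen for illustration. They are not the settings used in this project, and the PyTorch implementations offer further options not shown here.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, which is minimized at w = 3."""
    return 2.0 * (w - 3.0)


def sgd_momentum(w, steps=300, lr=0.1, momentum=0.9):
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity + grad(w)  # accumulate past gradients
        w -= lr * velocity
    return w


def rmsprop(w, steps=300, lr=0.05, alpha=0.99, eps=1e-8):
    sq_avg = 0.0
    for _ in range(steps):
        g = grad(w)
        sq_avg = alpha * sq_avg + (1 - alpha) * g * g  # running mean of squared gradients
        w -= lr * g / (math.sqrt(sq_avg) + eps)        # step scaled by recent gradient size
    return w


def adam(w, steps=300, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first moment (momentum-like average)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (RMSprop-like average)
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Run each optimizer from the same starting point w = 0.
results = {
    name: fn(0.0)
    for name, fn in [("SGD", sgd_momentum), ("RMSprop", rmsprop), ("Adam", adam)]
}
```

All three end up close to w = 3 on this toy problem. What differs is how the step size is formed: SGD with momentum reuses past gradients directly, RMSprop divides by a running root-mean-square of the gradients, and Adam does both.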
# Optimizer configuration example for ResNet with RMSprop
learning_rate = 1e-4
optimizer = torch.optim.RMSprop(cnn_model.parameters(), lr=learning_rate)
loss_fn = nn.CrossEntropyLoss()
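For readers unfamiliar with the loss used above, here is a small plain-Python sketch of what cross-entropy computes from raw logits: a log-softmax followed by the negative log-likelihood of the true class, averaged over the batch. The function names and example logits are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


def batch_cross_entropy(batch_logits, targets):
    """Mean loss over a batch, mirroring the default 'mean' reduction."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


NUM_CLASSES = 25

# A model that outputs identical logits for every class has loss ln(25).
uniform_loss = cross_entropy([0.0] * NUM_CLASSES, target=7)

# A confident, correct prediction drives the loss toward zero.
confident = [0.0] * NUM_CLASSES
confident[7] = 10.0
confident_loss = cross_entropy(confident, target=7)
```

With 25 classes, a loss near ln(25), about 3.22, corresponds to uniform guessing, which is a useful reference point when reading the early part of a loss curve.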
Accuracy curves for AlexNet, VGGNet and ResNet across different optimizers showing training and test accuracy over epochs.
Training and testing loss curves showing convergence patterns for each model-optimizer combination.
Confusion matrices visualizing classification performance across all 25 bird species.
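As a reference for how a confusion matrix is built and read, the sketch below counts (true class, predicted class) pairs and derives overall accuracy and per-class recall from the counts. It uses a hypothetical 3-class toy example rather than the 25 bird species, and the labels are made up for illustration. They are not results from this project.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows index the true class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth, pred in zip(true_labels, predicted_labels):
        matrix[truth][pred] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was classified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for a 3-class toy problem (the project itself uses 25 classes).
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
```

Off-diagonal entries show which classes are mistaken for one another, which is often more informative than a single accuracy number when some species look alike.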
All models were implemented in PyTorch and fine-tuned from ImageNet pre-trained weights. Each model's architecture was modified to output 25 classes, one per bird species; the ResNet-152 version is shown here:
import torch
import torch.nn as nn
from torchvision import models


class ResNet152(nn.Module):
    def __init__(self, num_classes, pretrained=True):
        super(ResNet152, self).__init__()
        # Honor the constructor argument instead of always loading weights
        # (newer torchvision releases prefer the weights= argument)
        net = models.resnet152(pretrained=pretrained)
        # Modify final layer for our classification task
        num_features = net.fc.in_features
        net.fc = nn.Linear(num_features, num_classes)
        # Transfer components from pre-trained model
        self.conv1 = net.conv1
        self.bn1 = net.bn1
        self.relu = net.relu
        self.maxpool = net.maxpool
        self.layer1 = net.layer1
        self.layer2 = net.layer2
        self.layer3 = net.layer3
        self.layer4 = net.layer4
        self.avgpool = net.avgpool
        self.fc = net.fc

    def forward(self, x):
        # Stem: convolution, batch norm, activation, pooling
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        # Four residual stages
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        # Global pooling, flatten to (batch, features), classify
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x
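To follow what the forward pass does to a 224×224 input, the sketch below traces the channel count and spatial size after each stage using the standard convolution output-size formula. It assumes torchvision's usual ResNet-152 hyperparameters (a 7×7 stride-2 stem, a 3×3 stride-2 max pool, and one stride-2 transition at the start of each of layer2 through layer4). The function names are hypothetical, and the code needs no PyTorch installation.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet152_shape_trace(input_size=224, num_classes=25):
    """Return (stage, channels, spatial size) after each step of the forward pass."""
    trace = []
    size = conv_out(input_size, kernel=7, stride=2, padding=3)  # conv1
    trace.append(("conv1/bn1/relu", 64, size))
    size = conv_out(size, kernel=3, stride=2, padding=1)        # maxpool
    trace.append(("maxpool", 64, size))
    trace.append(("layer1", 256, size))                         # stride 1, size unchanged
    for name, channels in [("layer2", 512), ("layer3", 1024), ("layer4", 2048)]:
        size = conv_out(size, kernel=3, stride=2, padding=1)    # strided 3x3 conv in first block
        trace.append((name, channels, size))
    trace.append(("avgpool", 2048, 1))                          # adaptive pooling to 1x1
    trace.append(("fc", num_classes, 1))                        # one score per bird species
    return trace


trace = resnet152_shape_trace()

# Weights plus biases of the replaced final nn.Linear(2048, 25) layer.
head_parameters = 2048 * 25 + 25
```

Under these assumptions the spatial size shrinks from 224 to 7 before global pooling, `fc.in_features` is 2048, and the new classification head is the only part of the network that starts from random initialization.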
Throughout this project, I gained valuable insights into:
After comprehensive testing across all model-optimizer combinations, these were the key findings:
Based on my research, I found that ResNet with RMSprop is the most effective combination for bird species classification, achieving: