Landmark Classification & Tagging

📌 Project Overview

In the age of social media, automatically identifying the location of a photo is a valuable feature for tagging and organization. This project focuses on building a robust landmark classifier capable of identifying global landmarks from raw image data.

The project was implemented in two main phases:

CNN from Scratch: Designing and training a custom Convolutional Neural Network architecture to achieve a baseline accuracy of over 50% on 50 landmark classes.
Transfer Learning: Leveraging a pre-trained ResNet18 model to significantly boost classification performance through fine-tuning.

Technologies Used: Python, PyTorch, TorchScript, Jupyter Notebooks, Matplotlib, Seaborn.

🚀 Key Features & Implementation

1. Custom CNN Architecture

Developed a multi-block CNN backbone inspired by VGG architectures:

Convolutional Blocks: Multiple layers with $3 \times 3$ filters and padding to maintain spatial dimensions.
Batch Normalization: Integrated to accelerate training and improve stability.
Adaptive Average Pooling: Used to produce a fixed-size embedding vector regardless of input image dimensions.
MLP Head: A fully connected head with Dropout to prevent overfitting.
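
The four components above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual network: the class name `LandmarkCNN`, the channel widths, the single convolution per block, the max-pooling step, and the dropout rate are all assumptions.

```python
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Conv -> BatchNorm -> ReLU -> 2x2 max-pool (the pool is an assumption)."""
    return nn.Sequential(
        # 3x3 filters with padding=1 keep height and width unchanged
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),  # halves the spatial resolution between blocks
    )


class LandmarkCNN(nn.Module):
    """Hypothetical VGG-style classifier for 50 landmark classes."""

    def __init__(self, num_classes: int = 50, dropout: float = 0.5):
        super().__init__()
        self.backbone = nn.Sequential(
            conv_block(3, 32),
            conv_block(32, 64),
            conv_block(64, 128),
            conv_block(128, 256),
        )
        # Collapses any H x W feature map to 1 x 1, giving a 256-d embedding
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(dropout),
            nn.Linear(512, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.pool(self.backbone(x)))


model = LandmarkCNN().eval()
out_224 = model(torch.randn(2, 3, 224, 224))
out_160 = model(torch.randn(2, 3, 160, 160))  # a different input size also works
```

Because the adaptive pooling layer fixes the embedding size, both input resolutions produce logits of the same shape, which is the property the bullet on Adaptive Average Pooling describes.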

2. High-Performance Transfer Learning

Replaced the custom backbone with a pre-trained ResNet18 model:

Feature Extraction: Froze the early layers of ResNet to preserve generalized visual features learned from ImageNet.
Classifier Refinement: Redesigned the final linear layer to map to the specific 50 landmark categories.
Result: Reached a test accuracy of ~73%, up from the from-scratch model's baseline of over 50%.

3. Advanced Preprocessing Pipeline

Implemented a robust data augmentation strategy to improve model generalization:

Training Augmentation: Utilized RandomResizedCrop, RandomHorizontalFlip, ColorJitter, and RandAugment.
Normalization: Normalized images using the per-channel mean and standard deviation of the entire dataset, so inputs are zero-centered and similarly scaled, which aids gradient descent.

4. Production Readiness with TorchScript

Model Export: Exported the trained model using TorchScript for efficient deployment without requiring a Python interpreter.
Predictor Wrapper: Engineered a Predictor class that encapsulates the preprocessing, inference, and post-processing (Softmax) logic into a single serialized artifact.
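
One way such a wrapper could be written is sketched below. Only the class name `Predictor` comes from the project; the constructor arguments, the 224-pixel resize, the normalization constants, and the tiny stand-in classifier are assumptions made to keep the example self-contained. Preprocessing uses plain tensor operations so the whole module can be compiled by `torch.jit.script`.

```python
import io
from typing import List

import torch
import torch.nn as nn
import torch.nn.functional as F


class Predictor(nn.Module):
    """Bundles preprocessing, the classifier, and softmax into one module."""

    def __init__(self, model: nn.Module, mean: List[float], std: List[float]):
        super().__init__()
        self.model = model.eval()
        # Buffers are serialized together with the model weights
        self.register_buffer("mean", torch.tensor(mean).view(1, 3, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, 3, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Expects a batch of shape (N, 3, H, W) with pixel values in [0, 255]
        x = x.float() / 255.0
        x = F.interpolate(x, size=[224, 224], mode="bilinear", align_corners=False)
        x = (x - self.mean) / self.std
        logits = self.model(x)
        return F.softmax(logits, dim=1)  # post-processing: class probabilities


# Stand-in classifier (hypothetical); the trained network would go here
classifier = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 50),
)

scripted = torch.jit.script(Predictor(classifier, [0.5, 0.5, 0.5], [0.25, 0.25, 0.25]))

# Round-trip through an in-memory buffer; a file path works the same way
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
loaded = torch.jit.load(buffer)

batch = torch.randint(0, 256, (2, 3, 300, 400), dtype=torch.uint8)
probs = loaded(batch)
```

The loaded artifact accepts raw pixel tensors and returns probabilities, so the calling code does not need to know the normalization constants or the input size the classifier was trained on.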

📊 Evaluation

The model’s performance was analyzed using:

Confusion Matrix: Visualizing which classes are mistaken for one another, and deriving class-wise precision and recall, to identify similar-looking landmarks.
Top-5 Accuracy: Measuring the model’s ability to include the correct landmark within its top 5 predictions.
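
Both metrics can be computed directly from model outputs, as in the sketch below. The function names are illustrative, and the tiny three-class tensors are made-up inputs used only to show the calculation, not project results.

```python
import torch


def confusion_matrix(preds: torch.Tensor, targets: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Entry [i, j] counts samples whose true class is i and predicted class is j."""
    idx = targets * num_classes + preds
    counts = torch.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def precision_recall(cm: torch.Tensor):
    """Per-class precision (column-wise) and recall (row-wise) from a confusion matrix."""
    tp = cm.diag().float()
    precision = tp / cm.sum(dim=0).clamp(min=1)  # clamp avoids division by zero
    recall = tp / cm.sum(dim=1).clamp(min=1)
    return precision, recall


def topk_accuracy(logits: torch.Tensor, targets: torch.Tensor, k: int = 5) -> float:
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    topk = logits.topk(k, dim=1).indices
    hits = (topk == targets.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()


# Toy example with 3 classes (illustrative values only)
targets = torch.tensor([0, 0, 1, 2])
preds = torch.tensor([0, 1, 1, 2])
cm = confusion_matrix(preds, targets, num_classes=3)
precision, recall = precision_recall(cm)

logits = torch.tensor([[0.1, 0.7, 0.2], [0.8, 0.15, 0.05]])
top2 = topk_accuracy(logits, torch.tensor([2, 1]), k=2)
top1 = topk_accuracy(logits, torch.tensor([2, 1]), k=1)
```

Large off-diagonal entries in `cm` point to pairs of landmarks the model confuses; the matrix can be passed to a Seaborn heatmap for visual inspection.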

💡 What I Learned

Architecture Design Trade-offs: Gained experience in balancing model depth and complexity with training time and resource constraints.
The Power of Transfer Learning: Empirically verified how pre-trained models can drastically reduce the amount of data and compute needed for specialized tasks.
Deployment Best Practices: Learned to use TorchScript to bridge the gap between experimentation in Jupyter and production-ready code.
