NKumar-B/CODSOFT_ImageCaptioning

Image Captioning

Project Overview

This project combines Computer Vision and Natural Language Processing (NLP) to automatically generate descriptive text captions for images. Rather than training a model from scratch, it uses a pre-trained BLIP transformer model, so captions can be generated without a training step.

Key Features

  • Image Processing: Loads and processes standard image formats (JPEG, PNG).
  • Transformer Architecture: Uses the Bootstrapping Language-Image Pre-training (BLIP) model to extract image features and generate caption text.
  • Interactive CLI: Provides a simple command-line interface where users enter an image path and receive a generated caption.
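
The image-processing feature could look roughly like the sketch below. It is an illustration only, not the repository's code: the names `SUPPORTED_EXTENSIONS`, `is_supported_image`, and `load_rgb_image` are hypothetical, and the extension check assumes that JPEG and PNG are the only accepted formats.

```python
from pathlib import Path

# Assumption: only the formats named in the feature list are accepted.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported_image(path):
    """Return True if the file extension is JPEG or PNG (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def load_rgb_image(path):
    """Open an image with Pillow and return a copy converted to RGB."""
    # Imported lazily so the extension check works even without Pillow.
    from PIL import Image

    if not is_supported_image(path):
        raise ValueError(f"Unsupported image format: {path}")
    with Image.open(path) as img:
        return img.convert("RGB")
```

Converting to RGB is a common precaution because PNG files may carry an alpha channel or a palette, while vision models typically expect three-channel input.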

Technologies & Libraries Used

  • Language: Python 3.x
  • Deep Learning Framework: torch (PyTorch)
  • Pre-trained Models: Salesforce/blip-image-captioning-base via the Hugging Face transformers library
  • Image Handling: Pillow (PIL)
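
A minimal sketch of how these pieces could fit together is shown below. It uses the `BlipProcessor` and `BlipForConditionalGeneration` classes from transformers with the model ID listed above. The function names `load_captioner` and `generate_caption` and the `max_new_tokens=30` limit are assumptions for illustration and may differ from what `image_captioning.py` actually does.

```python
MODEL_ID = "Salesforce/blip-image-captioning-base"


def load_captioner(model_id=MODEL_ID):
    """Load the BLIP processor and model (downloaded and cached on first use)."""
    # Imported lazily so this module can be imported before the
    # dependencies are installed.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return processor, model


def generate_caption(image, processor, model, max_new_tokens=30):
    """Generate a caption for a PIL image in RGB mode.

    max_new_tokens=30 is an assumed limit, not a value taken from the project.
    """
    # The processor resizes and normalizes the image into PyTorch tensors.
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) generated sequence back to text.
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Loading the processor and model once and reusing them for every image avoids repeating the most expensive step on each caption request.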

How to Run Locally

1. Install Dependencies

Before running the script, ensure you have the required external libraries installed:

pip install transformers torch Pillow
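
To confirm that the installation succeeded, a quick check along the following lines may help. This is an optional sketch and not part of the repository; `REQUIRED_MODULES` and `missing_modules` are hypothetical names.

```python
import importlib.util

# Import names for the packages installed by the pip command above.
# Note that the Pillow package is imported as "PIL".
REQUIRED_MODULES = ["transformers", "torch", "PIL"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```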

2. Execute the Script

Run the Python file from your terminal:

python image_captioning.py

3. Generate Captions

When prompted, enter the absolute or relative path to the image file you want to caption. The script downloads the model (on the first run only), processes the image, and prints the generated caption.
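
The prompt loop described above might be structured as in the following sketch. The prompt text, the `quit` exit word, and the `run_cli` name are assumptions, not details taken from the script. The captioning step is passed in as `caption_fn` so that the loop can be exercised without loading the model.

```python
import os


def run_cli(caption_fn, input_fn=input, output_fn=print):
    """Prompt for image paths until the user types 'quit' (hypothetical exit word)."""
    while True:
        path = input_fn("Enter image path (or 'quit'): ").strip()
        if path.lower() == "quit":
            break
        # Accepts both absolute and relative paths.
        if not os.path.isfile(path):
            output_fn(f"File not found: {path}")
            continue
        output_fn(f"Caption: {caption_fn(path)}")
```

In real use, `caption_fn` would be a function that loads the image at the given path and returns the caption produced by the BLIP model.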

Author

Nithin Kumar Badduluri
