jonleinena/mask2former


Vision Transformer Training and Inference

Welcome to the Vision Transformer Training and Inference repository! This project provides training scripts for pretrained vision transformers such as Mask2Former and SegFormer. Inference pipelines for these models are planned.

Introduction

Vision transformers apply the transformer architecture to image tasks such as segmentation. This repository provides scripts to train pretrained models such as Mask2Former and SegFormer on custom datasets. Inference scripts are not included yet.

Features

  • Training Scripts: Easily train vision transformers on your custom datasets.
  • Inference Pipelines (planned): Perform inference using trained models.
  • Customizable: Modify training parameters and augmentations to suit your needs.
  • Preprocessing: Includes image preprocessing and augmentation techniques.

Installation

To get started, clone the repository and install the required dependencies:

git clone https://github.com/jonleinena/mask2former.git
cd mask2former
pip install -r requirements.txt

Usage

Training

To train a model, use the train.py script. You can specify various parameters such as dataset path, image size, model name, and more.

python3 train.py --dataset_path /path/to/dataset --img_size 1024 1024 --model_name_or_path facebook/mask2former-swin-small-ade-semantic --output_path weights --learning_rate 0.0001 --epochs 10
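
The flags in the command above could be parsed along the following lines. This is a minimal sketch, not the actual contents of train.py: the function name build_parser, the help texts, and all default values are assumptions made for illustration, and the README does not say whether --img_size is given as height then width or the reverse.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Train a segmentation model (sketch).")
    # Only the flag names come from the README; defaults below are assumed.
    parser.add_argument("--dataset_path", required=True,
                        help="Root folder of the dataset.")
    # Two integers; the order of the two dimensions is not stated in the README.
    parser.add_argument("--img_size", type=int, nargs=2, default=[1024, 1024],
                        metavar=("DIM1", "DIM2"),
                        help="Target image size as two integers.")
    parser.add_argument("--model_name_or_path",
                        default="facebook/mask2former-swin-small-ade-semantic",
                        help="Pretrained checkpoint name or local path.")
    parser.add_argument("--output_path", default="weights",
                        help="Folder where trained weights are written.")
    parser.add_argument("--learning_rate", type=float, default=1e-4)
    parser.add_argument("--epochs", type=int, default=10)
    return parser


# Parse an explicit argument list so the example runs without a command line.
example_args = build_parser().parse_args(
    ["--dataset_path", "/path/to/dataset", "--img_size", "512", "512", "--epochs", "3"]
)
print(vars(example_args))
```

Using nargs=2 with type=int makes argparse reject a call that passes only one number or a non-integer value, which matches the two-value form of --img_size in the command.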

Inference

Inference scripts will be added soon. Stay tuned!

To-Do List

  • Implement training script for Mask2Former
  • Implement training script for SegFormer
  • Add inference pipeline for Mask2Former
  • Add inference pipeline for SegFormer
  • Add support for more vision transformers
  • Improve documentation and add examples

Contributing

The project is still at an early stage, so contributions are not being accepted yet. This may change in the future.

License

This project is licensed under the MIT License. See the LICENSE file for details.
