koren-v/Image2Text

Image-To-Text

TL;DR: The model was trained on COCO2014 and then on InstaPIC-1.1M. After that, following an excellent guide (in Russian), it was deployed on Google Cloud and wrapped in a Telegram bot, so you can try it until April 2021 (when the credits run out =)).

It seems that the weights currently used by the bot were taken from an older checkpoint (from when a hard memory error on the VM was being fixed), so the captions can differ.

This repository contains an implementation of a simple image-to-text framework with a pretrained encoder and an LSTM decoder.

More specifically:

The transformer_research branch contains attempts to use the Transformer encoder-decoder architecture [https://arxiv.org/abs/1706.03762] instead of the LSTM, but the current solution is not valid.
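
To make the encoder-plus-decoder flow concrete, here is a minimal, framework-agnostic sketch of how a caption can be produced at inference time: image features initialise the decoder state, and tokens are then generated one at a time until an end token appears. This is not the repository's code. The function `greedy_caption`, the toy `toy_init_state` and `toy_step` stand-ins, the tiny vocabulary, and the choice of greedy decoding are all assumptions made for illustration. In a real model `step` would be a single LSTM step followed by a projection onto the vocabulary.

```python
from typing import Any, Callable, List, Sequence, Tuple


def greedy_caption(
    features: Sequence[float],
    init_state: Callable[[Sequence[float]], Any],
    step: Callable[[int, Any], Tuple[List[float], Any]],
    bos_id: int,
    eos_id: int,
    max_len: int = 20,
) -> List[int]:
    """Generate token ids one by one, always taking the highest-scoring token."""
    state = init_state(features)  # decoder state derived from encoder features
    token = bos_id                # decoding starts from the begin-of-sentence token
    caption: List[int] = []
    for _ in range(max_len):
        scores, state = step(token, state)
        # Greedy choice: index of the largest score.
        token = max(range(len(scores)), key=scores.__getitem__)
        if token == eos_id:
            break
        caption.append(token)
    return caption


# --- Toy stand-ins (hypothetical, only so the sketch runs end to end) ---
VOCAB = ["<bos>", "<eos>", "a", "dog", "runs"]
SCRIPT = [2, 3, 4, 1]  # the toy decoder always emits: a dog runs <eos>


def toy_init_state(features: Sequence[float]) -> int:
    # A real decoder would map image features to an LSTM state; here the
    # "state" is just the current position in SCRIPT.
    return 0


def toy_step(token: int, state: int) -> Tuple[List[float], int]:
    # Score 1.0 for the scripted token at this position, 0.0 for all others.
    target = SCRIPT[min(state, len(SCRIPT) - 1)]
    scores = [1.0 if i == target else 0.0 for i in range(len(VOCAB))]
    return scores, state + 1


ids = greedy_caption([0.1, 0.2], toy_init_state, toy_step, bos_id=0, eos_id=1)
print(" ".join(VOCAB[i] for i in ids))
```

Passing the decoder in as two callables keeps the loop independent of any particular framework. Swapping greedy selection for sampling or beam search would only change the line that picks `token`.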


Some captions generated by the two models for the same pictures:

COCO InstaPIC

Some funny captions from the InstaPIC model:
