Image-To-Text

TL;DR: The model was trained on COCO2014 and then on InstaPIC-1.1M. After that, following an amazing guide (in Russian), the model was deployed on Google Cloud and wrapped in a Telegram bot, so you can try it until April 2021 (when the credits run out =)).

It seems that the weights currently used by the bot were taken from an older checkpoint (saved when a hard memory error on the VM was fixed), so the captions can differ.

This repository contains an implementation of a simple image-to-text framework with a pretrained encoder and an LSTM decoder.

More specifically:

Encoder: pretrained ResNet-101 (the last classification layer was cut off, and a linear layer followed by batch normalization was added) [https://arxiv.org/abs/1512.03385]
Decoder: 3-layer unidirectional LSTM, trained using teacher forcing [Hochreiter & Schmidhuber, Long Short-Term Memory]
The decoder embedding matrix was initialized with GloVe vectors (Wikipedia 2014 + Gigaword 5: 6B tokens) [https://nlp.stanford.edu/pubs/glove.pdf]
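
To make the decoder side more concrete, here is a minimal, framework-free sketch of two ideas from the list above: filling an embedding matrix from GloVe vectors and building teacher-forcing input/target pairs. It is not the code from this repository. The special tokens, the function names (build_vocab, init_embeddings, teacher_forcing_pair), the random fallback for words missing from GloVe, and the toy 3-dimensional vectors are all assumptions made for illustration.

```python
import random

# Hypothetical special tokens; the repository's vocabulary.py may use different ones.
PAD, START, END, UNK = "<pad>", "<start>", "<end>", "<unk>"


def build_vocab(captions):
    """Map every token seen in the captions to an integer id."""
    word2idx = {tok: i for i, tok in enumerate([PAD, START, END, UNK])}
    for caption in captions:
        for word in caption.lower().split():
            word2idx.setdefault(word, len(word2idx))
    return word2idx


def init_embeddings(word2idx, glove, dim, seed=0):
    """Copy GloVe vectors where available; use small random vectors otherwise.

    The random fallback is an assumption, not necessarily what the repo does.
    The padding row is left at zero.
    """
    rng = random.Random(seed)
    matrix = [[0.0] * dim for _ in word2idx]
    for word, idx in word2idx.items():
        if word in glove:
            matrix[idx] = list(glove[word])
        elif word != PAD:
            matrix[idx] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return matrix


def teacher_forcing_pair(caption, word2idx):
    """Return (decoder inputs, targets); targets are the inputs shifted by one step."""
    ids = [word2idx.get(w, word2idx[UNK]) for w in caption.lower().split()]
    full = [word2idx[START]] + ids + [word2idx[END]]
    return full[:-1], full[1:]


captions = ["a dog on the beach"]
word2idx = build_vocab(captions)
# Toy stand-in for real GloVe vectors, which have 50 to 300 dimensions.
toy_glove = {"dog": [0.1, 0.2, 0.3], "beach": [0.4, 0.5, 0.6]}
embeddings = init_embeddings(word2idx, toy_glove, dim=3)
inputs, targets = teacher_forcing_pair(captions[0], word2idx)
```

With teacher forcing, the decoder is fed the ground-truth previous token at every step (inputs) and is trained to predict the next one (targets), rather than being fed its own predictions.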

The transformer_research branch contains attempts to use a Transformer encoder-decoder architecture [https://arxiv.org/abs/1706.03762] instead of the LSTM, but the current solution there is not valid.

Some captions generated by the two models for the same pictures:

COCO	InstaPIC

Some funny captions by the InstaPIC model:
