Tran Le, Stella (Seoyeon) Lee Grinnell College CSC 395 Information Retrieval
This repository contains the implementation of a question-answering model whose primary goal is to understand sequential context.
Dependency
sys, os, pickle, csv
numpy, time, re
tensorflow 2.0.0
tensorflow.keras
This folder contains fetch_glove.sh and get_bAbi.sh, which are used to download the GloVe pre-trained word embeddings and the Facebook bAbi dataset.
- preprocessing contains Python files to preprocess the bAbi dataset
- baseline_model contains Python files to train and test LSTM models
- model contains Python files for custom Keras layers and models
- train.py is the Python file used for training our custom QA models
get_glove.py contains functions to load GloVe embeddings and the embedding matrix
- load_vectors(filepath): Load a GloVe text file into a dictionary of {word: embedding}
- load_embedding_matrix(data_folder, dims): Load embedding vectors into an embedding matrix and create a word-index mapping
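
For orientation, here is a minimal sketch of what loading GloVe vectors and building an embedding matrix can look like. It is not the repository's code: it assumes the usual GloVe text layout (a word followed by space-separated floats on each line), reads from any iterable of lines instead of a file path, and uses a hypothetical helper name, build_embedding_matrix, that takes the dictionary directly rather than a data_folder. Reserving row 0 for padding is also an assumption of the sketch.

```python
import io

import numpy as np


def load_vectors(lines):
    """Parse GloVe-style text lines into a {word: embedding} dictionary.

    `lines` is any iterable of strings (an open file works). This differs
    from the repository's load_vectors(filepath), which takes a path.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vector = np.array([float(x) for x in parts[1:]], dtype=np.float32)
        embeddings[parts[0]] = vector
    return embeddings


def build_embedding_matrix(embeddings, dims):
    """Stack the vectors into a matrix and return it with a word-index map.

    Hypothetical helper: row 0 stays all zeros so that index 0 can serve
    as the padding id (an assumption of this sketch).
    """
    word_index = {word: i + 1 for i, word in enumerate(embeddings)}
    matrix = np.zeros((len(word_index) + 1, dims), dtype=np.float32)
    for word, idx in word_index.items():
        matrix[idx] = embeddings[word]
    return matrix, word_index


# Tiny in-memory stand-in for a GloVe file (made-up values).
sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
vectors = load_vectors(sample)
matrix, word_index = build_embedding_matrix(vectors, dims=3)
```

With a real GloVe file, the same parser could be fed an open file handle in place of the in-memory sample.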
process_bAbi.py contains functions to save each of the tasks into Context, Question, and Answer
preprocessing.py
- transform(input, max_len, tokenizer): Returns the padded sequence that is transformed from input by tokenizer.texts_to_sequences
- main(dim, embedding_folder, data_folder): Saves the embedding matrix and tokenizer
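
To make the transform step concrete, the sketch below shows the idea of turning text into fixed-length integer sequences. It is a plain-Python stand-in and does not call the Keras tokenizer: a hypothetical word_index dictionary replaces the tokenizer, unknown words are dropped, and padding and truncation happen at the front, which mirrors the defaults of the Keras pad_sequences utility. The vocabulary and sentences are made up for illustration.

```python
def texts_to_sequences(texts, word_index):
    """Map each whitespace-tokenized text to a list of integer ids.

    Words missing from word_index are dropped, as the Keras Tokenizer
    does when no out-of-vocabulary token is configured.
    """
    return [
        [word_index[w] for w in text.lower().split() if w in word_index]
        for text in texts
    ]


def pad(sequences, max_len, value=0):
    """Left-pad or left-truncate every sequence to max_len ids (max_len >= 1)."""
    padded = []
    for seq in sequences:
        seq = seq[-max_len:]  # keep only the last max_len ids
        padded.append([value] * (max_len - len(seq)) + seq)
    return padded


def transform(texts, max_len, word_index):
    """Hypothetical analogue of transform(input, max_len, tokenizer)."""
    return pad(texts_to_sequences(texts, word_index), max_len)


# Made-up vocabulary; id 0 is reserved for padding.
word_index = {"mary": 1, "went": 2, "to": 3, "the": 4, "kitchen": 5}
batch = transform(["Mary went to the kitchen", "the garden"], 4, word_index)
print(batch)
```

The first sentence is longer than max_len, so its leading id is cut off; the second contains one known word and is padded with zeros at the front.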


