LLMSeqRec: An LLM-Enhanced Contextual Sequential Recommender

LLMSeqRec is an innovative framework that integrates Large Language Model (LLM)-generated semantic embeddings with traditional sequential recommendation architectures. By leveraging deep semantic representations extracted from rich textual item metadata, LLMSeqRec improves recommendation accuracy, especially in cold-start and sparse data scenarios. This project implements both the proposed LLMSeqRec and a baseline SASRec model, using the MovieLens dataset to evaluate performance.

Table of Contents

  • Overview
  • Folder Structure
  • Requirements
  • Installation
  • Data Preparation
  • Usage
  • Plotting Results
  • Future Scope
  • Citation

Overview

LLMSeqRec enhances sequential recommendation by fusing two embedding modalities:

  • LLM-based Embeddings: Pretrained semantic embeddings generated from item descriptions.
  • ID-based Embeddings: Learnable collaborative embeddings.

The model employs a Transformer-based architecture (inspired by SASRec) with causal masking and multiple self-attention layers. This project includes training and evaluation scripts for both LLMSeqRec and a baseline SASRec model, alongside modules for logging and result visualization.
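
As an illustration of the fusion, the sketch below combines a frozen LLM embedding table with a learnable ID table by projecting the semantic vectors into the model dimension and summing. This is a minimal sketch under stated assumptions (padding id 0, fusion by summation, and the function names are illustrative); the actual implementation is in models/llmseqrec.py.

```python
import numpy as np
import tensorflow as tf

d_model = 64  # model dimension (illustrative)

# Precomputed semantic vectors, one row per item (see Data Preparation below).
llm_emb = np.load("data/processed/llm_embeddings.npy").astype("float32")
num_items, llm_dim = llm_emb.shape

# ID-based embeddings: learnable collaborative table (+1 row for padding id 0).
id_table = tf.keras.layers.Embedding(num_items + 1, d_model)

# LLM-based embeddings: frozen table, with a zero row prepended for padding.
llm_table = tf.constant(np.vstack([np.zeros((1, llm_dim), "float32"), llm_emb]))

# Trainable projection mapping semantic vectors into the model dimension.
project = tf.keras.layers.Dense(d_model)

def fused_embeddings(item_ids):                # item_ids: (batch, seq) int32
    id_vecs = id_table(item_ids)               # (batch, seq, d_model)
    llm_vecs = tf.gather(llm_table, item_ids)  # (batch, seq, llm_dim)
    return id_vecs + project(llm_vecs)         # fuse by summation
```

Summation keeps the downstream Transformer unchanged; concatenation followed by a linear projection is a common alternative.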

Folder Structure

LLMSeqRec/
├── data/
│   ├── raw/                   # Raw MovieLens dataset files
│   └── processed/             # Preprocessed CSV files (train_sequences.csv, val_sequences.csv) and LLM embeddings (llm_embeddings.npy)
├── eval/
│   ├── evaluate.py            # Evaluation script for LLMSeqRec (and baseline, if needed)
│   └── analysis.py            # Scripts for plotting training and evaluation results
├── logs/
│   ├── llmseqrec_train_log.csv
│   ├── sasrec_train_log.csv
│   ├── llmseqrec_metrics.csv
│   └── sasrec_metrics.csv
├── models/
│   ├── llmseqrec.py           # LLMSeqRec model implementation
│   └── sasrec.py              # Baseline SASRec model implementation
├── train/
│   ├── train_llmseqrec.py     # Training script for LLMSeqRec model
│   └── train_sasrec.py        # Training script for SASRec baseline
├── README.md
└── sources.bib                # BibTeX references for literature review

Requirements

  • Python 3.7+
  • TensorFlow 2.x
  • Pandas, NumPy
  • Matplotlib (for plotting)
  • Additional packages as listed in requirements.txt

Installation

  1. Clone the repository:
    git clone https://github.com/yourusername/LLMSeqRec.git
    cd LLMSeqRec
  2. (Optional) Create and activate a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install required packages:
    pip install -r requirements.txt

Data Preparation

  1. Download the MovieLens dataset from GroupLens (https://grouplens.org/datasets/movielens/).
  2. Place raw files in the data/raw/ folder.
  3. Run the data preprocessing scripts to generate the processed CSV files and compute the LLM embeddings (stored in data/processed/); a sketch of this step follows the list.
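
The repository's preprocessing scripts are authoritative; as a rough illustration, a chronological leave-one-out split over the MovieLens 1M ratings file might look like the sketch below (the column names and the output CSV layout are assumptions):

```python
import pandas as pd

# MovieLens 1M ships ratings.dat with "::"-separated fields.
ratings = pd.read_csv(
    "data/raw/ratings.dat", sep="::", engine="python",
    names=["user_id", "item_id", "rating", "timestamp"],
)

# Order each user's interactions chronologically and collect item sequences.
ratings = ratings.sort_values(["user_id", "timestamp"])
seqs = ratings.groupby("user_id")["item_id"].apply(list)

# Leave-one-out split: the last item of each sequence is held out for validation.
train = seqs.apply(lambda s: s[:-1])
val = seqs.apply(lambda s: s[-1])

pd.DataFrame({"user_id": seqs.index, "sequence": train.values}).to_csv(
    "data/processed/train_sequences.csv", index=False)
pd.DataFrame({"user_id": seqs.index, "target": val.values}).to_csv(
    "data/processed/val_sequences.csv", index=False)
```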

Usage

Training

  • To train the LLMSeqRec model:
    python -m LLMSeqRec.train.train_llmseqrec
  • To train the SASRec baseline:
    python -m LLMSeqRec.train.train_sasrec

Training logs and metrics are saved in the logs/ folder.

Evaluation

  • To evaluate the LLMSeqRec model on the validation set:
    python -m LLMSeqRec.eval.evaluate --val_csv "data/processed/val_sequences.csv" --emb_path "data/processed/llm_embeddings.npy" --top_k 10
  • To evaluate the SASRec baseline, modify the model import in the evaluation script accordingly. A sketch of the top-K metrics is shown below.
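
The --top_k flag controls the ranking cutoff. For reference, here is a minimal NumPy sketch of the usual top-K metrics, HR@K and NDCG@K with a single held-out target per user (the exact metrics reported by eval/evaluate.py may differ):

```python
import numpy as np

def hit_rate_and_ndcg(scores, targets, k=10):
    """scores: (num_users, num_items) predicted scores;
    targets: (num_users,) index of each user's held-out item."""
    top_k = np.argsort(-scores, axis=1)[:, :k]  # top-K item indices per user
    hits = top_k == targets[:, None]            # (num_users, k) boolean matches
    hit_any = hits.any(axis=1)
    hr = hit_any.mean()                         # HR@K
    ranks = np.argmax(hits, axis=1)             # 0-based rank of the hit
    dcg = np.where(hit_any, 1.0 / np.log2(ranks + 2), 0.0)
    return hr, dcg.mean()                       # NDCG@K (single relevant item)
```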

Plotting Results

Generate visual comparisons of training loss and evaluation metrics:

python -m LLMSeqRec.eval.analysis

This script reads CSV logs from the logs/ folder and creates plots in the logs/ directory.
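
For orientation, the comparison might resemble the sketch below, assuming the training logs contain epoch and loss columns (the actual column names in the CSVs may differ):

```python
import pandas as pd
import matplotlib.pyplot as plt

llm_log = pd.read_csv("logs/llmseqrec_train_log.csv")
sas_log = pd.read_csv("logs/sasrec_train_log.csv")

# Overlay the two training-loss curves for a direct comparison.
plt.plot(llm_log["epoch"], llm_log["loss"], label="LLMSeqRec")
plt.plot(sas_log["epoch"], sas_log["loss"], label="SASRec")
plt.xlabel("Epoch")
plt.ylabel("Training loss")
plt.legend()
plt.savefig("logs/training_loss_comparison.png")
```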

Future Scope

Future work may include integrating knowledge graphs to further enhance the semantic representation of items. By incorporating structured domain knowledge, such as genre hierarchies and actor/director relationships, the model could better capture intricate item interrelations and provide more explainable recommendations. Additionally, multimodal data such as images and reviews could be integrated to create richer, multi-dimensional embeddings. Optimizing the model for real-time recommendation and scaling it to larger, more diverse datasets also represent promising avenues for further research.

Citation

Please refer to sources.bib for detailed citations of all referenced research papers.

