Samsara is a proof of concept for a novel query optimizer aimed at enabling Multimodal Stream Processing Systems with the help of Large Language Models (LLMs). LLMs are already being employed to support multimodal query processing over databases, but their inherent latency makes them unsuitable for the time-sensitive scenarios of Stream Processing. Samsara tackles this challenge by introducing a three-step optimization process that reduces the amount of information sent to the expensive AI-based operators.
To run the experiments you need access to a VLLM (Visual Large Language Model). Alternatively, you can run the experiments in `Test` mode, in which the VLLM's answers are substituted by random choices.
The `config.json` file at the root of this project contains configuration parameters for the queries, alongside which queries will be executed when launching the experiments and the VLLM that will be used. To change the VLLM, modify the `"LLM"` key in the JSON file and enter the name of your VLLM. Additionally, you need to extend the `llm_call.py` file with the API provided by your chosen VLLM. It is imperative that the name given to the `"LLM"` field in `config.json` and the name you use in `llm_call.py` match.
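The internal interface of `llm_call.py` is not documented here, so the following is only a minimal sketch of how a new backend might be wired in, assuming the file dispatches on the model name taken from `config.json`. The function name `llm_call` and the helper `query_my_vllm` are hypothetical placeholders, not the project's actual API:

```python
import random


def query_my_vllm(prompt: str, frame_path: str) -> str:
    """Hypothetical helper: call your chosen VLLM's API here and return its answer."""
    raise NotImplementedError("Replace with the API calls of your chosen VLLM")


def llm_call(model_name: str, prompt: str, frame_path: str) -> str:
    """Dispatch a multimodal query to the VLLM named in config.json's "LLM" field."""
    if model_name == "Test":
        # Test mode: substitute the VLLM's answer with a random choice
        return random.choice(["yes", "no"])
    if model_name == "MyVLLM":  # must match the "LLM" value in config.json
        return query_my_vllm(prompt, frame_path)
    raise ValueError(f"Unknown model: {model_name}")
```

The important point is that the string compared against `model_name` is exactly the one you put in the `"LLM"` field of `config.json`.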
To run the experiments without any model (thus generating random choices in place of the model's answers), enter `Test` in the `"LLM"` field of the `config.json` file.
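As a concrete illustration, a minimal `config.json` might look like the following. Only the `"LLM"` key (and the special `Test` value) is described above; the `"queries"` key and its values are hypothetical placeholders based on the topic directories mentioned below:

```json
{
  "LLM": "Test",
  "queries": ["running_example", "tollbooth", "volleyball"]
}
```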
The Tollbooth video and the Volleyball video can be found at this link. To use them, copy the Tollbooth video into `Samsara/topics/running_example/data/cars-me/04/` and `Samsara/topics/tollbooth/data/cars-me/04/`, and copy the Volleyball video into `Samsara/topics/volleyball/data/0`. They are necessary to run the experiments.
The setup was tested on macOS, Ubuntu, and Arch Linux.
- Install the latest version of miniconda (or use Anaconda).
- Clone the repository, `cd` into it, and run the `setup.sh` script. This will install all the necessary dependencies in an `environment` directory created in the current one.
- Run `conda activate samsara` to activate the Python environment.
- Run the `run_experiments.sh` script to start executing all the queries defined in the `config.json` file, one by one.
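Put together, the steps above correspond to roughly the following commands; the repository URL is a placeholder, substitute your own clone URL:

```shell
git clone <repository-url> Samsara
cd Samsara
./setup.sh                 # installs dependencies into ./environment
conda activate samsara     # activate the Python environment
./run_experiments.sh       # run all queries listed in config.json
```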