MGPO

This repository contains data, analysis code, and the MGPO algorithms from the "Leveraging AI to improve human planning in large partially observable environments" article (under review).

Structure

The result data from the simulation and human experiments, the pre-generated environment instances, and the figures used in the article can be found in the data folder.
The experiment folder contains the code required to run the human experiment.
The src folder contains the implementation of the meta-level MDP and the MGPO and baseline algorithms.
The top-level folder contains additional notebooks used to analyze the experiment data.

Algorithm

The main algorithmic contributions can be found in two files. The meta-MDP is defined in src/utils/Mouselab_PAR.py, which contains the belief state update in partially observable environments. The MGPO algorithm itself can be found in src/po_BMPS.py.
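As a rough illustration of what a belief state update under noisy observations can look like, the sketch below applies the standard conjugate update for a Gaussian belief. It assumes a Gaussian prior over a node's value and Gaussian observation noise; the function name and parameters are hypothetical and are not taken from src/utils/Mouselab_PAR.py, whose actual update may differ.

```python
def update_gaussian_belief(prior_mean, prior_var, observation, obs_var):
    """Conjugate normal-normal update (illustrative assumption, not the repository's code).

    Combines a Gaussian prior N(prior_mean, prior_var) with one noisy
    observation whose noise variance is obs_var.
    """
    if prior_var <= 0 or obs_var <= 0:
        raise ValueError("variances must be positive")
    # Precisions (inverse variances) add under the Gaussian model.
    posterior_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    # The posterior mean is the precision-weighted average of prior and observation.
    posterior_mean = posterior_var * (prior_mean / prior_var + observation / obs_var)
    return posterior_mean, posterior_var


mean, var = update_gaussian_belief(prior_mean=0.0, prior_var=1.0, observation=2.0, obs_var=1.0)
print(mean, var)  # 1.0 0.5
```

With equal prior and noise variance, the posterior mean lies halfway between the prior mean and the observation, and the variance is halved. Repeated observations of the same node shrink the variance further.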

Simulation

To rerun the simulation experiment, run the scripts described in the sections below. The evaluation takes a long time to run, so it is recommended to split the computation into small chunks and use a computing cluster. Results are stored under data/simulation_results/.
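One way to plan such a split is to divide the 5000 evaluation steps into fixed-size index ranges and submit one cluster job per range. The helper below is a minimal sketch; the name chunk_ranges and the chunk size of 200 are illustrative assumptions, and how a range maps onto each script's command-line arguments is not specified here.

```python
def chunk_ranges(total_steps, chunk_size):
    """Return (start, stop) pairs covering range(total_steps); stop is exclusive."""
    if total_steps < 0 or chunk_size <= 0:
        raise ValueError("total_steps must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_steps))
        for start in range(0, total_steps, chunk_size)
    ]


# Hypothetical split: 5000 evaluation steps in chunks of 200.
chunks = chunk_ranges(5000, 200)
print(len(chunks), chunks[0], chunks[-1])  # 25 (0, 200) (4800, 5000)
```

Smaller chunks give shorter jobs that are easier to schedule and retry, at the cost of more result files to merge afterwards.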

Data analysis and statistical analysis can be found in the files:

simulation_data_analysis.ipynb
simulation_analysis.R

Meta-greedy baseline policy

To reproduce the evaluation results, run the script for 5000 steps with each of the 4 evaluation environments ["2_36", "3_54", "4_72", "5_90"] and cost parameters [0.05, 1]. Example use:

python -m src.dp_baseline 0.005 0.05 5000 1 4 2_36 200 0

PO-UCT baseline policy

To reproduce the evaluation results, run the script for 5000 steps for all combinations of environments (["2_36", "3_54", "4_72", "5_90"]), cost parameter ([0.05, 1]), and PO-UCT budget ([10, 100, 1000, 5000]). Hyperparameters for each parameter combination can be found in data/simulation_results/pouct_hyperparameters.csv. Example use:

python -m src.pouct_baseline 0.005 0.05 1 100 3 100 1 4 2_36 200 0
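The full PO-UCT evaluation grid described above can be enumerated with a short script, for example to generate one cluster job per combination. This is only a sketch: the dictionary keys are hypothetical labels, and the mapping from a combination to the positional arguments of src.pouct_baseline (including the tuned hyperparameters from the CSV file) is deliberately left out because it is not documented here.

```python
from itertools import product

# Values taken from the description above.
ENVIRONMENTS = ["2_36", "3_54", "4_72", "5_90"]
COSTS = [0.05, 1]
BUDGETS = [10, 100, 1000, 5000]


def pouct_grid():
    """Return every (environment, cost, budget) combination as a dict."""
    return [
        {"environment": env, "cost": cost, "budget": budget}
        for env, cost, budget in product(ENVIRONMENTS, COSTS, BUDGETS)
    ]


grid = pouct_grid()
print(len(grid))  # 32
print(grid[0])    # {'environment': '2_36', 'cost': 0.05, 'budget': 10}
```

Each of the 32 combinations is evaluated for 5000 steps, which is why running the grid on a cluster is advisable.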

MGPO policy

To reproduce the evaluation results, run the script for 5000 steps for all combinations of cost ([0.05, 1]) and environments ([2, 3, 4, 5]). Example use:

python -m src.myopic_cluster_eval 2 0.05 0 1

Human experiment data and analysis

The results of the human experiment can be found in the following files:

data
└───tutor_experiment
    │   questionnaire.csv
    │   tutor_experiment_exclusion_data.csv
    │   tutor_experiment_full_data.csv

To generate the result files from the raw data retrieved from the database, the following script was used:

python -m src.tutor_experiment_analysis

Data analysis and statistical analysis can be found in the files:

experiment_analysis.ipynb
experiment_analysis.R

The experiment's preregistration can be found under https://aspredicted.org/RL3_YDD.

Human experiment

The experiment is a heavily adapted version of Fred Callaway's Mouselab-MDP.

To run the experiment locally, run the following commands in the command line and open http://localhost:8000/ in a web browser.

cd experiment
python -m http.server

The environment instances used in the experiment are stored under data/environments.

We used Heroku to host the experiment and Prolific to recruit participants. Before deploying the experiment, it is important to comment out lines 60-61 in experiment/static/js/experiment.js, since balanced condition assignment will be handled through PsiTurk.
