SocialAI

title	emoji	colorFrom	colorTo	sdk	app_port
SocialAI School Demo	🧙🏻‍♂️	gray	indigo	docker	7860

This repository is the official implementation of SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents.

The website of the project is here

The code is based on: minigrid

Additional repositories used: BabyAI, RIDE, astar

Installation

Create and activate your conda env

conda create --name social_ai python=3.7
conda activate social_ai
conda install -c anaconda graphviz

Install the required packages

pip install -r requirements.txt
pip install -e torch-ac
pip install -e gym-minigrid 
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia

Jupyter Notebook

Install Jupyter:

pip install jupyter

Start the Jupyter notebook, which contains usage examples, with:

jupyter notebook SocialAI_playground.ipynb

You can also try our Google Colab notebook.

Interactive policy

To run an environment in interactive mode, run:

python -m scripts.manual_control

You can test different environments with the --env parameter.

Interactive demo

You can try our interactive Hugging Face Spaces demo.

There you can create different environments and control the agent inside them.

RL experiments

Training

Minimal example

To train a policy, run:

python -m scripts.train --model test_model_name/1 --seed 1  --compact-save --algo ppo --env SocialAI-AsocialBoxInformationSeekingParamEnv-v1 --dialogue --save-interval 1 --log-interval 1 --frames 5000000 --multi-modal-babyai11-agent --arch original_endpool_res --custom-ppo-2

The policy should reach a success rate above 0.95 within the first 2M environment interactions.
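In the command above, the --model path ends with the seed (test_model_name/1 with --seed 1). Assuming that naming convention is kept for other seeds, the sketch below builds the same command for several seeds. `build_train_command` and `format_command` are hypothetical helpers written for this illustration and are not part of the repository; every flag is copied from the command above.

```python
import shlex

ENV_NAME = "SocialAI-AsocialBoxInformationSeekingParamEnv-v1"


def build_train_command(model_name, seed, frames=5000000):
    """Return the minimal-example training command as an argument list.

    Assumption: each seed is stored under <model_name>/<seed>, as in the
    single-seed example (test_model_name/1 with --seed 1).
    """
    return [
        "python", "-m", "scripts.train",
        "--model", "{}/{}".format(model_name, seed),
        "--seed", str(seed),
        "--compact-save",
        "--algo", "ppo",
        "--env", ENV_NAME,
        "--dialogue",
        "--save-interval", "1",
        "--log-interval", "1",
        "--frames", str(frames),
        "--multi-modal-babyai11-agent",
        "--arch", "original_endpool_res",
        "--custom-ppo-2",
    ]


def format_command(args):
    """Join an argument list into a shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Print one command per seed; paste them into a shell to launch the runs.
    for seed in (1, 2, 3):
        print(format_command(build_train_command("test_model_name", seed)))
```

The script only prints the commands, so they can be inspected before anything is launched.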

To plot the training curve, run:

python data_visualize.py test_model_name

To visualize the policy, run:

python -m scripts.visualize --model storage/test_model_name/1/ --pause 0.1 --seed $RANDOM --episodes 20 --gif viz/test --env-name SocialAI-AsocialBoxInformationSeekingParamEnv-v1

To evaluate a policy on a different environment, run:

python -m scripts.evaluate_new --episodes 500  --test-set-seed 1  --model-label test_model --eval-env SocialAI-TestLanguageFeedbackSwitchesInformationSeekingParamEnv-v1  --model-to-evaluate storage/test/ --n-seeds 8

Recreating all the experiments

Regular machine

To run the experiments on a regular machine, use run_SAI_final_case_studies.txt, which contains all the bash commands for the RL experiments.

Slurm based cluster (todo:)

To recreate all the experiments from the paper on a Slurm-based server, configure the campain_launcher.py script and run:

python campain_launcher.py run_SAI_final_case_studies.txt

LLM experiments

For the LLM experiments, set the OPENAI_API_KEY (and HF_TOKEN) environment variables, for example in ~/.bashrc.
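Before launching an LLM run, it can help to confirm that the variables are actually visible to Python. The following is a minimal sketch using only the standard library. `missing_env_vars` is a hypothetical helper, not something shipped with this repository, and treating HF_TOKEN as optional is an assumption based on the parentheses above.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # Assumption: OPENAI_API_KEY is required, HF_TOKEN only for some models.
    required = missing_env_vars(["OPENAI_API_KEY"])
    optional = missing_env_vars(["HF_TOKEN"])
    if required:
        print("Missing required variables: " + ", ".join(required))
    if optional:
        print("Missing optional variables: " + ", ".join(optional))
    if not required and not optional:
        print("All variables are set.")
```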

Creating in-context examples

To create in-context examples, use the create_LLM_examples.py script.

This script opens an interactive window in which you control the agent manually. By default, nothing is saved. The general procedure is to press 'enter' to skip environments you do not want. When you see an environment you want, move the agent to the desired position and start recording (press 'r'). The current step and all following steps in the episode will be recorded. Then control the agent until the episode finishes. A new episode will start and recording will be turned off again.
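The recording behaviour described above can be pictured as a small state machine. The sketch below is illustrative only and is not the code in create_LLM_examples.py; the `Recorder` class, its method names, and the (observation, action) step format are all hypothetical.

```python
class Recorder:
    """Toy model of the recording logic: off by default, on after 'r',
    and off again whenever an episode ends or is skipped."""

    def __init__(self):
        self.recording = False
        self.current = []   # steps of the episode currently being recorded
        self.episodes = []  # completed, recorded episodes

    def press_r(self):
        # Start recording from the current step onwards.
        self.recording = True

    def step(self, observation, action):
        # Steps taken before 'r' was pressed are not stored.
        if self.recording:
            self.current.append((observation, action))

    def end_episode(self):
        # Keep the episode only if something was recorded, then reset.
        if self.recording and self.current:
            self.episodes.append(self.current)
        self.current = []
        self.recording = False

    def press_enter(self):
        # Skip the environment: discard everything and reset.
        self.current = []
        self.recording = False


if __name__ == "__main__":
    rec = Recorder()
    rec.step("obs0", "forward")   # not recorded, 'r' not pressed yet
    rec.press_r()
    rec.step("obs1", "left")      # recorded
    rec.end_episode()
    print(rec.episodes)
```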

If you want to keep previously collected examples and append to them, use the --load argument.

Evaluating LLM-based agents

The script eval_LLMs.sh contains the bash commands to run all the experiments in the paper.

Here is an example of running evaluation on the text-ada-001 model on the AsocialBox environment:

python -m scripts.LLM_test  --episodes 10 --max-steps 15 --model text-ada-001 --env-args size 7 --env-name SocialAI-AsocialBoxInformationSeekingParamEnv-v1 --in-context-path llm_data/in_context_examples/in_context_asocialbox_SocialAI-AsocialBoxInformationSeekingParamEnv-v1_2023_07_19_19_28_48/episodes.pkl

If you want to control the agent yourself, set the model to interactive. The dummy agent always executes the move-forward action, and the random agent executes a random action. These agents are useful for testing.
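To make the two baseline behaviours concrete, here is a minimal sketch of what such agents could look like. It is not the repository's implementation: the class names, the `act` method, and the three-action list are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical action set; the real environments define their own actions.
ACTIONS = ["left", "right", "forward"]


class DummyAgent:
    """Always moves forward, regardless of the observation."""

    def act(self, observation):
        return "forward"


class RandomAgent:
    """Picks a uniformly random action at every step."""

    def __init__(self, seed=None):
        # A private generator keeps runs reproducible for a given seed.
        self.rng = random.Random(seed)

    def act(self, observation):
        return self.rng.choice(ACTIONS)


if __name__ == "__main__":
    dummy, rand = DummyAgent(), RandomAgent(seed=0)
    for step in range(3):
        print(step, dummy.act(None), rand.act(None))
```

Agents like these give a quick sanity check of the evaluation pipeline without spending API calls.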
