SampleFactory APPO baseline for Iglu

THE CODE IS IN THE WIP STATUS. PLEASE STAY TUNED FOR THE UPDATES. WE EXPECT TO RELEASE A WORKING BASELINE BY THE END OF THE WARM-UP ROUND.

Idea

Training an agent to build any language-defined structure is a challenging task. To make it tractable, we have developed a multitask hierarchical builder (MHB) with three modules: a task generator (NLP part), a subtask generator (heuristic part), and a subtask solving module (RL part). We define a subtask as an episode of adding or removing a single cube. This lets us train the agent with a dense reward signal in short-horizon episodes.

The task generator module generates the full target figure (a 3D voxel grid) from a dialogue with a person. For training, we use randomly generated compact structures as tasks.

The subtask generator receives a 3D voxel grid as input and outputs a sequence of subgoals (remove or install one cube) in a fixed order (left to right, bottom to top).

The subtask solving module is an APPO agent that learns to add or remove a single cube.
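The subtask ordering described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name split_into_subtasks, the [height, x, z] array layout, and the choice to sort by layer first and then by x are all assumptions made for the example.

```python
import numpy as np


def split_into_subtasks(current, target):
    """Return a list of (action, (y, x, z)) single-cube subgoals.

    Hypothetical helper: both grids are boolean arrays indexed as
    [height, x, z]; this layout is an assumption of the sketch.
    """
    current = np.asarray(current, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if current.shape != target.shape:
        raise ValueError("grids must have the same shape")

    subtasks = []
    # np.argwhere yields indices in row-major order, so the height axis
    # varies slowest: layers come bottom to top, then x left to right.
    for y, x, z in np.argwhere(current != target):
        action = "add" if target[y, x, z] else "remove"
        subtasks.append((action, (int(y), int(x), int(z))))
    return subtasks
```

Each returned pair corresponds to one short episode for the RL agent: place or remove exactly one cube at the given cell.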

Figure: original figure (3D voxel) and the baseline building process.

Code structure

Currently, the function target_to_subtasks in wrappers/target_generator.py implements the main algorithm for splitting the goal into subtasks. In wrappers/multitask you can also find the TargetGenerator and SubtaskGenerator classes. The first class creates a full-figure target using the RandomFigure generator or the DatasetFigure generator. The second class creates subtasks for the environment.
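To give an intuition for what a randomly generated compact target might look like, here is a small sketch. It is not the RandomFigure generator from this repository: the function name random_compact_figure, the default grid size of (9, 11, 11), and the reading of "compact" as "face-connected and grown from a ground-level seed" are assumptions made for illustration.

```python
import random

# The six face-adjacent offsets in (height, x, z) order.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def random_compact_figure(n_blocks, size=(9, 11, 11), seed=None):
    """Grow a face-connected set of n_blocks cells from a ground-level seed.

    Hypothetical helper; the grid size default is an assumption.
    Returns a set of (y, x, z) tuples.
    """
    height, width, depth = size
    if not 1 <= n_blocks <= height * width * depth:
        raise ValueError("n_blocks does not fit into the grid")

    rng = random.Random(seed)
    figure = {(0, rng.randrange(width), rng.randrange(depth))}
    while len(figure) < n_blocks:
        # Pick an existing block and try to extend it in a random direction.
        y, x, z = rng.choice(sorted(figure))
        dy, dx, dz = rng.choice(NEIGHBOURS)
        cand = (y + dy, x + dx, z + dz)
        if 0 <= cand[0] < height and 0 <= cand[1] < width and 0 <= cand[2] < depth:
            figure.add(cand)
    return figure
```

Because every new cell is attached to an existing one, the result is always a single connected structure, which keeps the generated tasks buildable one cube at a time.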

Installation

This version of the baseline uses the segments branch of the Iglu gridworld repository. You can install it with the following command:

pip install git+https://github.com/iglu-contest/gridworld.git@segments

Then install all dependencies using:

pip install -r docker/requirements.txt

Training APPO

Run main.py with config_path:

python main.py --config_path iglu_baseline.yaml

Enjoy baseline

Run utils/enjoy.py:

python utils/enjoy.py

Per-skill aggregation of the baseline's performance metrics

Instead of evaluating a metric for each structure in the dataset, we evaluate the agent's skills required to build each structure. There are 5 skills in total:

flat - flat structure with all blocks on the ground
flying - there are blocks that cannot be placed without removing some other blocks (i.e. )
diagonal - some blocks are diagonally adjacent along the vertical axis
tricky - some blocks are hidden, or the blocks must be placed in a specific order
tall - a structure cannot be built without the agent being high enough (the placement radius is 3 blocks)

For each task, we calculate the F1 score between the built and target structures. For each skill, we average the performance over all targets requiring that skill.

F1 score	flying	tall	diagonal	flat	tricky	all
MHB agent (NLP)	0.292	0.322	0.242	0.334	0.295	0.313
MHB agent (full)	0.233	0.243	0.161	0.290	0.251	0.258
Random agent (full)	0.039	0.036	0.044	0.038	0.043	0.039
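The per-skill aggregation described above can be sketched as follows. This is a simplified illustration rather than the official evaluation code: it assumes the F1 score is computed over voxel occupancy only (ignoring block colors and any alignment between the built and target grids), and the names f1_score and per_skill_f1 are hypothetical.

```python
from collections import defaultdict

import numpy as np


def f1_score(built, target):
    """F1 between two boolean occupancy grids of the same shape (assumed metric)."""
    built = np.asarray(built, dtype=bool)
    target = np.asarray(target, dtype=bool)
    true_pos = np.logical_and(built, target).sum()
    if true_pos == 0:
        return 0.0
    precision = true_pos / built.sum()
    recall = true_pos / target.sum()
    return float(2 * precision * recall / (precision + recall))


def per_skill_f1(results):
    """Average F1 per skill.

    results is an iterable of (f1, skills) pairs, where skills lists the
    skills a target requires; a target counts towards each of its skills.
    """
    per_skill = defaultdict(list)
    for score, skills in results:
        for skill in skills:
            per_skill[skill].append(score)
    return {skill: sum(vals) / len(vals) for skill, vals in per_skill.items()}
```

Since a single target may require several skills, its score contributes to several columns, so the per-skill averages are not expected to average out to the "all" column.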
