Commit eb450af7 authored by Christoph Kowalski

Adapted minor spelling mistakes in the ReadMe.md

parent 0c78f48d
2 merge requests: !110 V1.2.0 changes, !109 SB3 RL with Hydra
Pipeline #64186 passed
# Cooperative Cuisine and Reinforcement Learning
Cooperative Cuisine can be used to train a reinforcment learning agent. In this implementation [stable_baselines](https://github.com/hill-a/stable-baselines) is used to load the rl algorithm
Cooperative Cuisine can be used to train a reinforcement learning agent. In this implementation, the RL algorithms from [stable_baselines](https://github.com/hill-a/stable-baselines) are used.
<p align="center">
<img src="./data/tomato_soup_fixed_small_env_third.gif" width="12%" margin-right= "1%;" />
......@@ -48,7 +48,7 @@ The layout files for the project are stored in the `cooperative_cuisine/config`
### Using Overcooked-AI Levels and Configs in Cooperative Cuisine
All layouts from **Overcooked-AI** can be used within Cooperative Cuisine. Dedicated configs are defined and can be loaded via Hydra. To use Overcooked-AI layouts:
All layouts from [**Overcooked-AI**](https://github.com/HumanCompatibleAI/overcooked_ai) can be used within Cooperative Cuisine. Dedicated configs are defined and can be loaded via Hydra. To use Overcooked-AI layouts:
1. Set the [`overcooked-ai_environment_config.yaml`](./config/environment/overcooked-ai_environment_config.yaml) as the environment config.
2. Define any layout from Overcooked-AI under `layout_name`.
......@@ -101,7 +101,7 @@ The cutting board presents a major challenge for the agent, especially when mult
PPO can be unstable, showing good progress and then plateauing. A recommended game time limit is between `150-300` seconds, depending on the complexity of the task. For faster training, a lower time limit can be effective.
#### Recommended PPO Hyperparameters:
- **Ent_coef:** 0 and 0.01 to aid exploration.
- **Entropy Coefficient (ent_coef):** between 0 and 0.01 to aid exploration.
- **Batch size:** 256
- **Number of environments (n_envs):** 32
- **Learning rate:** 0.0006
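
As a minimal sketch, the recommended values listed above could be collected in one place before they are handed to the training code. The names `PPOSettings` and `to_ppo_kwargs` are hypothetical and not part of this repository. The sketch also assumes that the entropy coefficient is meant as a range from 0 to 0.01, and that `n_envs` configures the vectorized environment rather than the PPO constructor, as is the case in Stable-Baselines3.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PPOSettings:
    """Hypothetical container for the hyperparameters recommended above."""

    ent_coef: float = 0.01          # assumed upper end of the 0 to 0.01 range
    batch_size: int = 256
    n_envs: int = 32                # number of parallel environments
    learning_rate: float = 0.0006

    def __post_init__(self) -> None:
        # Basic sanity checks; the recommended ranges themselves are not enforced.
        if self.ent_coef < 0:
            raise ValueError("ent_coef must be non-negative")
        if self.batch_size <= 0 or self.n_envs <= 0:
            raise ValueError("batch_size and n_envs must be positive")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")


def to_ppo_kwargs(settings: PPOSettings) -> dict:
    """Return only the entries intended for the PPO constructor.

    Assumption: n_envs is used when the vectorized environment is created,
    so it is dropped from the keyword arguments here.
    """
    kwargs = asdict(settings)
    kwargs.pop("n_envs")
    return kwargs


if __name__ == "__main__":
    settings = PPOSettings()
    print(to_ppo_kwargs(settings))
```

With Stable-Baselines3 installed, the resulting dictionary could be unpacked into the `PPO` constructor (for example `PPO("MlpPolicy", env, **to_ppo_kwargs(settings))`), with `settings.n_envs` used when the vectorized environment is created. The policy name and the `env` object here are placeholders, not the project's actual setup.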
......@@ -115,7 +115,7 @@ The number of timesteps varies significantly based on the task's complexity (e.g
<p align="center">
<img src="./data/onion_soup_centre_pots_fixed_env_overcooked-ai_row.gif" width="100%" />
Preparing onion soup in the overcooked-ai centre-pots environment with a fixed counter layout
Preparing onion soup in the overcooked-ai centre-pots environment with a fixed counter layout.
</p>
<br/>
......@@ -124,7 +124,7 @@ Preparing onion soup in the overcooked-ai centre-pots environment with a fixed c
<p align="center">
<img src="./data/onion_soup_centre_pots_fixed_env_overcooked-ai_with_cutting_row.gif" width="100%" />
Preparing onion soup in the overcooked-ai centre-pots environment with a fixed counter layout and added cutting board
Preparing onion soup in the overcooked-ai centre-pots environment with a fixed counter layout and an added cutting board.
</p>
<br/>
......@@ -133,7 +133,7 @@ Preparing onion soup in the overcooked-ai centre-pots environment with a fixed c
<p align="center">
<img src="./data/onion_soup_large_env_overcooked-ai_row.gif" width="100%" />
Preparing onion soup in the overcooked-ai large environment with a fixed counter layout
Preparing onion soup in the overcooked-ai large environment with a fixed counter layout.
</p>
<br/>
......@@ -142,7 +142,7 @@ Preparing onion soup in the overcooked-ai large environment with a fixed counter
<p align="center">
<img src="./data/onion_soup_large_random_env_overcooked-ai_row.gif" width="100%" />
Preparing onion soup in the overcooked-ai large environment with a random counter layout
Preparing onion soup in the overcooked-ai large environment with a random counter layout.
</p>
<br/>
......@@ -152,7 +152,7 @@ Preparing onion soup in the overcooked-ai large environment with a random counte
<p align="center">
<img src="./data/tomato_soup_fixed_small_env_row1.gif" width="49.5%" />
<img src="./data/tomato_soup_fixed_small_env_row2.gif" width="49.5%" />
Preparing a tomato soup in the cooperative cuisine environment with a fixed counter layout
Preparing a tomato soup in the cooperative cuisine environment with a fixed counter layout.
</p>
<br/>
......@@ -160,14 +160,14 @@ Preparing onion soup in the overcooked-ai large environment with a random counte
<p align="center">
<img src="./data/tomato_soup_small_random_env_row.gif" width="120%" />
Preparing a tomato soup in the cooperative cuisine environment with a random counter layout
Preparing a tomato soup in the cooperative cuisine environment with a random counter layout.
</p>
<br/>
<br/>
<p align="center">
<img src="./data/salad_fixed_small_env_row.gif" width="120%" />
Preparing a salad in the cooperative cuisine environment with a random counter layout
Preparing a salad in the cooperative cuisine environment with a random counter layout.
</p>
......