Cooperative Cuisine can be used to train a reinforcement learning agent. In this implementation, [stable_baselines](https://github.com/hill-a/stable-baselines) is used to load the RL algorithms.
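As a rough, hedged sketch of what such a training run could look like, assuming the stable-baselines3 interface (whose parameter names match the hyperparameters recommended below); `CartPole-v1` is only a runnable stand-in for an actual Cooperative Cuisine environment:

```python
# Minimal training sketch, assuming the stable-baselines3 interface.
# "CartPole-v1" is a runnable stand-in, not the Cooperative Cuisine env.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("ppo_cooperative_cuisine")

# Roll out the trained policy for one episode.
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
```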
The layout files for the project are stored in the `cooperative_cuisine/config` directory.
### Using Overcooked-AI Levels and Configs in Cooperative Cuisine
All layouts from [**Overcooked-AI**](https://github.com/HumanCompatibleAI/overcooked_ai) can be used within Cooperative Cuisine. Dedicated configs are defined and can be loaded via Hydra. To use Overcooked-AI layouts:
1. Set the [`overcooked-ai_environment_config.yaml`](./config/environment/overcooked-ai_environment_config.yaml) as the environment config.
2. Define any layout from Overcooked-AI under `layout_name`.
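A hedged sketch of these two steps using Hydra's compose API follows. The `environment` config group is inferred from the config path above; the top-level config name `rl_config` and the exact `layout_name` override syntax are assumptions about the project's Hydra setup, not its verified API:

```python
# Sketch: select the Overcooked-AI environment config and a layout via Hydra.
# The "environment" group is inferred from the config path above; the
# top-level config name "rl_config" is a hypothetical placeholder.
from hydra import compose, initialize

with initialize(version_base=None, config_path="config"):
    cfg = compose(
        config_name="rl_config",  # hypothetical
        overrides=[
            "environment=overcooked-ai_environment_config",
            # Any Overcooked-AI layout name works here, e.g. cramped_room.
            # (May need a leading "+" if layout_name is not in the defaults.)
            "layout_name=cramped_room",
        ],
    )
```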
...
...
The cutting board presents a major challenge for the agent.
PPO can be unstable, showing good progress and then plateauing. A recommended game time limit is between `150` and `300` seconds, depending on the complexity of the task. For faster training, a lower time limit can be effective.
#### Recommended PPO Hyperparameters:
- **Entropy coefficient (`ent_coef`):** between 0 and 0.01 to aid exploration.
- **Batch size:** 256
- **Number of environments (`n_envs`):** 32
- **Learning rate:** 0.0006
...
...
The number of timesteps varies significantly based on the task's complexity.
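Under the same stable-baselines3 assumption, the recommended settings above map onto the PPO constructor roughly as follows; again, `CartPole-v1` is only a runnable stand-in for the actual Cooperative Cuisine environment:

```python
# Sketch: the recommended hyperparameters wired into PPO (stable-baselines3
# API assumed). "CartPole-v1" stands in for the Cooperative Cuisine env.
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

vec_env = make_vec_env("CartPole-v1", n_envs=32)  # 32 parallel environments

model = PPO(
    "MlpPolicy",
    vec_env,
    learning_rate=0.0006,
    batch_size=256,
    ent_coef=0.01,  # between 0 and 0.01 to aid exploration
    verbose=1,
)
model.learn(total_timesteps=2_000_000)  # budget varies with task complexity
```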