Restructure Reinforcement Learning files

Following the todos from Miro: File Structure, Logging, Hydra, Vector Observation.

The file structure should be changed to make reinforcement learning runs and hyperparameter tuning easy. Additionally, Hydra should be used to manage all configs relating to reinforcement learning; all other configs should remain untouched.