Aug 10 2020 12:01 PM
I have noticed that the number of started simulators doesn't correspond to the number I selected when adding my custom simulator. The containers are created and the simulators are started, but the Bonsai service does not try to connect to all of them. This started happening in the last few days. Was there a change on the service side that would cause this, or is the problem with our simulator?
Thanks,
Goran
Aug 10 2020 12:21 PM
Hi @goran-j, thanks for posting the question.
The number of simulator instances that you specify is a maximum. The Bonsai platform dynamically adjusts the instance count based on many factors, including the chosen RL algorithm, batch size, and average episode length. You will likely even see the number of sim instances decrease during the training session as average episode lengths increase.
The goal of the platform is to minimize training time without wasting compute. Some RL algorithms benefit from additional data parallelism, but all will eventually reach diminishing returns as you add more instances.
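As a rough illustration of the diminishing-returns point, here is a toy model in Python. It is only a sketch and not the platform's actual scaling logic: the function names, the timing numbers, and the 5% threshold are all made up for the example. It assumes data collection parallelizes across simulators while the policy update does not.

```python
def iteration_time(n_sims, batch_size, step_time_s, update_time_s):
    """Wall-clock seconds for one training iteration in this toy model."""
    # Collecting batch_size steps is split across simulators;
    # the policy update is a fixed cost that does not parallelize.
    return batch_size * step_time_s / n_sims + update_time_s


def pick_instance_count(max_sims, batch_size, step_time_s, update_time_s,
                        min_gain=0.05):
    """Hypothetical rule: stop adding simulators once one more instance
    would shorten an iteration by less than min_gain (as a fraction).
    The user-selected count max_sims acts only as an upper bound."""
    n = 1
    while n < max_sims:
        current = iteration_time(n, batch_size, step_time_s, update_time_s)
        with_one_more = iteration_time(n + 1, batch_size, step_time_s,
                                       update_time_s)
        if (current - with_one_more) / current < min_gain:
            break
        n += 1
    return n


if __name__ == "__main__":
    # Made-up numbers: 10 ms per simulator step, 5 s per policy update.
    for batch_size in (100, 1000):
        used = pick_instance_count(max_sims=50, batch_size=batch_size,
                                   step_time_s=0.01, update_time_s=5.0)
        print(f"batch_size={batch_size}: {used} of 50 instances used")
```

Under these invented numbers the model uses far fewer than the 50 instances requested, and a smaller batch size lowers the count further, which is the kind of effect described above.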
Does that explain the behavior you're seeing?
Aug 10 2020 01:42 PM
Thanks @erictr,
That makes sense. We are changing the batch size on the PPO algorithm, so that might be the reason.
However, I would expect Bonsai to stop the containers it has not been using for a while (or not start them at all).
Goran