Mar 10 2022 11:16 PM - edited Mar 10 2022 11:50 PM
Hello,
While training a brain, I keep getting warnings about some states. The warnings suggest that, if they persist, I should consider avoiding the states that cause them. What these states have in common is that they occur at the end of my simulation run. Accordingly, I added an avoid objective intended to avoid the end of the simulation. After I restarted training, the warnings disappeared, but the avoid objective's success rate is always zero. The goal satisfaction is around 80%.
Assuming one run of my simulation takes 720 minutes, then:
avoid endSimulation: state.currentTime in Goal.RangeAbove(720)
where the currentTime is the model time at that time step.
What could be the cause of the avoid objective not succeeding?
Mar 14 2022 11:23 AM
Hello @Duoaa. An avoid objective teaches a brain to prevent something from happening. It tells the brain to avoid certain state conditions. If the conditions occur, the episode will end and be considered a failure. I don't think that's what you want in this case.
If you need to limit the length of an episode, then you can use EpisodeIterationLimit in the training parameters as described here: https://docs.microsoft.com/en-us/bonsai/inkling/keywords/curriculum#curriculum-training-parameters.
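As a rough sketch of where that parameter goes, the fragment below shows a curriculum with a training block. The simulator name (Simulator) and the value 200 are placeholders chosen for illustration, not taken from your file; substitute the iteration count that matches your simulation.

```inkling
curriculum {
    source Simulator

    training {
        # Placeholder value: the episode ends after at most 200 iterations.
        EpisodeIterationLimit: 200
    }

    # The goal block with your objectives stays as it is in your file.
}
```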
Alternatively, you may want to investigate the warnings about the states. One way this can occur is if the numeric values of the simulated states fall outside the range declared in the Inkling file's state structure, for example, if a variable is declared as "position: number<-10 .. 10>" but is assigned the value 11. To solve this, you can expand the range or, for state values, declare the variable as "position: number" so that there is no limit on its values.
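To make the difference concrete, here is a minimal sketch of a state type with one ranged and one unranged field. The type name SimState is a placeholder, and pairing position with currentTime in a single type is only for illustration; your state structure will have its own fields.

```inkling
type SimState {
    # Ranged: a simulator value of 11 falls outside this declared range.
    position: number<-10 .. 10>,

    # Unranged: no limit is placed on the values.
    currentTime: number,
}
```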
Mar 23 2022 07:09 AM
Hello @Forrest_Trepte, thank you for your response.
I have double-checked that the generated numeric values are within the specified ranges. Based on my simulation, I have also reduced EpisodeIterationLimit to the maximum number of iterations possible per episode. For example, if my simulation runs for 720 minutes and I ask the brain for an action every 5 minutes whenever my agent is idle, the maximum number of iterations would be 720 / 5 = 144. My agent is not always idle every 5 minutes, so I cannot really anticipate the number of iterations for every simulation run, and thus the warning persists. Would it be safe to ignore the warnings?
This is an example of the warning:
The simulator with id ##### halted execution before the episode was complete.
This indicates a bug in your simulator or Inkling code. It can result in poor training performance if it occurs frequently.
If the system should learn to avoid this state, add an avoid objective or terminal condition for this case.
Episode id: #####
State: { currentTime: 720.0, ... }
Action: { actionToTake: 6.0 }
Mar 23 2022 12:11 PM
Mar 23 2022 12:23 PM
Mar 23 2022 12:36 PM