Simulator package - Training - stuck on 'Connecting simulators'


I've added a simulator package and started Train, but the training never moves off 'Connecting simulators'. In the list of Simulators, my package called 'block' shows, and when I start training, 5 more simulators are listed, all called 'block Unmanaged'. So instances are starting, but are flagged as 'unmanaged'? In Azure I can see 2 separate Azure 'Container instances' created, one with 1 container and the other with 4 containers. I've looked at the logs of the instances and they all say 'Registered' and 'Idle' (see image attached). So, for some reason, the training doesn't seem to be able to connect to the instances. Any advice appreciated.

 

Prior to pushing the Docker image, I was able to run this image as a container and successfully train the brain using it as an unmanaged simulator. The image is python:3.9-slim with the Bonsai API and bonsai-common packages added. The Workspace and Access Key are currently set in code.

 

Attached screenshots...

 

Thanks

 

Screenshot 2020-11-20 140355.jpg

Screenshot 2020-11-20 142448.jpg

   

4 Replies

Hi @Cliff_Evans, can we please have some more information, i.e. workspace ID, brain name, and brain version? We need this information to look up the simulators that are causing the issue for you.

Hi @shivanshi thank you for your response.

 

I still have the same problem. To try to debug it, I have given the various pieces unique names that differ from those in my original question. The new names are shown below and new screenshots are attached.

 

To recap: I have a brain (Block). When I attempt to train it using the managed package (BlockPackage), it unexpectedly creates multiple unmanaged simulators called 'block', then gets stuck on 'Connecting simulators' and does not proceed. I have nothing in the workspace (or repository) called 'block', but I did previously create an unmanaged simulator called 'block' when I was testing my simulator code locally (btw, this testing worked and it successfully trained the brain). It's as though the training is stuck using this unmanaged approach. In my Inkling I have tried it both with "package 'BlockPackage'" and without a package statement, so that I am prompted to 'Select a simulator for training'; both routes produce the same issue. Also, bizarrely, the training creates two sets of instances, one with 1 container and one with 4 containers. When I look at the logs of any of the containers, the code is running and reports 'Registered' and 'Idle'.

 

My apologies in advance if I'm doing something wrong.

 

I suspect, and hope, that if I delete everything and just start again it might be OK. But I'll leave it as is for now so you can take a look.

 

Many thanks

Cliff

 

WorkspaceId: ef6cc48c-9b24-452e-9a9f-8abba5ae8d8b

Brain name: Block

Brain version: v01

Managed simulator: BlockPackage

Container registry repository: blockimage

Teach inkling for simulator:

            source simulator BlockSim(Action: SimAction, Config: SimConfig): SimState {
                # Automatically launch the simulator with this
                # registered package name.
                package "BlockPackage"
            }

 

 

Hi @shivanshi

 

I've solved the problem.  The error was mine.

 

With the intention of creating a simple simulator, I used a variation of the example 'bare bones' simulator code from https://github.com/microsoft/bonsai-common

 

But I needed to add simulator_context as shown below...
 
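def get_interface(self) -> SimulatorInterface:
    # Called to retrieve the simulator interface during registration.
    # config_client is defined elsewhere in the simulator code (not shown).
    face = SimulatorInterface(
        name="block",
        timeout=60,
        # The line that was missing: pass the simulator context through.
        simulator_context=config_client.simulator_context,
    )
    return face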
 
Without it, the SIM_CONTEXT environment variable set by Azure/Bonsai was not being read. Because this variable sets the deploymentMode, containerGroupName, etc., my simulator just ran in Azure as an unmanaged simulator instead.
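
For anyone debugging the same symptom, one way to check whether SIM_CONTEXT is actually reaching the container is to log it when the simulator starts. The following is only a rough sketch: describe_sim_context is a made-up helper name (not part of the Bonsai packages), and it assumes the variable holds a JSON object, falling back to the raw string if that assumption turns out to be wrong.

```python
import json
import os


def describe_sim_context(environ=None):
    """Summarise SIM_CONTEXT so it can be logged at simulator start-up.

    Hypothetical helper for debugging; not part of the Bonsai packages.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("SIM_CONTEXT", "")
    if not raw:
        # Variable missing or empty: nothing was passed in by the platform.
        return {"present": False, "raw": "", "parsed": None}
    try:
        # Assumption: the context is passed as a JSON object.
        parsed = json.loads(raw)
    except ValueError:
        # Not valid JSON after all; keep only the raw string.
        parsed = None
    return {"present": True, "raw": raw, "parsed": parsed}


if __name__ == "__main__":
    # Print the summary so it shows up in the container logs.
    print(describe_sim_context())
```

If the summary shows the variable is present in the container but the simulator still registers as unmanaged, the value is probably not being passed into SimulatorInterface, which was the case here.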
 
Using the 'bare bones' approach I did manage to create a simulator with very few lines of code, but if I had built my simulator from the samples I would have avoided the problem.
 
Thank you.
Cliff
 

 

 

@Cliff_Evans Great! Good to see you got it working.