Getting a text notification and stopping the Azure Compute instance when a training job finishes


I have spun up a 4-GPU node that costs nearly $4 an hour. I want to automatically stop the compute node and get a text message on my phone as soon as the training job either finishes or fails with an error. I am running the job directly on the Azure Compute node, from a terminal over SSH. What is the solution for this?

 

$ pfetch
azureuser@mona
os      Ubuntu 20.04.5 LTS
host    Virtual Machine 7.0
kernel  5.15.0-1022-azure
uptime  12m
pkgs    1740
memory  5038M / 225656M
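
For reference, here is a rough sketch of the kind of wrapper I have in mind, not a tested solution. It assumes the node is a plain Azure VM (not an Azure ML compute instance), that the Azure CLI is installed on it and already authenticated with permission to deallocate the VM, and that text messages go through some SMS gateway reachable over HTTP. The webhook URL, the resource group name "my-rg", and the helper names are all placeholders I made up. As far as I understand, shutting down the OS from inside the VM is not enough, because the VM stays allocated and keeps being billed, which is why the sketch calls `az vm deallocate`.

```python
import json
import subprocess
import sys
import urllib.request


def run_and_report(train_cmd, notify, stop_vm):
    """Run the training command, then notify and stop the VM whatever the outcome."""
    try:
        returncode = subprocess.run(train_cmd).returncode
    except OSError as exc:  # e.g. the command does not exist
        returncode = None
        message = f"Training could not start: {exc}"
    else:
        if returncode == 0:
            message = "Training finished successfully"
        else:
            message = f"Training failed with exit code {returncode}"
    try:
        notify(message)
    finally:
        # Stop the VM even if sending the notification raises.
        stop_vm()
    return returncode, message


def notify_via_webhook(message, url="https://example.invalid/sms-hook"):
    # Hypothetical endpoint: replace with the API of whatever SMS gateway is used.
    body = json.dumps({"text": message}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request, timeout=30)


def deallocate_vm(resource_group="my-rg", vm_name="mona"):
    # Placeholder names. Deallocating releases the compute so it is no longer billed.
    subprocess.run(
        ["az", "vm", "deallocate",
         "--resource-group", resource_group,
         "--name", vm_name,
         "--no-wait"],
        check=True,
    )


def main(argv):
    # argv is the training command, e.g. ["python", "train.py"].
    if not argv:
        raise SystemExit("usage: run_and_stop.py <training command...>")
    return run_and_report(argv, notify_via_webhook, deallocate_vm)
```

Saved as a script (say `run_and_stop.py`, a made-up name) with a final line `main(sys.argv[1:])`, this could be started as `python run_and_stop.py python train.py` inside `tmux` or under `nohup`, so that it keeps running if the SSH session drops. Passing the notifier and the stop action in as functions keeps the core logic easy to try out with stand-ins before pointing it at a real VM.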
