Forum Discussion
Ensuring Safe VM Deletion in VMSS: Process Completion Verification Before Scaling Down
Hi everyone, good morning!
I'm setting up a flow in Power Automate that will delete VMs from my Virtual Machine Scale Set (VMSS) when there are few items left to process. I already have logic that checks the number of items in my database and, based on that count, increases or decreases the capacity of my VMSS. This part is working well.
The issue is... when reducing resources, I need to check if the VM I intend to delete is still running any flow (to avoid the risk of deleting it while it’s mid-process).
I think I can determine this by looking at some table in Dataverse or something similar.
Now, my goal is to delete certain VMs. Is it possible to ensure that the VM completes whatever it’s currently executing before deletion?
This way, I could be sure that the VM has finished its task before proceeding with deletion. I know other automation platforms offer options like "Immediate STOP," which stops the VM immediately, and "Request STOP," which essentially means "finish what you're doing, then stop."
This would be when I decrease the number of instances, right?
Do you think I could achieve something like this via Power Automate or Azure?
4 Replies
Please refer to the steps below:
Step 1: Check VM Status
You can use Azure Monitor and Log Analytics to track the status of your VMs. By setting up custom logs or metrics, you can determine if a VM is currently processing any tasks.
Step 2: Use Azure Automation
Azure Automation can help you manage and automate the lifecycle of your VMs. You can create a runbook that checks the status of the VM before initiating the deletion process. Here's a basic outline of how you can achieve this:
- Create a Runbook: In Azure Automation, create a runbook that checks the status of the VM.
- Check for Active Processes: Use PowerShell or Python scripts within the runbook to query the status of the VM and ensure it is not running any critical processes.
- Delete VM: If the VM is idle, proceed with the deletion. If not, wait and recheck after a specified interval.
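The wait-and-recheck step above can be sketched roughly as follows. This is a minimal Python illustration, not runbook code: `wait_until_idle`, its parameters, and the `is_idle` callback are hypothetical names, and the callback stands in for whatever idle signal you end up using (a Dataverse lookup, a status endpoint, and so on). The timeout is an added assumption so the loop cannot wait forever on a stuck VM.

```python
import time
from typing import Callable


def wait_until_idle(
    is_idle: Callable[[], bool],
    interval_s: float = 30.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_idle() until it returns True or timeout_s has elapsed.

    Returns True if the VM became idle (safe to delete),
    False if the timeout was reached first (do not delete).
    """
    deadline = clock() + timeout_s
    while True:
        if is_idle():
            return True
        if clock() >= deadline:
            return False
        # Not idle yet: wait before checking again.
        sleep(interval_s)
```

A caller would delete the instance only when this returns True, and otherwise leave it running and try again on the next scale-down evaluation.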
Step 3: Integrate with Power Automate
You can trigger the Azure Automation runbook from Power Automate. Here's a high-level overview of the flow:
- Power Automate Flow: Create a flow that triggers based on your logic (e.g., low number of items to process).
- Invoke Runbook: Use the "Create Job" action in Power Automate to start the Azure Automation runbook.
- Monitor and Delete: The runbook will handle the logic to check the VM status and delete it if it's idle.
Example PowerShell Script for Runbook
Here's a simplified example of a PowerShell script that you can use in your Azure Automation runbook:
# Connect to Azure (add -Identity if the runbook runs under a managed identity)
Connect-AzAccount
# Get the VM status
$vm = Get-AzVM -ResourceGroupName "YourResourceGroup" -Name "YourVMName"
$vmStatus = $vm.ProvisioningState
# Note: ProvisioningState only reports whether the last deployment operation succeeded.
# It does not show whether the VM is idle, so replace this check with your own idle signal.
if ($vmStatus -eq "Succeeded") {
    # Proceed with deletion
    Remove-AzVM -ResourceGroupName "YourResourceGroup" -Name "YourVMName" -Force
} else {
    # Wait and recheck
    Start-Sleep -Seconds 30
    # Re-run the script or logic to check again
}
- experi18 (Brass Contributor)
One more comment on this: these are not regular VMs, they are instances in a Virtual Machine Scale Set (VMSS).
- experi18 (Brass Contributor)
Hi, thank you for your explanation. I understand your point, but I’m still a bit unclear on what exactly constitutes an "idle" VM. How would I identify that?
Is it primarily based on resource usage like RAM or CPU consumption, or perhaps checking if a specific service is running?
I find your idea very interesting, and I’d like to follow your approach since I was initially considering a more complex path.
Thanks a lot for the help!
- balasubramanim (Iron Contributor)
experi18
Great question! I tried my best to answer. Determining whether a VM in a Virtual Machine Scale Set (VMSS) is "idle" depends on the specific application or workload running on the instance. Here's how you can define and identify an "idle" VM:
What Constitutes an "Idle" VM?
An "idle" VM is one that:
1. Isn't actively processing any tasks.
2. Has no critical processes running.
3. Isn't consuming significant resources (CPU, RAM, Disk).
The exact criteria will depend on your application's architecture and workload.
Ways to Identify an Idle VM:
1. Application-Level Status
Best Option: If your application has a clear understanding of its workload:
Use the application to log its status (e.g., Processing or Idle) to a centralized location (like Dataverse, Azure Table Storage, or Cosmos DB).
Before scaling down, check this status to ensure only Idle VMs are selected for deletion.
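To make the status check concrete, here is a minimal Python sketch of filtering status rows before a scale-down. The `InstanceStatus` record, its field names, and the `max_age` freshness window are all assumptions for illustration; the rows would come from wherever the application writes its status. The freshness check is an addition not mentioned above: it treats an old "Idle" report as untrustworthy, since the instance may have picked up work since it last reported.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass
class InstanceStatus:
    instance_id: str      # VMSS instance ID (hypothetical field name)
    state: str            # "Idle" or "Processing", as written by the app
    updated_at: datetime  # when the app last reported its state


def idle_instances(
    rows: Iterable[InstanceStatus],
    now: datetime,
    max_age: timedelta = timedelta(minutes=2),
) -> List[str]:
    """Return the IDs of instances that reported Idle recently enough to trust."""
    return [
        r.instance_id
        for r in rows
        if r.state == "Idle" and now - r.updated_at <= max_age
    ]
```

For example, calling `idle_instances(rows, datetime.now(timezone.utc))` just before scaling down gives the list of candidates. Anything not in that list is left alone.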
Example:
If the application has a queue system, mark a VM as Idle when:
It has finished processing all queue items.
It has no active tasks in memory.
2. Resource Utilization Metrics (CPU/RAM)
Use Azure Monitor or Application Insights to track resource consumption:
Consider a VM "idle" if CPU and RAM utilization fall below a defined threshold (e.g., CPU < 10% for 5 minutes).
Create Azure Monitor alerts or Logic Apps triggers to query these metrics.
Example Azure CLI query (can be run from a script or an automation step):
az monitor metrics list --resource <VM_RESOURCE_ID> --metric "Percentage CPU" --interval PT5M
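Once you have the CPU samples, the threshold rule itself is simple. The following Python sketch shows one way to express "CPU < 10% for 5 minutes". It assumes you have already extracted the average values from the query result into a plain list, oldest first, with one sample per minute. The function name and parameters are hypothetical. Missing samples are treated as "not idle", which is a deliberately cautious assumption.

```python
from typing import List, Optional


def is_cpu_idle(
    samples: List[Optional[float]],
    threshold: float = 10.0,
    min_samples: int = 5,
) -> bool:
    """Return True only if the last min_samples CPU readings are all below threshold.

    samples: CPU percentages, oldest first (assumed one per minute).
    A None entry means the metric was missing for that minute.
    """
    recent = [s for s in samples[-min_samples:] if s is not None]
    # Require a full window of real readings; gaps or short history count as busy.
    return len(recent) == min_samples and all(s < threshold for s in recent)
```

Keep in mind that low CPU does not prove the flow has finished (it may be waiting on I/O or a remote call), which is why this works better as a fallback than as the only signal.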
Add a condition in your Logic App to scale down VMs with low utilization.
3. Custom Health Checks
Use a custom health probe (via Azure Load Balancer or Application Insights) to periodically check:
Is the application responding?
Is a specific service running or processing requests?
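As a rough idea of what such a check could call, here is a minimal Python sketch of a status endpoint using only the standard library. Everything here is an assumption for illustration: the `/status` path, the `state` dictionary, and the idea that your worker increments and decrements `active_tasks` around each job. A real application would expose this from its own web framework or service host.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical shared state: the worker updates active_tasks around each job.
state = {"active_tasks": 0}
_lock = threading.Lock()


def current_status() -> str:
    """Return "Idle" when no task is in progress, otherwise "Processing"."""
    with _lock:
        return "Idle" if state["active_tasks"] == 0 else "Processing"


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/status":
            self.send_error(404)
            return
        body = current_status().encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request logging.
        pass


def serve(port: int = 8080) -> None:
    """Serve the status endpoint until the process exits."""
    HTTPServer(("0.0.0.0", port), StatusHandler).serve_forever()
```

Calling `serve()` in a background thread of the worker process would let your automation issue a GET to `/status` on each instance and read back Idle or Processing.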
Example with Azure Load Balancer:
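Configure the load balancer to check for a "health endpoint" (e.g., /status) exposed by your application.
If the endpoint returns Idle, the VM is eligible for scaling down. Note that a load balancer probe evaluates only the HTTP status code, so your automation needs to call the endpoint itself to read the Idle value.
4. Process Monitoring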
Check for running processes specific to your application. If no critical processes are active, the VM can be considered idle.
Use PowerShell or custom scripts for this.
Example PowerShell Script:
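Get-Process -Name "YourAppProcess" -ErrorAction SilentlyContinue | Measure-Object
If the reported Count is 0, no instances of YourAppProcess are running and the VM is idle. (Without -ErrorAction SilentlyContinue, Get-Process raises an error when the process is not found.)
Recommended Approach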
Application-Level Status (Preferred):
This is the most accurate and reliable method since your app knows best when it's idle.
Update the VM's status in a centralized database (Idle or Processing).
Resource Usage (Fallback):
Use Azure Monitor to track metrics like CPU and RAM and define thresholds for idleness.
Combine Methods:
If possible, combine application-level signals with resource monitoring for added accuracy.
Next Steps for Your Setup
Implement an Application-Level Check:
Add a status update mechanism in your app to log Idle or Processing to a central database.
Query VM Status Before Deletion:
Use Azure Logic Apps or Power Automate to query VM statuses before scaling down.
Define Clear Thresholds:
Decide on thresholds (e.g., "CPU < 10% for 5 minutes") if relying on resource metrics.
Test the Workflow:
Before automating the scale-down process, test the logic manually to ensure no VMs are incorrectly deleted.
This combined approach keeps your process simple and avoids prematurely deleting active VMs.
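One detail worth keeping in mind when wiring this up: simply lowering the scale set capacity lets the scale set's scale-in policy decide which instances go, so the idle check only helps if you remove specific instances yourself. The Python sketch below illustrates that planning step under stated assumptions. `plan_scale_in` and `delete_command` are hypothetical helpers, and the status dictionary is whatever your status query returns. The command it builds uses `az vmss delete-instances` with `--instance-ids`; please verify the exact options against the current Azure CLI documentation before relying on it.

```python
from typing import Dict, List


def plan_scale_in(statuses: Dict[str, str], remove_count: int) -> List[str]:
    """Pick up to remove_count instance IDs whose reported state is Idle.

    statuses maps VMSS instance ID -> "Idle" or "Processing".
    Fewer IDs than requested are returned if not enough instances are idle.
    """
    idle = sorted(i for i, s in statuses.items() if s == "Idle")
    return idle[: max(remove_count, 0)]


def delete_command(resource_group: str, vmss_name: str, instance_ids: List[str]) -> List[str]:
    """Build the Azure CLI argument list for deleting specific instances.

    Returns an empty list when there is nothing to delete.
    """
    if not instance_ids:
        return []
    return [
        "az", "vmss", "delete-instances",
        "--resource-group", resource_group,
        "--name", vmss_name,
        "--instance-ids", *instance_ids,
    ]
```

For example, with four instances of which three are idle and a target of removing two, the plan returns two idle IDs and leaves the busy instance untouched. If nothing is idle, no command is produced and the scale-down simply waits for the next evaluation.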