
asorrour
Copper Contributor
Jul 25, 2024

Health state monitoring for a service hosted on Azure Virtual Machine

I have a virtual machine hosted on Azure, and I want to monitor the health state of a service deployed on it. I have already enabled the health monitoring extension, which keeps pinging the service's health URL and expects a 200 response; any other result means the service is down. The VM's overview page shows a green "Healthy" or yellow "Unhealthy" status, which is great.

 

I was expecting to get some data in the Insights log table HealthStateChangeEvent when the service is down, but the table is always empty. I would appreciate any help from anyone who has worked with this table.

 

https://learn.microsoft.com/en-us/azure/azure-monitor/reference/tables/HealthStateChangeEvent

 

  • Hi asorrour,

     

    To monitor the health state of a service on an Azure Virtual Machine, you’ve already made good progress by enabling the health monitoring extension and setting up a health endpoint. However, you're not seeing data in the HealthStateChangeEvent table (that is the name used in the reference page you linked), which can be frustrating.

    Here are some steps and tips to troubleshoot and resolve this issue:

    1. Verify Extension Configuration: Double-check that the health monitoring extension is correctly configured to log events. Ensure that the health endpoint is correctly specified and that it returns a 200 status when healthy.

    2. Check Log Analytics Workspace: Make sure that your VM is correctly connected to a Log Analytics workspace. The health monitoring extension should send data to this workspace.

    3. Ensure Diagnostic Settings: Verify that the diagnostic settings for your VM include logs for HealthStateChangeEvent. You might need to configure this explicitly.

    4. Review IAM Permissions: Ensure that the VM has the necessary permissions to write to the Log Analytics workspace. Sometimes, missing permissions can prevent data from being logged.

    5. Examine Logs in Azure Portal: Navigate to the Logs section in your Log Analytics workspace and manually query for HealthStateChangeEvent data. Use a simple query like:

    HealthStateChangeEvent | where TimeGenerated > ago(1d)

     

    • Service-Specific Logs: Sometimes, health state events might be logged under a different table or custom logs, especially if there's custom instrumentation in the application. Check other relevant log tables or custom logs.

    • Update and Test: Make sure your VM and all extensions are updated to the latest version. Also, test the health state change by deliberately stopping the service and checking if the logs capture this event.

    • Enable Detailed Diagnostics: Increase the verbosity of diagnostics for the health monitoring extension if possible. This can provide more insights into what might be going wrong.

    • Azure Monitor Alerts: Set up Azure Monitor alerts based on health state changes. This can act as an additional layer to ensure you get notified of health state changes even if the HealthStateChangeEvent table remains empty.
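
    To check the endpoint itself, independently of the extension, here is a minimal Python sketch that probes a health URL and records each state change. It only mirrors the rule described in the question (HTTP 200 means healthy, anything else unhealthy); it is not the extension's actual logic. The function names probe_health and watch_health, and the URL in the usage note, are made up for illustration.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe_health(url: str, timeout: float = 5.0) -> str:
    """Return "Healthy" if the URL answers with HTTP 200, else "Unhealthy"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return "Healthy" if response.status == 200 else "Unhealthy"
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # HTTP error codes, refused connections, and timeouts all count as unhealthy.
        return "Unhealthy"


def watch_health(url: str, checks: int, interval: float = 5.0) -> list[tuple[float, str]]:
    """Probe the URL `checks` times and return (timestamp, state) for each state change."""
    changes = []
    last_state = None
    for i in range(checks):
        state = probe_health(url, timeout=interval)
        if state != last_state:
            # The first probe always counts as a change, so the initial state is recorded.
            changes.append((time.time(), state))
            last_state = state
        if i < checks - 1:
            time.sleep(interval)
    return changes
```

    For example, run watch_health("http://localhost:8080/health", checks=12, interval=5) on the VM (the URL is a placeholder for your own health URL) and stop the service partway through. If the returned list shows a switch from "Healthy" to "Unhealthy", the endpoint behaves as expected, and the workspace and diagnostic settings steps above are the ones to look at.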

    Example setup for Diagnostic Settings:

    1. Go to your VM in the Azure Portal.
    2. Click on "Diagnostic settings" under "Monitoring".
    3. Add a diagnostic setting, and ensure you select logs for both Azure Activity Log and Guest OS metrics.
    4. Route these logs to your Log Analytics workspace.

    If you've followed all these steps and still don't see data, get back to me.

     

    I hope this helps you 🙂

    Mathtias

     

  • asorrour 

     

    Refer to the points below when the HealthStateChangeEvent table is empty:

     

    • Resource Configuration: Ensure that the health monitoring extension is correctly configured for the specific service you’re monitoring. Double-check the extension settings and the health URL.
    • Ingestion and Storage: While there’s no direct cost for the guest health feature, there is a cost for ingesting and storing health state data in the Log Analytics workspace.
    • Resource Identifier: Sometimes, the data might not contain the associated resource identifier (such as VM availability status or health annotations) in the event metadata.
