📌 Introduction
Have you ever written a perfectly valid KQL query against the AzureDiagnostics table, only to find that one or more columns return empty results or don't exist at all? You double-check the column name, verify the resource filter, and everything looks right, yet the data simply isn't there.
This is one of the more silent and confusing behaviors of the AzureDiagnostics table in Azure Monitor Log Analytics, and it doesn't just affect one specific column: it can affect any column, including commonly queried ones like data_sensitivity_information_s, deadlock_xml_s, query_hash_s, error_message_s, and others.
In this post, I'll explain exactly why this happens, how to detect it, and how to write resilient KQL queries that reliably retrieve your data.
🔍 The Scenario: Two Queries, Two Different Outcomes
Here's a real-world example from a customer case involving Azure SQL audit logs and data sensitivity classification.
Query 1: The "Non-Working" Query
AzureDiagnostics
| where Category == 'SQLSecurityAuditEvents'
| where ResourceId == '/SUBSCRIPTIONS/<SubscriptionID>/RESOURCEGROUPS/<RG>/PROVIDERS/MICROSOFT.SQL/SERVERS/<ServerName>/DATABASES/<DBName>'
| project
event_time_t,
statement_s,
succeeded_s,
affected_rows_d,
server_principal_name_s,
client_ip_s,
application_name_s,
additional_information_s,
data_sensitivity_information_s
| order by event_time_t desc
| take 100
data_sensitivity_information_s returns empty, or the column doesn't exist at all in the schema.
And as you can see, this isn't restricted to a single column: affected_rows_d also failed to resolve.
Query 2: The "Working" Query
AzureDiagnostics
| where Category == "SQLSecurityAuditEvents"
| where ResourceId == "/SUBSCRIPTIONS/<SubscriptionID>/RESOURCEGROUPS/<RG>/PROVIDERS/MICROSOFT.SQL/SERVERS/<ServerName>/DATABASES/<DBName>"
| extend DataSensitivityInfo = tostring(parse_json(AdditionalFields).data_sensitivity_information)
| where isnotempty(DataSensitivityInfo)
| project event_time_t, DataSensitivityInfo
| take 100
This one works perfectly. The difference? It reads from AdditionalFields instead of directly from the column.
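The same bag-extraction approach works for any spilled field, not just data sensitivity. As a sketch, assuming the affected-rows count lands in the bag under the key affected_rows (the suffix-stripped form of affected_rows_d; confirm the exact key in your own workspace before relying on it):

```kusto
// Sketch: pull a different spilled field (affected rows) out of AdditionalFields.
// The bag key "affected_rows" is an assumption - verify it with bag_keys() first.
AzureDiagnostics
| where Category == "SQLSecurityAuditEvents"
| extend AffectedRows = tolong(parse_json(AdditionalFields).affected_rows)
| where isnotnull(AffectedRows)
| project event_time_t, statement_s, AffectedRows
| take 100
```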
💡 Important: This is NOT a bug with data_sensitivity_information_s specifically. This behavior can happen to any column in the AzureDiagnostics table, and understanding why is key to writing reliable audit queries.
🧩 Root Cause: The AzureDiagnostics 500-Column Limit, a Default Platform Behavior
The AzureDiagnostics table in Azure Monitor Log Analytics is a shared, multi-resource table. It collects diagnostic data from every Azure resource type that sends logs to your workspace (Azure SQL, Key Vault, App Service, Storage Accounts, Firewalls, and many more). Each resource type contributes its own set of columns, all landing in this single table.
Because of this design, the AzureDiagnostics table enforces a hard limit of 500 columns per workspace. This is a default, platform-wide behavior, not specific to any one Azure service or column, and it is designed to prevent data loss when the number of active columns exceeds the 500-column limit.
Here's what happens when that limit is reached:
| Workspace Column Count | Behavior |
|---|---|
| < 500 columns | New columns are created normally (e.g., data_sensitivity_information_s) |
| ≥ 500 columns | Any new or overflow column data is silently redirected into a dynamic JSON property bag called AdditionalFields |
⚠️ This redirection is completely silent. No error is thrown. No warning appears in your query results. The data is NOT lost; it's simply stored differently, inside AdditionalFields as a JSON key-value pair.
This means that in a busy Log Analytics workspace shared across many Azure resource types, virtually any column that exceeds the 500-column threshold will have its data moved to AdditionalFields, whether it's data_sensitivity_information_s, deadlock_xml_s, query_hash_s, error_message_s, or any other field your resource type emits.
⚠️ "Because AzureDiagnostics is a single shared table, all Azure resource types sending logs to your workspace (Azure Firewall, Key Vault, App Service, AKS, and others) contribute their columns to the same 500-column pool. As Microsoft's documentation notes, this is precisely why AzureDiagnostics is 'much more susceptible to exceeding the 500-column limit' compared to other tables: your SQL columns can spill even if your SQL workload alone would never breach the limit."
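A quick way to see which state your workspace is in for a given field is to count the rows where the dedicated column is populated versus the rows where the value sits in the bag. A sketch, using the post's data-sensitivity column as the example (substitute the column and key you care about):

```kusto
// Compare where the data actually lands: dedicated column vs. AdditionalFields.
// column_ifexists() returns the supplied default ("") when the column is absent,
// so this query doesn't fail in workspaces where the column was never created.
AzureDiagnostics
| where Category == "SQLSecurityAuditEvents"
| summarize
    InDirectColumn = countif(isnotempty(column_ifexists("data_sensitivity_information_s", ""))),
    InAdditionalFields = countif(AdditionalFields has "data_sensitivity_information")
```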
🔍 Step 1: Diagnose - Has Your Workspace Hit the 500-Column Limit?
Run this query in your Log Analytics workspace to check the current column count:
// Check total column count in AzureDiagnostics
AzureDiagnostics
| getschema
| summarize
TotalColumns = count(),
RemainingCapacity = 500 - count(),
LimitReached = iff(count() >= 500, "⚠️ YES - Data is going to AdditionalFields", "✅ NO - Under the limit")
Then check whether a specific column you're looking for exists in the schema at all:
// Replace your_column_name with any column you're investigating
AzureDiagnostics
| getschema
| where ColumnName == "your_column_name"
If this returns no rows, that column's data has been redirected to AdditionalFields. The same check applies to any column you suspect is missing.
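You can also verify several suspect columns in one pass; any name missing from the output has been redirected. The list below is illustrative, so substitute the columns your own queries depend on:

```kusto
// Check a batch of columns at once; missing rows mean the data spilled to AdditionalFields
AzureDiagnostics
| getschema
| where ColumnName in ("data_sensitivity_information_s", "affected_rows_d", "deadlock_xml_s", "query_hash_s")
| project ColumnName, ColumnType
```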
Use this query to discover all fields currently spilled to AdditionalFields for SQL audit events:
AzureDiagnostics
| where Category == "SQLSecurityAuditEvents"
| where isnotempty(AdditionalFields)
| extend ParsedAF = parse_json(AdditionalFields)
| extend AF_Keys = bag_keys(ParsedAF)
| mv-expand AF_Keys to typeof(string)
| summarize SpilledColumns = make_set(AF_Keys), Count = count() by Category
| project Category, Count, SpilledColumns
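To gauge how often each spilled field actually appears, which helps prioritize which queries to update first, you can expand the bag keys and count events per key. A sketch over a recent window:

```kusto
// Frequency of each spilled field across recent SQL audit events
AzureDiagnostics
| where TimeGenerated > ago(1d)
| where Category == "SQLSecurityAuditEvents"
| where isnotempty(AdditionalFields)
| mv-expand Key = bag_keys(parse_json(AdditionalFields)) to typeof(string)
| summarize Events = count() by Key
| order by Events desc
```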
✅ Step 2: Fix - Use AdditionalFields to Retrieve Any Spilled Column
Once you've confirmed the spill, update your queries to extract the missing field from AdditionalFields using parse_json. This approach works for any column that has been redirected.
Recommended Pattern (Resilient to Both States):
AzureDiagnostics
| where Category == "SQLSecurityAuditEvents"
// column_ifexists() prevents a failure when the column is missing from the schema entirely
| extend
DataSensitivityInfo = coalesce(column_ifexists("data_sensitivity_information_s", ""), tostring(parse_json(AdditionalFields).data_sensitivity_information)),
QueryHash = coalesce(column_ifexists("query_hash_s", ""), tostring(parse_json(AdditionalFields).query_hash)),
AdditionalInfo = coalesce(column_ifexists("additional_information_s", ""), tostring(parse_json(AdditionalFields).additional_information))
| project
event_time_t,
statement_s,
succeeded_s,
server_principal_name_s,
client_ip_s,
application_name_s,
DataSensitivityInfo,
QueryHash,
AdditionalInfo
| order by event_time_t desc
| take 100
A shorter version:
AzureDiagnostics
| where Category == "SQLSecurityAuditEvents"
| where ResourceId == "/SUBSCRIPTIONS/<SubscriptionID>/RESOURCEGROUPS/<RG>/PROVIDERS/MICROSOFT.SQL/SERVERS/<ServerName>/DATABASES/<DBName>"
| extend DataSensitivityInfo = tostring(parse_json(AdditionalFields).data_sensitivity_information)
| where isnotempty(DataSensitivityInfo)
| project event_time_t, DataSensitivityInfo
| take 100
💡 coalesce() tries the direct column first; if it's null or empty, it falls back to AdditionalFields. This makes your query portable across workspaces, whether they're at the limit or not.
📚 References
- Azure Monitor Logs: AzureDiagnostics table reference
- SQLSecurityAuditEvents table reference
- Analyze Azure SQL audit logs in Log Analytics
- Resource logs in Azure Monitor - Azure Monitor | Microsoft Learn