Azure Data Explorer Blog
Scale your Azure Firewall monitoring with Azure Data Explorer

GuillaumeBeaud
Sep 05, 2022

In this blog post, we’ll explore how Azure Data Explorer (ADX) can store and query logs from Azure Firewall and similar sources. The information is based on a recent implementation at a leading global manufacturing company that uses Azure Sentinel, Azure Log Analytics and ADX to store and process large volumes of Azure Firewall logs cost-effectively.

 

Motivation

 

As your cloud footprint grows, monitoring traffic among Azure services and VMs through a cybersecurity lens becomes imperative. Companies must do this while keeping costs under control and supporting the business's growth needs. Azure Firewall logs, like other logs, can quickly pile up and generate significant ingestion, storage, and query costs. These can become cash pits, as most logs do not generate direct value. Azure Monitor Log Analytics and Microsoft Sentinel are Azure’s built-in log management and security monitoring solutions, and both are built on Azure Data Explorer; this makes Azure Data Explorer a great companion for customers looking for a long-term, large-scale, cost-effective log storage and analytics solution.

 

ADX persists data in hot tier blob storage, supports Kusto Query Language (KQL), and can connect to various sources including Event Hub, Event Grid (Blob storage), IoT Hub or Logstash. Azure Sentinel and Log Analytics are built as SaaS-like solutions featuring automated data engineering, which provides out-of-the-box queries and parsers that make logs readily available for querying. When choosing ADX as a companion solution, customers will need to invest a small amount of data engineering effort to parse the logs and get them ready for querying for network and security monitoring. Customers will also need to maintain these connectors in the future, since any change in the log formats will require reworking the connectors.

 

Log Analytics Workspace (LAW) is a prerequisite to leverage Microsoft Sentinel, Azure’s SIEM/SOAR solution, which may position LAW as the preferred logs management solution for Azure Firewall logs. However, a second look at Azure Firewall logs reveals that some log categories present a security interest, while others are purely operational:

 

 

Depending on your requirements, “AZFWThreatIntel” and “AZFWIdpsSignature” can be defined as security-relevant logs to be sent to Sentinel, while the other log types are operational and can be directed to ADX. Please note: Microsoft Sentinel is a cloud-native SIEM solution that generates security insights and alerts based on multiple types of logs, not only those qualified as security-relevant. Therefore, reducing the set of logs in Sentinel may reduce the insights it generates. You may choose different ways of splitting logs based on your needs.

 

For a description of Azure Firewall log categories, please consult:

The most effective cost reduction strategy is to send security-relevant logs to LAW/Sentinel, and operations-relevant logs to ADX. Please note: operational logs represent the vast majority of the log volume, especially the type “AzureFirewallNetworkRule”. As such, this strategy offers significant cost reduction potential.
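
The split described above can be sketched as a simple routing rule. This is only an illustration of the idea, not part of the implementation: the two security categories come from the text, while the function name, the destination labels and the other category names used in the usage loop are assumptions.

```python
# Categories the text proposes to treat as security-relevant.
SECURITY_CATEGORIES = {"AZFWThreatIntel", "AZFWIdpsSignature"}


def destination_for(category: str) -> str:
    """Return the sink for a log category.

    Security-relevant categories go to Log Analytics / Sentinel,
    everything else is treated as operational and goes to ADX.
    The destination labels are hypothetical names for this sketch.
    """
    return "LogAnalytics" if category in SECURITY_CATEGORIES else "ADX"


if __name__ == "__main__":
    # AZFWNetworkRule and AZFWApplicationRule are assumed operational categories.
    for category in ("AZFWThreatIntel", "AZFWIdpsSignature",
                     "AZFWNetworkRule", "AZFWApplicationRule"):
        print(f"{category} -> {destination_for(category)}")
```

In practice this routing is not done in code: it is expressed by which categories you tick in each diagnostic setting. Adjust the security set to match your own requirements.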

 

Target Architecture

 

Proposed solution: Firewall logs are split between Log Analytics (security) and ADX (operational)

 

 

 

The proposed solution is to split logs using two different diagnostic settings at the source (e.g. Azure Firewall), and send operational logs to ADX via Event Hub. This architecture differs from the one documented in Azure Log Analytics Log Management using Azure Data Explorer - Microsoft Tech Community, as it:

  • Splits logs at the source
  • Considers each data pipeline independently
  • Uses ADX as a full-fledged replacement to Log Analytics for a subset of logs, not only as an archiving solution

Additionally, data can be mutually cross-queried between ADX and Log Analytics.

 

Costs of Proposed ADX Solution

 

The client is set to ingest 1 TB of logs daily (roughly 35 million events per month) in the North Europe region, which generates four main costs:

 

Item                                  Projected monthly cost (USD)
ADX cluster                           ~5k
Event Hub Throughput Units            ~500
Event Hub ingress events processing   ~950
Platform logs                         ~9k
Total                                 ~15k
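
As a quick check of the figures above, the four line items can be summed and expressed as shares of the total. The numbers are the approximate values from the table; the variable names are just for this sketch.

```python
# Approximate monthly costs in USD, taken from the table above.
monthly_costs_usd = {
    "ADX cluster": 5_000,
    "Event Hub Throughput Units": 500,
    "Event Hub ingress events processing": 950,
    "Platform logs": 9_000,
}

# Sum of the four items; the table rounds this to "~15k".
total_usd = sum(monthly_costs_usd.values())

# Fraction of the total contributed by each item.
shares = {item: cost / total_usd for item, cost in monthly_costs_usd.items()}

if __name__ == "__main__":
    print(f"Total: ~{total_usd / 1000:.1f}k USD per month")
    for item, share in shares.items():
        print(f"{item}: {share:.0%}")
```

The unrounded sum is about 15.5k USD, with platform logs and the ADX cluster accounting for most of it.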

 

 

Implementation Guide

 

Create an Event Hub

 

Follow the steps in “Azure Quickstart - Create an event hub using the Azure portal - Azure Event Hubs | Microsoft Docs” to create an Event Hub. Be mindful of the SKU (Standard or Premium) and the required processing capacity; throughput units for Standard, and processing units for Premium. For a PoC, we recommend using the Standard SKU and enabling the auto-inflate feature (only available in Standard SKU), with min 1 and max 40 TUs to have full elasticity and discover the peak capacity needed.
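
To get a rough lower bound on the throughput units (TUs) needed before running the PoC, you can convert the daily volume into an average ingress rate. The sketch below assumes decimal units (1 TB = 10^12 bytes), perfectly flat traffic, and the Standard tier ingress limit of 1 MB per second per TU; it ignores the per-TU limit on events per second. The function name is hypothetical.

```python
import math

BYTES_PER_TB = 10**12            # assumption: decimal terabyte
SECONDS_PER_DAY = 24 * 60 * 60
TU_INGRESS_BYTES_PER_S = 10**6   # Standard tier: up to 1 MB/s ingress per TU


def average_throughput_units(tb_per_day: float) -> int:
    """Minimum number of TUs if traffic were spread evenly over the day."""
    bytes_per_second = tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY
    return math.ceil(bytes_per_second / TU_INGRESS_BYTES_PER_S)


if __name__ == "__main__":
    # 1 TB/day is the daily volume quoted earlier in this post.
    print(average_throughput_units(1.0))
```

Real traffic is bursty, so the peak requirement will be higher than this average; that is the reason for letting auto-inflate discover the actual peak.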

 

Understand Azure Firewall logs and enable Structured logs

 

As of September 2022, a new type of Azure Firewall log, called "Structured logs", is in preview and can be generated in parallel with legacy log types. Here's an illustration of the logs currently available from the Firewall's Diagnostic settings:

 

 

Log types:

  • Red: legacy logs
    • Only 3 types cover the full set of logs (network, application, NAT, ThreatIntel, IDPS, DNS, FQDN resolution failure)
    • Examples here
  • Green: Structured logs
    • Preview feature stable for production: Structured logs doc
    • 8 different types, replacing the 3 legacy types
    • Examples here
  • Yellow: Fat Flow logs
    • Identifies top traffic flows
  • Blue: Aggregated logs for Policy Analytics
    • Policy Analytics doc (preview feature for Azure Firewall)
    • These logs must be sent to Log Analytics to be able to use Policy Analytics
    • Does not generate high amounts of logs

The new structured logs enable more robust data ingestion into ADX than legacy logs. To enable the feature flag, follow these instructions.

 

Enable Diagnostic Settings on the Source

 

In your Azure Firewall (or other service), create a new diagnostic setting targeting the Event Hub you previously created, and send operations-relevant logs to it. You may then remove these log types from the other diagnostic setting targeting Log Analytics to avoid duplication (especially cost duplication). With the structured logs feature enabled on your Firewall, your final setup should have one diagnostic setting sending security-relevant logs to LAW, and a second diagnostic setting sending operations-relevant logs to Event Hub:

 

First diagnostic setting sending security logs + aggregation hits to Log Analytics / Sentinel:

 

 

Second diagnostic setting sending operational logs to Event Hub / ADX:

 

If using legacy logs, the diagnostic setting for ADX should only have the first 3 log types enabled (Azure Firewall Application Rule, Azure Firewall Network Rule, Azure Firewall DNS Proxy).

 

 

Create an ADX Cluster, Tables, Data Connection and an Update Policy

 

Follow these steps to create an ADX cluster and a database: Quickstart: Create an Azure Data Explorer cluster and database | Microsoft Docs

 

Run these KQL queries in ADX: locate your ADX cluster > Databases > select the database you previously created > Query, paste the code into the query interface (Kusto Explorer or the ADX Web Explorer), and run each block:

Disclaimer: these update policies are provided on a best effort basis and are not officially maintained or supported by Microsoft.

 

Running the KQL queries results in the creation of two tables: rawFirewallLogs (used as the destination for raw logs from Event Hub) and the structured consumer-ready table, networkFirewallLogs, which is updated by an update policy every time new data lands in the raw table. The function ExtractMyLogs() is used by the update policy to parse the content of the raw logs’ nested JSON using parsing functions to turn them into well defined, typed and query-enabled columns in the table networkFirewallLogs.
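
To illustrate the kind of transformation the update policy performs, here is a minimal Python sketch of the same idea: expand the nested JSON of a raw message and project typed columns. It is not the Kusto code itself. The sample message, its field names, the output column names and the function name are all assumptions made for illustration, and real payloads may differ.

```python
import json
from datetime import datetime
from typing import Any, Dict, List

# Hypothetical sample of one Event Hub message carrying a single log record.
SAMPLE_MESSAGE = json.dumps({
    "records": [
        {
            "time": "2022-09-05T10:15:30Z",
            "category": "AZFWNetworkRule",
            "properties": {
                "Protocol": "TCP",
                "SourceIp": "10.0.0.4",
                "SourcePort": 50123,
                "DestinationIp": "10.1.0.5",
                "DestinationPort": 443,
                "Action": "Allow",
            },
        }
    ]
})


def extract_rows(message: str) -> List[Dict[str, Any]]:
    """Expand the 'records' array and return one typed row per record."""
    rows = []
    for record in json.loads(message).get("records", []):
        props = record.get("properties", {})
        rows.append({
            # The sample uses whole seconds; timestamps with more precision
            # may need a more tolerant parser.
            "TimeGenerated": datetime.fromisoformat(
                record["time"].replace("Z", "+00:00")),
            "Category": str(record["category"]),
            "Protocol": str(props.get("Protocol", "")),
            "SourceIp": str(props.get("SourceIp", "")),
            "SourcePort": int(props.get("SourcePort", 0)),
            "DestinationIp": str(props.get("DestinationIp", "")),
            "DestinationPort": int(props.get("DestinationPort", 0)),
            "Action": str(props.get("Action", "")),
        })
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE_MESSAGE):
        print(row)
```

In ADX the equivalent work happens inside the cluster at ingestion time, so queries against networkFirewallLogs never have to touch the raw JSON.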

After running the above Kusto code, your ADX setup should look like this:

 

 

 

You now need to connect the database to your Event Hub: Ingest data from event hub into Azure Data Explorer | Microsoft Docs. Your connection settings should look like this:

 

 

 

Optional step: set a retention policy to manage logs lifecycle

 

 

Operational logs can be retained in ADX for a defined period and then removed using a retention policy: Kusto retention policy controls how data is removed - Azure Data Explorer | Microsoft Docs. This is the ADX counterpart to the Archive tier in LAW.
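
As a sketch, the helper below renders the Kusto management command that sets a soft-delete period on a table. The helper name and the 90-day value are hypothetical; pick a period that matches your own retention requirements, and run the resulting command in the ADX query interface.

```python
def retention_command(table: str, soft_delete_days: int) -> str:
    """Render a Kusto command that sets a table's soft-delete retention period."""
    if soft_delete_days <= 0:
        raise ValueError("soft_delete_days must be positive")
    return (
        f".alter-merge table {table} policy retention "
        f"softdelete = {soft_delete_days}d recoverability = disabled"
    )


if __name__ == "__main__":
    # 90 days is an arbitrary example value, not a recommendation.
    print(retention_command("networkFirewallLogs", 90))
```

The same command can be applied to rawFirewallLogs with a shorter period if you do not need to keep the raw records once they have been parsed.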

 

 

Test the End-to-End Solution

 

It may take up to 20 minutes for the first logs to traverse the entire pipeline; then, running the query “networkFirewallLogs” in ADX should present the results below, where each column has been properly parsed from the raw logs:

 

 

 

Troubleshooting

 

If no logs are present in the networkFirewallLogs table, try inspecting the raw table to see if the problem may come from the update policy by running “rawFirewallLogs” in ADX, which should show raw logs in the table:

 

 

If no logs at all are showing up in ADX after an hour, you may look into Event Hub metrics to see if events are flowing through it or not:

 

 

 

Summary

 

Azure Data Explorer as a companion solution to Azure Monitor Log Analytics workspace and Azure Sentinel provides a cost effective and scalable operational log monitoring solution to complement your NOC/SOC/SIEM/SOAR processes and solutions. With ADX, some one-off data engineering steps are needed to parse raw logs into tables with strongly typed columns using an update policy.

This will allow you to run rich, type aware, Kusto queries on your firewall logs in ADX while reducing your overall Azure bill!

 

Special thanks to Devang Shah (ADX Principal Program Manager) and Fernando Merino (Cloud Solution Architect) for their support.

 
Updated Oct 04, 2022
Version 2.0