Azure WAF – Masking Sensitive Data

Microsoft

Aug 21, 2023

Introduction

Azure Web Application Firewall (WAF) on Azure Application Gateway provides centralized protection for web applications against common vulnerabilities. You can monitor how your Azure WAF resources process traffic using the WAF logs, which are written to a designated destination such as a Log Analytics workspace, storage account, or partner solution. These logs typically contain requests that were matched or blocked by WAF rules, and they provide valuable information for monitoring, auditing, and troubleshooting. For ease of use and analysis, WAF logs are written in plain text by default.

Client requests to web applications can also contain sensitive data, such as personally identifiable information (PII): names, addresses, phone numbers, email addresses, social security numbers, credit card numbers, and so on. When a potentially malicious request is matched or blocked by a WAF rule, it is written to the WAF logs along with any identifiable information it carries. If plain text logs are not scrubbed of such data, that data is at risk of unauthorized access and disclosure.

The Azure WAF log scrubbing tool helps you remove sensitive data from your WAF logs. It uses a rules engine that lets you build custom rules to identify the specific portions of a request that contain sensitive data. Once identified, the tool scrubs that information from your logs and replaces it with *******. In this blog, we’ll cover examples of the log scrubbing feature using our Sensitive Data Lab, which you can find here.

Log Scrubbing

The log scrubbing feature works for all rule sets associated with the WAF policy. This includes the Core Rule Set (CRS), Default Rule Set (DRS), Bot Manager Ruleset, and custom rules. Several match variables are available, including the client IP address, headers, cookies, and arguments passed with the request.

When creating a rule, you select which match variable to use, which operator to use, and then you define the selector. The selector specifies the key whose value you want to remove from the logs. For example, a simple login typically uses username and password fields. Username and password are two separate keys, and each can be defined as a selector.

If a suspicious login attempt triggers the WAF, the WAF logs the username and password used during login if the malicious string or injection was sent in those fields. The log scrubbing engine replaces these values, so we won't be able to see the malicious string used in the attack. We can still identify other attributes of the attack, and the potential for PII to leak has been mitigated.
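The flow described above (choose a match variable, an operator, and a selector, then mask the matching value) can be sketched in Python. This is an illustrative model only, not how Azure WAF is implemented: the `ScrubRule` class, the `scrub_log_entry` function, the dictionary shape of the log entry, and the case-insensitive header comparison are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

MASK = "*******"  # replacement string mentioned in the introduction


@dataclass(frozen=True)
class ScrubRule:
    """Hypothetical model of one log scrubbing rule."""
    match_variable: str             # e.g. "Request Header Names"
    operator: str                   # "Equals" or "Equals any"
    selector: Optional[str] = None  # key whose value is masked


def scrub_log_entry(entry: dict, rules: list) -> dict:
    """Return a copy of a log entry with the selected values masked."""
    scrubbed = {
        "client_ip": entry["client_ip"],
        "headers": dict(entry.get("headers", {})),
    }
    for rule in rules:
        if rule.match_variable == "Request IP Address":
            scrubbed["client_ip"] = MASK
        elif rule.match_variable == "Request Header Names":
            for name in scrubbed["headers"]:
                # Assumption: header names are compared case-insensitively.
                matches = (
                    rule.operator == "Equals any"
                    or name.lower() == (rule.selector or "").lower()
                )
                if matches:
                    scrubbed["headers"][name] = MASK
    return scrubbed


entry = {
    "client_ip": "203.0.113.7",
    "headers": {"User-Agent": "datacha0s", "Accept": "*/*"},
}
rules = [
    ScrubRule("Request IP Address", "Equals any"),
    ScrubRule("Request Header Names", "Equals", "User-Agent"),
]
result = scrub_log_entry(entry, rules)
```

In this sketch only the client IP and the User-Agent value are masked; the Accept header is left untouched because no rule selects it.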

See below for the full list of Match Variables:

Match Variable	Operator	Selector
Request IP Address	Equals any	<None>
Request Header Names	Equals/Equals any	<Custom>
Request Cookie Names	Equals/Equals any	<Custom>
Request Arg Names	Equals/Equals any	<Custom>
Request Post Arg Names	Equals/Equals any	<Custom>
Request JSON Arg Names	Equals/Equals any	<Custom>
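As a rough illustration, the table above can be encoded as a lookup that checks whether a rule definition is consistent with it. The names used here are the display labels from the table, not API identifiers, and `validate_rule` is a hypothetical helper. The sketch also assumes that a selector is only required with the Equals operator.

```python
from typing import Optional

# Transcribed from the match variable table: allowed operators and
# whether the variable accepts a custom selector.
MATCH_VARIABLES = {
    "Request IP Address": {"operators": {"Equals any"}, "selector": False},
    "Request Header Names": {"operators": {"Equals", "Equals any"}, "selector": True},
    "Request Cookie Names": {"operators": {"Equals", "Equals any"}, "selector": True},
    "Request Arg Names": {"operators": {"Equals", "Equals any"}, "selector": True},
    "Request Post Arg Names": {"operators": {"Equals", "Equals any"}, "selector": True},
    "Request JSON Arg Names": {"operators": {"Equals", "Equals any"}, "selector": True},
}


def validate_rule(match_variable: str, operator: str,
                  selector: Optional[str] = None) -> list:
    """Return a list of problems; an empty list means the rule fits the table."""
    spec = MATCH_VARIABLES.get(match_variable)
    if spec is None:
        return [f"unknown match variable: {match_variable!r}"]
    errors = []
    if operator not in spec["operators"]:
        errors.append(f"{operator!r} is not allowed for {match_variable!r}")
    # Assumption: "Equals" needs a selector, "Equals any" does not.
    if operator == "Equals" and not selector:
        errors.append("operator 'Equals' needs a selector")
    if not spec["selector"] and selector:
        errors.append(f"{match_variable!r} does not take a selector")
    return errors
```

For example, `validate_rule("Request Header Names", "Equals", "User-Agent")` returns an empty list, while `validate_rule("Request Cookie Names", "Equals")` reports the missing selector.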

Request IP & Request Header

Let’s start with our first example, where we’ll look at how the scrubbing engine hides the requestor’s IP address, as well as the User-Agent that was used to trigger the WAF rule. The User-Agent being used is called ‘datacha0s’ and is a known vulnerability scanner according to OWASP. Our log scrubbing rule defines User-Agent as the selector for Request Header Names. This scrubs all instances of a User-Agent in the WAF logs, including our malicious agent ‘datacha0s’. This example is purely a demonstration of how the log scrubbing engine determines which values to scrub. We recommend defining selectors only for values that may contain PII or other sensitive information, which would not normally include the User-Agent request header.

With the rules defined and the feature enabled, we’ll send a request using Postman that will trigger a block by the WAF and then check on the logs. Our screenshot below shows a 403 Forbidden status code returned from the Azure WAF policy.

Viewing the logs, we can see the columns clientIp_s and details_data_s now have ***** as the value or part of the message. Although the logs no longer show which User-Agent was used in this attack, we can still identify that there was an attempt on the site using a vulnerability scanner, and where to find additional details in the OWASP GitHub repository. If you’re following along with the Sensitive Data lab linked previously, here is the Kusto query shown in the screenshot:

AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
| project TimeGenerated, Resource, requestUri_s, Message, details_message_s, details_file_s, details_line_s, clientIp_s, ruleSetType_s, ruleSetVersion_s, ruleId_s, ruleGroup_s, action_s, details_data_s, hostname_s, transactionId_g

Request IP, Request Cookie, & Request JSON Arg

In the next example, we’ll use Postman to see how cookies and JSON arguments are scrubbed from the logs. The log scrubbing rule defines ‘Cookie_1’ as the selector for Request Cookie Names and ‘password’ as the selector for Request JSON Arg Names. The log scrubbing engine only masks the value of cookies named Cookie_1, not all cookies that are sent to the site. You can define selectors for multiple cookies that an application uses if there is a potential for sensitive data leakage. In our example, the cookie sets off SQL injection rules because of its special characters, causing the WAF to generate a log for the match.
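The name-based behavior described above, where only the selected cookie is masked, can be sketched as follows. This is a simplified illustration that operates on a raw Cookie header string. The `scrub_cookie_header` function and the sample cookie values are hypothetical and do not reflect the WAF's internal implementation.

```python
MASK = "*******"


def scrub_cookie_header(cookie_header: str, selectors: set) -> str:
    """Mask the values of cookies whose names appear in `selectors`."""
    parts = []
    for pair in cookie_header.split(";"):
        # Split on the first "=" only, so "=" inside the value is preserved.
        name, sep, value = pair.strip().partition("=")
        if sep and name in selectors:
            value = MASK
        parts.append(f"{name}{sep}{value}")
    return "; ".join(parts)


# Hypothetical request: Cookie_1 carries a SQL injection string.
header = "Cookie_1=1' OR '1'='1; session=abc123"
scrubbed = scrub_cookie_header(header, {"Cookie_1"})
```

Here `scrubbed` keeps the `session` cookie as sent and replaces only the value of `Cookie_1` with the mask.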

Additionally, we have another SQL injection attack within the JSON body of the request. If your application uses JSON request bodies, you can define the JSON keys for the engine to scrub. In this case, the selector we chose is ‘password’, so the SQL injection attack that is embedded in the password value will be scrubbed from the logs.

This is a simple example in which our JSON request is only one level deep. If the password field were nested deeper in the JSON request, say under ‘properties’ and then ‘credentials’, the selector would look like this: properties.credentials.password

{
  "properties": {
    "credentials": {
      "email": "admin",
      "password": "' or 1=1--"
    }
  }
}
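To make the dotted selector concrete, here is a minimal Python sketch that walks a nested JSON body along a selector such as properties.credentials.password and masks the value it finds. The `scrub_json_path` function is hypothetical. Treating each dot as one level of nesting is an assumption based on the example above, not a description of the WAF's actual parser.

```python
import copy

MASK = "*******"


def scrub_json_path(body: dict, selector: str) -> dict:
    """Return a copy of `body` with the value at the dotted `selector` masked."""
    scrubbed = copy.deepcopy(body)
    *parents, leaf = selector.split(".")
    node = scrubbed
    for key in parents:
        # Stop quietly if the path does not exist in this request body.
        node = node.get(key) if isinstance(node, dict) else None
        if node is None:
            return scrubbed
    if isinstance(node, dict) and leaf in node:
        node[leaf] = MASK
    return scrubbed


body = {
    "properties": {
        "credentials": {"email": "admin", "password": "' or 1=1--"}
    }
}
scrubbed = scrub_json_path(body, "properties.credentials.password")
```

After the call, the password value in `scrubbed` is the mask, while the email value and the original `body` are unchanged.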

The WAF logs show our client IP scrubbed as well as the value for ‘Cookie_1’ and ‘password’. Even with multiple areas of the request triggering the incident, the log scrubbing engine still successfully removes the value from the logs.

Conclusion

The Azure WAF log scrubbing tool helps organizations maintain trust by keeping sensitive information and PII out of their logs. It reduces the risk of legal or regulatory violations that may result from exposing personal or confidential data, and it helps preserve the logs as a reliable source of evidence and troubleshooting information. Scrubbing sensitive information from logs is a good practice for every administrator who works with logging systems. To learn more about Azure WAF, check the resources linked below.