Move Your Microsoft Sentinel Logs to Long-Term Storage with Ease
Published Feb 02 2021 05:44 AM 244K Views
Microsoft

**UPDATE: Microsoft Sentinel now offers an easier and more efficient method to achieve long-term storage, so the solution described below is out of date. Please refer to the recommended solution: What's New: Search, Basic Ingestion, Archive, and Data Restoration are Now in Public Preview - Micro...**

*Attention: There is now an official feature that achieves the same goal as this solution. Please refer to these documents on the current capability:

Logic App: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/logs-export-logic-app

Continuous export: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/logs-data-export?tabs=portal

 

Out of the box, Microsoft Sentinel provides 90 days of data retention for free. In some parts of the world and within certain industries, regulations require organizations to retain data for 7 years or longer. The challenge is that the maximum retention for Log Analytics workspaces is currently 2 years. There has been a need for a solution that extends retention and saves cost by moving logs to cold storage. This blog details how logs from a Log Analytics workspace can easily be moved into long-term cold storage using this Playbook, in order to comply with retention standards as well as reduce costs.

 

End Results:

Logs are put into Blobs within folders that are labeled by data type.

This post is in-depth, as it breaks down how the Playbook operates.

 

TL;DR: Once the Playbook has run, the end result is a folder structure of Blobs within a storage account and container. The folders are labeled with, and contain, the data types that have been chosen for backup. Each Blob in a folder contains a backup of logs for that data type in hourly blocks. Each Blob can be queried in a Log Analytics workspace using the externaldata operator and a SAS token URL generated for the Blob.

 

*Note: If a Blob is 2 B in size, there was no information for that time block or data type.

 

Pre-requisites:

  • Log Analytics Workspace
  • Azure Storage Account or permissions to create a new one
  • One storage container or permissions to create one
  • The Logic App

For auditing and investigative purposes, raw data and logs may need to be stored long term for regulatory compliance. This can be achieved through a Playbook that queries the logs in the workspace that are about to expire and moves them to a storage account of your choosing. The Playbook uses the Log Analytics API to run its queries. Note that the API is limited to 500,000 rows of data per request, so any data type with more than 500,000 rows would require the Playbook to run more than one API pull to get the data. To avoid this issue, the Playbook breaks each day's data into hourly blocks so that all desired data is backed up efficiently and reliably. Each time the Playbook runs, it determines which data types in your workspace should be backed up, works through the logs for those data types, and moves them to storage, labeling each backup with the data type, the date of the logs, and the hour of the time block.
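To make that concrete, each hourly pull is shaped roughly like the KQL sketch below. This is illustrative only: AzureActivity is just an example table, and in the Playbook the table name and the hour value come from the ForEach and Until loops rather than from let statements.

    // Illustrative shape of one hourly export pull, not the template's literal query.
    // startDate corresponds to roughly 89 days ago; hourBlock plays the role of hoursCount (0-23).
    let startDate = startofday(ago(89d));
    let hourBlock = 0;
    AzureActivity
    | where TimeGenerated >= startDate + hourBlock * 1h
    | where TimeGenerated < startDate + (hourBlock + 1) * 1h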


 

The app comprises a number of steps and variables in order for it to work. The pieces are the following:

  • Recurrence trigger - Automatically run the app daily without needing manual input.
  • hoursCount - Counter variable that tracks which hour the Playbook is currently exporting. Starts at 0 for midnight and caps at 24.
  • startDate - The oldest date in the range that the queries will use. Its value is the current date minus 89 days (formatDateTime(addDays(utcNow(), -89),'yyyy-MM-dd')), which serves as the starting point of the time range to pull data from.
  • endDate - Similar to startDate, this variable serves as the tail end of the time range. Its value is the current date minus 88 days (formatDateTime(addDays(utcNow(), -88),'yyyy-MM-dd')), so the range between startDate and endDate runs from 89 days ago to 88 days ago. The purpose of this is to move logs to storage 24 hours before they are deleted from the workspace.
  • excludedDataType - Holds a list of data types that should not be backed up when the Playbook runs. The list is used in the query that checks which data types should be backed up. It is recommended to exclude data types that are noisy or not important.
  • Query #1 - Excludes the log types listed in the excludedDataType variable, then checks the usage details to see which data types have data in the workspace between 89 and 88 days ago (a sketch of this query appears after this list).
  • Compose - Composes the data for parsing for future use in the app.
  • Parse JSON - Parses the message from the query so that each item from the query results can be referenced later in the Playbook. The sample JSON for the step is provided in the template.
  • All of the details are then passed into a ForEach loop, which contains an Until loop that checks whether hoursCount has reached 24. Each time the logs for a data type are exported, the counter increments by 1 until it hits 24, and the Playbook then moves on to the next data type.
  • Query #2 - This query takes the value of each data type and pulls that day's logs for hours 0-23 in hourly blocks.
  • Compose #2 - The results are composed into a message for sending to the Blob.
  • Create Blob - A new Blob is created, titled with the data type, the date, and the hour the logs are from. The results are written to the new Blob, which is housed in the container that you have named.
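For reference, the usage check performed by Query #1 is roughly of the following shape. This is a hedged sketch rather than the template's literal query, and the excluded data types shown are placeholders, not the template's defaults.

    // Sketch of Query #1: list the data types that have data in the 89-to-88-days-ago window,
    // skipping anything in the exclusion list. The exclusions below are placeholders.
    let excludedDataType = dynamic(["AzureMetrics", "Heartbeat"]);
    Usage
    | where TimeGenerated between (startofday(ago(89d)) .. startofday(ago(88d)))
    | where DataType !in (excludedDataType)
    | distinct DataType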

Depending on how much data is in each table, the Playbook can take 2 to 10 minutes per table.

 

Deployment of the Template:

There are two options for deploying the template:

  • the deployment button directly in GitHub
  • manual deployment of the Playbook's JSON in the Azure Portal

If using the manual option, please refer to the README for the Playbook.

 

Using an Existing Storage Account:

 

In the event that an existing storage account is preferred, it can be set up within the Playbook. The following must be done:

  • Go to the Azure Portal
  • Go to the Playbook
  • Go to Logic App Designer
  • Go into the loop
  • Go to the Blob action
  • Click on 'change connection'
  • Click add new
  • Choose the desired storage account
  • Give the connection a name
  • Click create
  • Update the container name to be one from the existing storage account

 

Granting Permissions:

 

The Playbook needs the proper permissions to run. It requires at least the Storage Blob Data Contributor role in order to create new Blobs within the data container. To provide the permissions:

  • Go to the Azure Portal
  • Go to Logic Apps
  • Choose the Logic App
  • Go to identity
  • Enable 'system assigned' identity
  • Click save
  • Go to either the subscription or the resource group that the storage account that is being used is in
  • Go to access control
  • Click add
  • Choose add role assignment
  • Under 'assign access to' choose Logic App
  • Find your Logic App
  • Select it
  • Choose the role 'Storage Blob Data Contributor'
  • Click assign

 

Query the data:

Querying the data from 3 different blobs in order to see the data that was stored.

Once the data is in storage, it is still possible to query the data, although in a smaller capacity. Querying the data now uses an operator called 'externaldata', which requires a SAS token URL generated for the Blob in order to pull the data from it. The process also requires that each column be defined so that the data can be mapped to the correct column. An example of what the query would look like:

    let ActivityLogs = externaldata (TimeGenerated:datetime, OperationName:string,
        OperationNameValue:string, Level:string, ActivityStatus:string, SubscriptionID:string)
    [@"SAS TOKEN URL FOR BLOB"]
    with (format="multijson", recreate_schema=true);
    ActivityLogs

This query:

  • Creates a variable that will become the table
  • Establishes what the columns will be and their value type
  • Once the columns are made, the SAS token URL is inserted
  • Multiple URLs can be inserted in order to pull from multiple Blobs; just separate them with commas and keep each in its own quotation marks
    • [@"link 1","link 2"]
  • The query then calls the table that you have created, which pulls in the logs from the Blob and routes the data to the columns that were created

This query pulls the AzureActivity information from the Blob and maps the data to the associated columns. It is recommended to base your schema on the existing tables that were backed up in order to avoid issues with parsing the data. In this case, AzureActivity was used as the reference for the associated logs that were in storage.
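As a further example, pulling two hourly AzureActivity Blobs back into a single queryable table could look like the sketch below. The two SAS token URLs are placeholders that you would generate for your own Blobs (for example, Blobs named along the lines of AzureActivity-2021-02-01-0.json and AzureActivity-2021-02-01-1.json).

    // Sketch only: both SAS token URLs below are placeholders for your own Blobs.
    let ActivityLogs = externaldata (TimeGenerated:datetime, OperationName:string,
        OperationNameValue:string, Level:string, ActivityStatus:string, SubscriptionID:string)
    [
        @"SAS TOKEN URL FOR HOUR 0 BLOB",
        @"SAS TOKEN URL FOR HOUR 1 BLOB"
    ]
    with (format="multijson");
    ActivityLogs
    | where Level == "Error"
    | sort by TimeGenerated desc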

 

More information about external data: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/externaldata-operator

 

Summary:

 

With this Playbook, you now have control over moving logs into long-term storage. This is handy for existing data in the workspace that Sentinel uses, once the retention period for that data runs out. This, in combination with the ability to query data from cold storage, allows for regulatory compliance and reduced costs while maintaining Sentinel and Log Analytics usage for business operations.

 

This solution would not have been possible without the effort and great help from @Matt Egen, @Chris Boehm, and Rin Ure.

24 Comments
Microsoft

From my testing, the template is using an incorrect date. It should be -88 days, but it is currently listed as -28 days. Will be updating today.

 

 

*Issue has been fixed. If you have deployed the template, please make sure that you have the correct endDate of -88, not -28.*

Copper Contributor

Hi @Matt_Lowe ! Thanks so much for posting this.

 

After deploying the logic app, I'm getting a bad request error on the first run:

 

"Message": "User input is invalid. Please check query syntax, Chart Type or other parameters data. Response Status=HttpJsonResponse: ResultStatus=NotFound, Response Content={\r\n  \"error\": {\r\n    \"code\": \"SubscriptionNotFound\",\r\n    \"message\": \"The subscription 'XXX-XX-SUB1' could not be found.\"\r\n  }\r\n}\r\nclientRequestId: 8e129f9c-17a8-4972-ba79-166c1e6fbf49"
 
Microsoft

@acoggins Thanks for bringing this up. Is this error message tied to the Azure Monitor action? It appears there might have been an issue with getting the subscription ID into the parameters for the query. If you go back to the designer > Azure Monitor, make sure that the subscription name is listed there and has a drop down arrow and not an 'x'. If possible, please paste a screenshot of the issue so that I can make sure that I am helping you with the right thing.

 

Matt_Lowe_0-1592507302052.png

 

Copper Contributor

They do have an X and not a dropdown! How do I update that? @Matt_Lowe 

acoggins_0-1592507508445.png

 

Microsoft

@acoggins click the x's and the drop down option will appear. Choose the subscription that you were trying to choose the first time and the rest of the items should be fixed. If not, click on the x's for those too and choose the correct options that you were trying to use. Let me know if that helps!

Copper Contributor

@Matt_Lowe that solved it! Thank you so much! We were just discussing how to make sure our Sentinel logs were stored long term, so your posting is very timely :smile:

Microsoft

Glad this could be of use! Let me know how it goes in your environment and if you have any suggestions on improvements. 

Copper Contributor

@Matt_Lowe Thanks for sharing this ... I'm using this to move logs to storage from sentinel and want to use SP instead of user. Do you know what specific permissions I need to assign to the SP?

 

I tested it with AAD SP with higher privileges with some observations...

 

- On the connection steps, it doesn't let me save an SP-based connection in the designer view, but using the code view allows me to save it and I can move further...

- I had to authorize the two API endpoints for 'Azure Monitor'.

 

Copper Contributor

@Matt_Lowe  hi.

It seems the Azure Log Analytics connector for Logic Apps is DEPRECATED.

https://docs.microsoft.com/en-us/connectors/azureloganalytics/

Brass Contributor

@Matt_Lowe Thanks for the write-up, this is very helpful, and I think it would be great to have this capability built into Sentinel. We are running into one issue with our deployment of this. Under the step 'Run query and list results 2', we are receiving the following error for some of our data tables. It sounds like it means that the table is too large to call for that hour period.

 

BadRequest. Http request failed as there is an error: 'Cannot write more bytes to the buffer than the configured maximum buffer size: 104857600.'.

 

Thanks

Microsoft

Hi,

Did you get the following error resolved?

BadRequest. Http request failed as there is an error: 'Cannot write more bytes to the buffer than the configured maximum buffer size: 104857600.'.

What if we have more than 500,000 rows?

Copper Contributor

Interesting blog, thanks. How do you limit the retention of the data that you're moving to blob storage? Do you have an additional task to remove this after a set time period? For example, if there's a requirement to store for 6 months.

Microsoft

Hi there,

 

Good question. You can find the steps for setting lifecycle rules on Blobs here: Optimize costs by automating Azure Blob Storage access tiers | Microsoft Docs

 

Take care!

Copper Contributor

Thanks Matt!

Copper Contributor

Hi @Matt_Lowe, @Chris Boehm, Rin, thanks for this solution. It is the best I have seen for custom tables.

One question though: at what size of daily ingestion (GB/day) would you look to split the tables between multiple Logic Apps for performance? I'm just looking for an indicator, given the limits of the connector:

  • Log queries cannot return more than 500,000 rows.
  • Log queries cannot return more than 64,000,000 bytes.
  • Log queries cannot run longer than 10 minutes by default.
  • Log Analytics connector is limited to 100 calls per minute.
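As a rough way to gauge how close a table gets to those limits, a check along these lines could be run first (CommonSecurityLog here is only an example table):

    // Rough gauge only: peak rows per hour over the last day for an example table,
    // to compare against the 500,000-row query limit.
    CommonSecurityLog
    | where TimeGenerated > ago(1d)
    | summarize RowsPerHour = count() by bin(TimeGenerated, 1h)
    | summarize MaxRowsPerHour = max(RowsPerHour)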
Copper Contributor

Hi @Matt_Lowe, we're running into issues when we attempt to copy the CommonSecurityLog: we hit the max size issue when cutting the logs up by the hour. We changed it to chop the logs into 30-minute increments and now get the error message

"BadRequest. Http request failed as there is an error: 'Cannot write more bytes to the buffer than the configured maximum buffer size: 104857600.'"

 

Should we continue to decrease the time increments?

 

Many thanks!

Microsoft

@Rob_Walker Unfortunately there is a size limit for the API in the backend that hasn't been improved since this was made. The recommendation outside of lowering the time chunks is to have an export playbook for the different device vendors (Cisco, CheckPoint, etc) to move those logs by themselves in order to cut down on the total number of logs to move per run. 
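As a rough sketch (the vendor value is just a placeholder), each vendor-specific Playbook would add a filter along these lines to its hourly query:

    // Placeholder sketch: one export Playbook per DeviceVendor value.
    let startDate = startofday(ago(89d));
    CommonSecurityLog
    | where DeviceVendor == "Cisco"        // vary this per Playbook
    | where TimeGenerated >= startDate
    | where TimeGenerated < startDate + 1h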

 

Hopefully this helps but unfortunately with the API limitations in place there is no permanent solution. I would recommend looking into the continuous export feature moving forward for new data. 

 

Log Analytics workspace data export in Azure Monitor (preview) - Azure Monitor | Microsoft Docs

Copper Contributor

Hello,

 

I have a question. Is it possible to create Blobs not per hour (24 per day), but one *.json Blob per 12 hours, for example? So in the end I would have:
SecurityEvent-2021-07-16-12.json

SecurityEvent-2021-07-16-24.json

 

And when retrieving data from these JSON files, I would only have to import 2 SAS URLs, which together would cover 07/16/2021.

Brass Contributor

Is there anything that can be said about the throughput that can be reached with this or the now-mentioned official solutions? We are an enterprise doing about 30,000 events per second during the daytime (1.3 billion events per 24 hours).

Edit [Feb 2024]: we moved away from this solution as it is subject to the query API limits (64 MB / 500k rows maximum result set size). It cannot keep up. If we have to look back because we missed an execution for one reason or another, we never catch up. Resource-specific diagnostic logs are the approach for us now.

Copper Contributor

Hello,

 

Some limitations currently:

 

- For companies that ingest a lot of events per second, hourly doesn't work because the API will not work past a certain response size.

- In those scenarios, you'll very likely need to query every minute. Appending to one file per hour won't work because variable sizes are also limited, so you end up with one file per minute.

- If the number of tables is big, the runs will take days. We cannot use concurrency because variables are not instanced per run inside Logic Apps.

- MSFT Continuous Export does not cover all of the tables unfortunately.

 

Any current workarounds or better solutions?

Copper Contributor

Hello @GabrielNecula 

The first thing is to make sure you are getting logs of value and quality. 70% of the Sentinel environments I have inherited to maintain have not had their logs value-checked and tuned, and a lot of this comes down to the design decided for the environment. The main issue with any cloud-based SIEM is that it's easy to ingest logs and forget to tune them, the way we had to on-premises due to disk space. For example, if you have ASA logs, even Cisco recommends tuning down some events for a SIEM, such as teardowns. I have seen this type of tuning drop the events by around a third without any impact on the audit or security value of the logs, and it really helps customers with their ingestion costs.

The nice thing about this solution is that it can be mixed: for high load (sizes or EPS), you can use MSFT Continuous Export for the main items and this solution just for the custom tables, or you can have multiple runbooks that focus on set tables. The API is a limiting factor, but the design of the SIEM needs to be looked at as a whole for what you are trying to achieve.

Copper Contributor

Hello @Matt_Lowe

 

Thank you for this solution. I followed all the guidance, but my Blobs are only 2 B in size and I cannot find anything inside the JSON. If possible, can you please let me know why this is happening and how I can solve it?

HeinHtut_0-1640720361153.png

Best Regards,

Hein Htut

 

Copper Contributor

Hi @Matt_Lowe ,

 

Thank you for the solution. I tried this, and what I'm getting in my storage account is JSON files 2 B in size with nothing in them, the same as @HeinHtut mentioned in the previous comment.

 

I have read in your post that if a Blob is 2 B in size, there was no information for that time block or data type, but I have confirmed that there is enough information available.

 

Kindly revert with a solution. Thanks!

 

Vidyanand_1-1645614108349.png

 

Best Regards,

Vidyanand

 

 

 

Copper Contributor

What is the included/out-of-the-box Data Retention for Sentinel as of October 2023?
Is it 31 days? Or is it 90 days?
