Move Your Azure Sentinel Logs to Long-Term Storage with Ease

Published Feb 02 2021 05:44 AM 27.3K Views

*Attention: There is an official solution that achieves the goal of this solution. Please refer to these documents on the current feature that can be used: 

Logic App:

Continuous export:


Out of the box, Azure Sentinel provides 90 days of data retention for free. In some parts of the world and within certain industries, there are regulations that organizations must adhere to which require data retention up to 7 years or longer. The current challenge is that the max retention for Log Analytics workspaces is 2 years. There has been a need for a solution that will allow for more time and cost saving by moving logs to cold storage. This blog is going to detail how logs from a Log Analytics workspace can easily be moved into long-term cold storage in order to comply with retention standards as well as reduce costs using this Playbook.


End Results:

Logs are put into Blobs within folders that are labeled by data type.Logs are put into Blobs within folders that are labeled by data type.

This post is going to be in-depth as it breaks down how the Playbook is going to operate.


TL:DR The end result once the Playbook is run is a folder system with Blobs within a storage account and container. The folders are labeled with and contain the data types that have been chosen for back up. Each Blob in the folder will contain a backup of logs for each data type in hourly blocks. Each Blob can be queried in a Log Analytics workspace using the externaldata operator and a SAS token URL generated for the Blob.


Link for the Playbook if needed:


*Note: If your Blobs are a size of 2B, there was no information for that time block or data type.



  • Log Analytics Workspace
  • Azure Storage Account or permissions to create a new one
  • One storage container or permissions to create one
  • The Logic App

For auditing and investigative purposes, raw data and logs may need to be stored long term for regulatory compliance. This can be achieved through a Playbook that queries the logs from the workspace that are about to expire and moves them to a storage account of your choosing. The Playbook utilizes Log Analytics API when performing the query. An important piece of information to note is that the API has a limitation of 500,000 rows of data per request. This means that if any data type contains more than 500,000 rows of information, it will require the Playbook to run a Log Analytics API pull more than once to get the data. In order to avoid this issue, the Playbook breaks up data for each day into hourly blocks in order to efficiently and carefully back up all desired data. Each time that the Playbook is run, it reviews each data type within your workspace that should be backed up, goes through the logs for those data types, and moves the logs to storage, labeling each backup with the data type, the date of the log, and the hour for the time block.




The app is comprised of many steps and variables in order for it to work. The pieces are the following:

  • Recurrence trigger - Automatically run the app daily without needing manual input.
  • hoursCount - Counter variable that is used for tracking which hour the Playbook is on for exporting data. Starts at 0 for midnight and caps at 24.
  • startDate - Oldest date in a range that the queries will be using. Value is set to the current date minus 89 days (formatDateTime(addDays(utcNow(), -89),'yyyy-MM-dd')). This serves as the starting point for the time range to pull data from.
  • endDate - Similar to startDate, this variable serves to be the tail end of the time range. This value is set to the current date minus 88 days (formatDateTime(addDays(utcNow(), -88),'yyyy-MM-dd')). The time range between startDate and endDate are now 89 days ago and 88 days ago. The purpose for this is to move logs to storage 24 hours before they are deleted from the workspace.
  • excludedDataType - This variable is used to hold a list of data types that should not be backed up when running the Playbook. The list of options will be used in the query that checks to see which data types should be backed up. It is recommended that the data types chosen are logs that are noisy or are not important.
  • Query #1 - This query sets which log types should not be included based on the excluded variable. The query then checks the usage details to see which data is present in the workspace between 89 days ago and 88 days ago.
  • Compose - Composes the data for parsing for future use in the app.
  • Parse JSON - Parses the message from the query so that each item from the query results can be referenced later in the Playbook. The sample JSON for the step is provided in the template.
  • All of the details are then passed into a ForEach loop that then goes into an until loop that checks if hoursCount is 24. Each time that the logs are exported for a data type, it will increment the counter by 1 until it hits 24, then moves onto the next data type.
  • Query #2 - This query takes in the value of each data type and takes the logs from hours 0-23 for the day in hourly blocks.
  • Compose #2 - The results are composed into a message for sending to the Blob.
  • Create Blob - A new Blob is created with the title of the data type, the date, and the hour for which the logs are from. These results are sent to the new Blob and will be housed in the container that you have named.

Depending on how much data is within each table, the Playbook can take 2 to 10 minutes per table depending on how much each table has.


Deployment of the Template:

There are two options for deploying the template:

  • deployment button directly in GitHub
  • manual deployment via deploying the JSON of the Playbook in the Azure Portal

If using the manual option, please refer to the README for the Playbook.


Using an Existing Storage Account:


In the event that an existing storage account is preferred, it can be set up within the Playbook. The following must be done:

  • Go to the Azure Portal
  • Go to the Playbook
  • Go to Logic App Designer
  • Go into the loop
  • Go to the Blob action
  • Click on 'change connection'
  • Click add new
  • Choose the desired storage account
  • Give the connection a name
  • Click create
  • Update the container name to be one from the existing storage account


Granting Permissions:


The Playbook is going to need proper permissions to run. It requires at least Blob Storage Contributor in order to create new Blobs within the data container. In order to provide the permissions:

  • Go to the Azure Portal
  • Go to Logic Apps
  • Choose the Logic App
  • Go to identity
  • Enable 'system assigned' identity
  • Click save
  • Go to either the subscription or the resource group that the storage account that is being used is in
  • Go to access control
  • Click add
  • Choose add role assignment
  • Under 'assign access to' choose Logic App
  • Find your Logic App
  • Select it
  • Choose the role 'Blob storage data contributor'
  • Click assign


Query the data:

Querying the data from 3 different blobs in order to see the data that was stored.Querying the data from 3 different blobs in order to see the data that was stored.

Once the data is in storage, it is still possible to query the day but in a smaller capacity. Querying the data now uses an operator called ‘externaldata’ which requires you to use a SAS token URL generated by the Blob in order to pull the data from it. The process also requires that each column be defined so that it can properly map the data to the correct column. An example of what the query would look like would be:

  • let ActivityLogs = externaldata (TimeGenerated:datetime, OperationName:string, OperationNameValue:string, Level:string, ActivityStatus:string, SubscriptionID:string)[@"SAS TOKEN URL FOR BLOB"] with (format="multijson",recreate_schema=true); ActivityLogs

This query

  • Creates a variable that will become the table
  • Establishes what the columns will be and their value type
  • Once the columns are made, the SAS token URL is inserted
  • Multiple URLs can be inserted in order to pull from multiple Blobs. Just separate the two by a comma and have them in their own quotations
    • [@”link 1”,”link 2”]
  • The query then calls the table that you have created, this pulls in the logs from the Blob and routes the data to the columns that were created

This query is going to pull the AzureActivity information from the Blob and maps the data to the associated columns. It is recommended to base your schema on the existing tables that were backed up in order to avoid any issues with parsing the data. In this case, AzureActivity was used as the reference for the associated logs that were in storage.


More information about external data:




With this Playbook, you now have control over logs being moved into long-term storage. This will be handy for existing data that is in the workspace that Sentinel is using when the retention period for the data runs out. This, in combination with the ability to query data from cold storage, allows for regulatory compliance and reduced costs while maintaining Sentinel log and Log Analytics usage for business operations.


This solution would not have been possible without the effort and great help from @Matt Egen@Chris Boehm , and Rin Ure.


Date from template is using incorrect date from my testing. It should be -88 days, it is currently listed as -28 days. Will be updating today.



*Issue has been fixed. If you have deployed the template, please make sure that you have the correct enddate of -88, not -28*

Regular Visitor

Hi @Matt_Lowe ! Thanks so much for posting this.


After deploying the logic app, I'm getting a bad request error on the first run:


"Message": "User input is invalid. Please check query syntax, Chart Type or other parameters data. Response Status=HttpJsonResponse: ResultStatus=NotFound, Response Content={\r\n  \"error\": {\r\n    \"code\": \"SubscriptionNotFound\",\r\n    \"message\": \"The subscription 'XXX-XX-SUB1' could not be found.\"\r\n  }\r\n}\r\nclientRequestId: 8e129f9c-17a8-4972-ba79-166c1e6fbf49"

@acoggins Thanks for bringing this up. Is this error message tied to the Azure Monitor action? It appears there might have been an issue with getting the subscription ID into the parameters for the query. If you go back to the designer > Azure Monitor, make sure that the subscription name is listed there and has a drop down arrow and not an 'x'. If possible, please paste a screenshot of the issue so that I can make sure that I am helping you with the right thing.




Regular Visitor

They do have an X and not a dropdown! How do I update that? @Matt_Lowe 




@acoggins click the x's and the drop down option will appear. Choose the subscription that you were trying to choose the first time and the rest of the items should be fixed. If not, click on the x's for those too and choose the correct options that you were trying to use. Let me know if that helps!

Regular Visitor

@Matt_Lowe that solved it! Thank you so much! We were just discussing how to make sure our Sentinel logs were stored long term, so your posting is very timely :smile:


Glad this could be of use! Let me know how it goes in your environment and if you have any suggestions on improvements. 

Occasional Visitor

@Matt_Lowe Thanks for sharing this ... I'm using this to move logs to storage from sentinel and want to use SP instead of user. Do you know what specific permissions I need to assign to the SP?


I tested it with AAD SP with higher privileges with some observations...


- On the connection steps it doesn't let me save it if I do a SP based connection in the 'designer view' but using the code view allows to save it and I can move further... 

- I have to Authorize the two API endpoints for 'azure monitor'


Occasional Visitor

@Matt_Lowe  hi.

Seems  Azure Log Analytics  Connector for LogicApps DEPRECATED.


@Matt_Lowe Thanks for the write-up, this is very help and I think it would be great to have this capability built into Sentinel. We are running into one issue with our deployment of this. Under the step 'Run query and list results 2' we are receiving the following error for some of our data tables. It sounds like it means that the table during the hour period is too large to call.


BadRequest. Http request failed as there is an error: 'Cannot write more bytes to the buffer than the configured maximum buffer size: 104857600.'.





Did you get 

BadRequest. Http request failed as there is an error: 'Cannot write more bytes to the buffer than the configured maximum buffer size: 104857600.'.


Resolved ?


If we have more than 500,000 rows  ?

Senior Member

Interesting blog, thanks. How do you limit the retention of the data that you're moving to blob storage? Do you have an additional task to remove this after a set time period? For example, if there's a requirement to store for 6 months.


Hi there,


Good question. You can find the steps for setting rules on Blobs: Optimize costs by automating Azure Blob Storage access tiers | Microsoft Docs


Take care!

Senior Member

Thanks Matt!

Senior Member

Hi @Matt_Lowe @Chris Boehm ,Rin thinks for this solution it is the best i have seen for custom tables   

one question thought at what sizes of daily ingestion GB/Day would you look to split the tables between multiple logic apps for performance. just looking for an indicator due to the limits of the connector

  • Log queries cannot return more than 500,000 rows.
  • Log queries cannot return more than 64,000,000 bytes.
  • Log queries cannot run longer than 10 minutes by default.
  • Log Analytics connector is limited to 100 call per minute.
Occasional Visitor

Hi @Matt_Lowe , we're running into issues where when we attempt to copy the CommonSecurityLog, we are hitting the max size issue when cutting the logs up by the hour. We changed it to chop the logs by 30 minute increments and now get the error message 

"BadRequest. Http request failed as there is an error: 'Cannot write more bytes to the buffer than the configured maximum buffer size: 104857600.'"


Should we continue to decrease the time increments?


Many thanks!


@Rob_Walker Unfortunately there is a size limit for the API in the backend that hasn't been improved since this was made. The recommendation outside of lowering the time chunks is to have an export playbook for the different device vendors (Cisco, CheckPoint, etc) to move those logs by themselves in order to cut down on the total number of logs to move per run. 


Hopefully this helps but unfortunately with the API limitations in place there is no permanent solution. I would recommend looking into the continuous export feature moving forward for new data. 


Log Analytics workspace data export in Azure Monitor (preview) - Azure Monitor | Microsoft Docs

New Contributor



I have a question. Is there possibility to create blobs not for 1 hour for (24h timestamp), but create one *.json blob for example divided in 12h. So in the end I will have



And when retrieving data from this json I will have to import 2 SAS URL which will have info for 07.16.2021

Version history
Last update:
‎Feb 02 2021 05:43 AM
Updated by: