Check the health of your exported Azure Sentinel logs in your ADX cluster

Published Sep 07 2021 08:59 AM 1,328 Views
Microsoft

More and more Azure Sentinel customers are opting for long-term retention of their logs in Azure Data Explorer (ADX), either due to compliance regulations, or because they still want to be able to perform investigations on their archived logs in the event of a security incident.

As the Azure Sentinel ingestion price includes 90 days of retention for free, the option of keeping the logs for longer periods in Azure Data Explorer is preferred by many (see Using Azure Data Explorer for long term retention of Azure Sentinel logs - Microsoft Tech Community). 

 

Even though the Azure Sentinel + ADX solution requires little to no maintenance, we wanted to give our customers a way to keep an eye on the number of events and the overall status of their ADX clusters and databases. For this reason, we have created two tools: the ADXvsLA workbook and the ADX Health Playbook. The workbook lets you compare the number of logs in Azure Sentinel and in ADX and check the overall health of your ADX cluster. The playbook sends you a warning if it detects an unexpected delay in ingestion into ADX.

 

 

Below, we will describe both in more detail:

 

ADXvsLA Workbook

 

When you open the workbook, you can select the following parameters:

  • the ADX cluster and database
  • the Azure Sentinel workspace from which the logs are exported to that ADX cluster
  • the time range for which you want to see data

Use the Show Help toggle to see a detailed explanation of each section.

 

img1.png

Raw Tables

When you export logs from Azure Sentinel to ADX, they are first ingested into an intermediate table that holds the raw data. An update policy runs a function that transforms this raw data and saves it to the destination table with the correct mapping. Afterwards, the raw data is deleted, which is why these raw tables are typically empty. The retention policy of the raw tables should also be set to 0 days.
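The raw-to-final flow described above can be sketched as a small Python helper that builds the two ADX management commands involved. This is a minimal sketch, not the setup used by the article's tooling: the table names `HeartbeatRaw` and `Heartbeat` and the function name `HeartbeatExpand` are hypothetical, and the exact policy properties you need may differ in your cluster.

```python
import json


def raw_table_commands(raw_table: str, final_table: str, expand_function: str) -> list[str]:
    """Build the ADX management commands for one raw -> final table pair.

    All names passed in are placeholders chosen for illustration.
    """
    update_policy = [{
        "IsEnabled": True,
        "Source": raw_table,
        "Query": f"{expand_function}()",
        "IsTransactional": True,
    }]
    return [
        # Run the expand function on data ingested into the raw table
        # and write the result to the final table.
        f".alter table {final_table} policy update @'{json.dumps(update_policy)}'",
        # Keep no data in the raw table (retention of 0 days).
        f".alter-merge table {raw_table} policy retention softdelete = 0d",
    ]


commands = raw_table_commands("HeartbeatRaw", "Heartbeat", "HeartbeatExpand")
for command in commands:
    print(command)
```

The helper only produces command text; you would still run the commands against your own database and adapt them to your table and function names.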

 

img2.png

 

Final ADX Tables

In this section, you will see information about the final ADX tables, which have the right schema and can be queried from Azure Sentinel. This includes the row count, size, retention policy, and hot cache size of each table.

 

img3.png

 

Select one of the table names to generate the comparison section. This is where you can see the differences between the table on ADX and on your Log Analytics workspace. Then, select the time range for which you want to see the comparison.

In the table you will find:

  • The number of entries in ADX, in Log Analytics, and the difference in number of logs between them.
  • How long it has been since the last log was received.
  • The timestamp of the last logs.
  • The number of new logs received in Log Analytics since the last log in ADX was received.

img4.png

 

Notice the "New in Log Analytics" column:

    • In the screenshot, the "New in Log Analytics" column shows 52 logs. This means that, at the time the tables were compared, 52 entries had not yet reached ADX.
      If this happens, check the difference between the timestamps of the last log received on each side. In this case, it is around 15 minutes. Delays of 30 minutes or less are expected, so the tables are working as expected.
    • You may also see a negative number in the "New in Log Analytics" column. This can happen if, due to the lag in ADX, logs that arrived in Log Analytics during the previous period reached ADX during the current period. Suppose you ingested 1000 logs into Log Analytics in the previous 24h window, but only 990 reached ADX in that period; you then ingested 1000 logs in the current 24h window, and all of those logs, plus the 10 logs from the previous day, reached ADX. In this case, the "New in Log Analytics" column shows -10. When this happens, you only need to look at the LastTM difference. If it is around 30 minutes or less, the export is fine.
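The arithmetic in the two cases above can be worked through in a few lines of Python. This is a sketch under an assumption: it treats "New in Log Analytics" as a plain count difference per window, which is how the worked example reads, and the workbook's actual query may compute it differently. The function names are made up for illustration.

```python
from datetime import datetime, timedelta

# Tolerance taken from the text: delays of about 30 minutes or less are expected.
EXPECTED_LAG = timedelta(minutes=30)


def new_in_log_analytics(la_count: int, adx_count: int) -> int:
    """Assumed definition: logs counted in Log Analytics minus logs counted in ADX."""
    return la_count - adx_count


def lag_is_expected(last_la: datetime, last_adx: datetime,
                    tolerance: timedelta = EXPECTED_LAG) -> bool:
    """True when the newest log on each side is at most `tolerance` apart."""
    return abs(last_la - last_adx) <= tolerance


# Previous 24h window: 1000 logs in Log Analytics, 990 of them reached ADX in time.
previous = new_in_log_analytics(la_count=1000, adx_count=990)
# Current 24h window: 1000 new logs, and ADX also receives the 10 late ones.
current = new_in_log_analytics(la_count=1000, adx_count=1000 + 10)
print(previous, current)  # 10 -10

# Screenshot case: the last ADX log is about 15 minutes behind Log Analytics.
last_la = datetime(2021, 9, 7, 9, 0)
last_adx = last_la - timedelta(minutes=15)
print(lag_is_expected(last_la, last_adx))  # True
```

In both cases the count difference alone is not a sign of trouble; the timestamp lag is the value that decides whether the export is keeping up.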

     

Finally, at the bottom of the workbook you will see cluster metrics such as events received, events dropped, received data, and volume.

 

ADX Health Playbook

 

The ADX Health Playbook periodically (every 24h by default) compares the number of logs in your Azure Sentinel tables and ADX tables, and sends you a warning by email if it detects a difference that may require your attention (that is, a difference in the "New in Log Analytics" column described above). Because logs take a few minutes to reach ADX after being ingested into Log Analytics, the query in the playbook by default covers the period from 25 hours ago to 30 minutes ago.
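To make the default look-back period concrete, here is a minimal Python sketch that computes the window boundaries for a given run time. The function name and the example run time are assumptions for illustration; the playbook itself defines the window inside its own query.

```python
from datetime import datetime, timedelta, timezone


def lookback_window(now: datetime,
                    start_offset: timedelta = timedelta(hours=25),
                    end_offset: timedelta = timedelta(minutes=30)):
    """Return (start, end) of the comparison window relative to `now`.

    The most recent 30 minutes are left out so that logs still on their
    way to ADX are not counted as missing.
    """
    return now - start_offset, now - end_offset


# Hypothetical run time, chosen only for illustration.
now = datetime(2021, 9, 7, 9, 0, tzinfo=timezone.utc)
start, end = lookback_window(now)
print(start.isoformat())  # 2021-09-06T08:00:00+00:00
print(end.isoformat())    # 2021-09-07T08:30:00+00:00
```

With these defaults, each run compares 24.5 hours of data and skips the newest 30 minutes.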

Please read the accompanying readme.md file on GitHub to set it up.

 

We hope you find these tools useful! If you have any suggestions for improving this content or any questions, please leave us a comment.

2 Comments
Occasional Contributor

What can we do if we can confirm that the export to ADX is failing?

Microsoft

Hi @shoando 

 

If you see any discrepancies in the workbook, the issue can be on the pipeline side or on the ADX side.

For the pipeline: the pipeline itself will only start dropping data after 30 minutes of retries (but sadly, no event is generated to inform you of that)
(Log Analytics workspace data export in Azure Monitor (preview) - Azure Monitor | Microsoft Docs)
For the ADX side: you can verify that appropriate measures are taken when throttling occurs (Log Analytics workspace data export in Azure Monitor (preview) - Azure Monitor | Microsoft Docs)

If everything seems fine, it is best to contact support to investigate the case further.

 

Regards

Version history
Last update:
‎Sep 08 2021 02:43 AM