Our team demonstrates Microsoft Sentinel features, including security incidents, alerts, workbooks, and meaningful hunting queries, and helps our internal teams, partners, and customers explore and present Microsoft Sentinel capabilities.
For any SIEM solution, good demos and simulations rely on predictable, ingested data.
Our new Sentinel SampleData-as-a-service tool uses the new Azure Monitor ingestion API to ingest and manipulate raw events into Sentinel instances.
This tool provides a simple way to ingest sample data, once or on a schedule, into a built-in table or a custom table. It accepts log files (JSON or CSV) hosted on public GitHub repositories or Azure Storage accounts (with SAS key protection).
Users can also transform these logs before they are sent to the destination table.
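Under the hood, ingestion of this kind goes through Azure Monitor's new ingestion API. As a rough, minimal sketch (not the solution's actual runbook code), here is what a direct upload looks like with the azure-monitor-ingestion Python SDK; the endpoint, DCR immutable ID, and stream name are illustrative placeholders:

```python
# A minimal sketch, not the solution's runbook code: uploading sample events
# through the Azure Monitor ingestion API with the azure-monitor-ingestion SDK.
# The endpoint, DCR immutable ID, and stream name are illustrative placeholders.
from azure.identity import DefaultAzureCredential
from azure.monitor.ingestion import LogsIngestionClient

credential = DefaultAzureCredential()
client = LogsIngestionClient(
    endpoint="https://my-dce.eastus-1.ingest.monitor.azure.com",  # hypothetical DCE
    credential=credential,
)

# Sample events shaped like the destination table's schema.
sample_logs = [
    {"TimeGenerated": "2022-05-22T14:20:20Z", "Computer": "demo-host", "EventID": 1102},
]

client.upload(
    rule_id="dcr-00000000000000000000000000000000",  # DCR immutable ID (placeholder)
    stream_name="Custom-Sample_CL",                   # maps to the destination table
    logs=sample_logs,
)
```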
We can use this solution to ingest data on demand into the following tables:
- SecurityEvent
- Syslog
- WindowsEvent
- CommonSecurity
- ASimDnsActivityLogs
- Custom tables
This tool can be used to address the following business use cases:
- Detection simulation – The detection engine in Microsoft Sentinel applies KQL query logic after the raw events are ingested into the system; if it finds a match, it creates an alert and an incident. With this new tool, detection engineers can ingest security data and apply transformations to control the entities and fields that are exposed to detections. The tool can also be used to test both built-in and newly created detections.
- Demo lab with live incidents and workbooks – To demonstrate Sentinel functionality and train the SOC on investigation procedures and Sentinel features, both customers and partners need a live demo environment with continuously updated incidents and workbooks. Using this tool, we can ingest data on a schedule that triggers incidents, which customers can use to build demo scripts.
- End-to-end testing of Sentinel functionality – In addition, SIEM engineers can build monitoring around different product features by ingesting expected data into the system on a managed schedule.
We can test the following scenarios (a sketch of the ingestion-delay check follows this list):
- Delays in log ingestion
- Functionality of the analytics rule engine
- Creation of incidents
- Automation scenarios (an automation rule runs when an incident and alert are created)
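As an illustration of the first check, here is a hedged sketch that measures ingestion delay by comparing TimeGenerated with KQL's ingestion_time(), using the azure-monitor-query Python SDK; the workspace ID is a placeholder:

```python
# A hedged sketch: measure ingestion delay by comparing TimeGenerated with
# ingestion_time() in the target workspace. Workspace ID is a placeholder.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

query = """
SecurityEvent
| where TimeGenerated > ago(1h)
| extend IngestionDelay = ingestion_time() - TimeGenerated
| summarize avg(IngestionDelay), max(IngestionDelay)
"""

response = client.query_workspace(
    workspace_id="<workspace-guid>",  # hypothetical workspace ID
    query=query,
    timespan=timedelta(hours=1),
)
for table in response.tables:
    for row in table.rows:
        print(row)
```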
Solution components
- Presentation layer – an Azure Workbook, where the user points to the input file and defines the transformation.
- Ingestion engine – an Azure Automation account with three different runbooks.
- Schema management – Azure Functions are used as parser helpers to create the list of fields to replace.
- Data collection rules – as part of the solution deployment, we create four DCRs (data collection rules) for the built-in tables; if users choose to ingest data into custom tables, we create the DCRs at ingestion time and delete them afterward (a sketch of such a DCR definition follows this list).
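To make the DCR component concrete, here is a rough sketch of how such a rule for a custom table could be created through the ARM REST API. The solution's runbooks handle this automatically; all names, resource IDs, and the schema below are illustrative placeholders.

```python
# A rough sketch of creating a DCR for a custom table via the ARM REST API.
# All names, resource IDs, and the schema are illustrative placeholders.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

dcr_url = (
    "https://management.azure.com/subscriptions/<sub-id>"
    "/resourceGroups/<rg>/providers/Microsoft.Insights"
    "/dataCollectionRules/dcr-sample-custom?api-version=2022-06-01"
)

body = {
    "location": "eastus",
    "properties": {
        "dataCollectionEndpointId": "<dce-resource-id>",
        # Declares the shape of the incoming custom stream.
        "streamDeclarations": {
            "Custom-Sample_CL": {
                "columns": [
                    {"name": "TimeGenerated", "type": "datetime"},
                    {"name": "RawData", "type": "string"},
                ]
            }
        },
        "destinations": {
            "logAnalytics": [
                {"workspaceResourceId": "<workspace-resource-id>", "name": "la-dest"}
            ]
        },
        # Routes the stream to the custom table, with an optional KQL transform.
        "dataFlows": [
            {
                "streams": ["Custom-Sample_CL"],
                "destinations": ["la-dest"],
                "transformKql": "source",
                "outputStream": "Custom-Sample_CL",
            }
        ],
    },
}

resp = requests.put(dcr_url, json=body, headers={"Authorization": f"Bearer {token}"})
print(resp.status_code, resp.reason)
```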
Deploying the Solution
To deploy this solution, log in with a user that has deployment permissions, navigate to this GitHub repository, and press Deploy.
On the Azure template deployment page, review the input properties.
Once the installation is complete, open the relevant resource group and review the new resources.
During the post-deployment phase, assign permissions to the two managed identities as listed below. Note that the permission assignments may change if the solution is deployed in a different resource group than the target Sentinel workspace.
These are the permissions needed after deployment (a scripted example of one assignment follows the table):

| Identity Type | Permission | Scope |
| --- | --- | --- |
| Automation account managed identity | Automation Contributor | Workspace resource group |
| Automation account managed identity | Log Analytics Contributor | Workspace resource group |
| Automation account managed identity | Monitoring Contributor | Workspace resource group |
| Automation account managed identity | Monitoring Metrics Publisher | Solution resource group |
| Azure Function managed identity | Reader | Workspace resource group |
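These assignments can be made in the portal; for repeatable setups, here is a hedged sketch of scripting one of them with the azure-mgmt-authorization Python SDK. The subscription, resource group, and principal IDs are placeholders; the GUID shown is the built-in Log Analytics Contributor role definition ID.

```python
# A hedged sketch of assigning one of the required roles programmatically.
# Subscription, resource group, and principal IDs are placeholders.
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

sub_id = "<subscription-id>"
client = AuthorizationManagementClient(DefaultAzureCredential(), sub_id)

scope = f"/subscriptions/{sub_id}/resourceGroups/<workspace-rg>"
role_definition_id = (
    f"{scope}/providers/Microsoft.Authorization/roleDefinitions/"
    "92aaf0da-9dab-42b6-94a3-d43ce8d16293"  # built-in Log Analytics Contributor
)

client.role_assignments.create(
    scope=scope,
    role_assignment_name=str(uuid.uuid4()),  # assignment names are GUIDs
    parameters=RoleAssignmentCreateParameters(
        role_definition_id=role_definition_id,
        principal_id="<automation-account-managed-identity-object-id>",
        principal_type="ServicePrincipal",  # managed identities are service principals
    ),
)
```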
We are ready to ingest some sample data!!
How to use the tool:
When we open the workbook and approve the trusted-zone notification, we see the input properties discussed in the sections below.
filePath:
The filePath property expects files from a public GitHub repository or a storage account (optionally protected with a SAS key).
An example location on GitHub is https://raw.githubusercontent.com/Yaniv-Shasha/Sentinel/master/Sample_Data/scenarios/Security Event log cleared/1102_clearlogs.json
A storage account file input looks similar: the blob URL, with its SAS token appended when the container is protected.
FileFormat:
Define CSV or JSON, depending on the input file.
TargetTable
Select the destination table.
Please note that the input file schema must align with the destination table schema for ingestion to succeed (a quick schema-check sketch follows the list and reference below):
- SecurityEvent
- Syslog
- WindowsEvent
- CommonSecurity
- ASimDnsActivityLogs
A reference for the table schemas can be found here: Azure Monitor table reference index by category | Microsoft Docs
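One quick way to verify that an input file matches the destination table is to list the table's columns with KQL's getschema operator. A small sketch using the azure-monitor-query Python SDK (the workspace ID is a placeholder):

```python
# A small sketch: list a destination table's columns and types via getschema,
# so the input file's schema can be checked before ingestion.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())
response = client.query_workspace(
    workspace_id="<workspace-guid>",  # hypothetical workspace ID
    query="SecurityEvent | getschema | project ColumnName, ColumnType",
    timespan=timedelta(days=1),
)
columns = {row[0]: row[1] for row in response.tables[0].rows}
print(columns)  # e.g. {'TimeGenerated': 'datetime', 'Computer': 'string', ...}
```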
For custom tables, the target table schema can be defined in two ways:
- existedSchemaLink field is not set – the table schema is created directly from the input file. The downside is that all the fields are created as string and datetime types (see the sketch after this list).
- existedSchemaLink is defined – the user can point the tool to a repository with schema files (like this example: Azure-Sentinel/.script/tests/KqlvalidationsTests/CustomTables at master · Azure/Azure-Sentinel (github.com)), and the tool creates the DCR and the target table directly from the schema file.
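To illustrate the first option, here is a minimal sketch of deriving a stream declaration from the input file itself, with every field defaulting to string alongside a datetime TimeGenerated; the file name reuses the sample from the GitHub example above:

```python
# A minimal sketch of the "no existedSchemaLink" path: derive the column list
# from the input file, defaulting every field to string (plus a datetime
# TimeGenerated), assuming the file contains a JSON array of event objects.
import json

with open("1102_clearlogs.json") as f:  # sample input file from the example above
    events = json.load(f)

columns = [{"name": "TimeGenerated", "type": "datetime"}] + [
    {"name": field, "type": "string"}
    for field in events[0]
    if field != "TimeGenerated"
]
print(columns)
```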
startDate
The startDate (aka TimeGenerated) is an important field, and in this section we cover the different use cases around it.
- startDate is empty – the solution uses the current datetime.
- The user defines a specific date in the startDate field – the solution only accepts the ISO 8601 format (yyyy-MM-ddTHH:mm:ss), e.g., 2022-05-22T14:20:20.
** If the user specifies scheduled ingestion, the solution disregards the TimeGenerated field: it pushes the data at the nearest time and then follows the schedule (a small validation sketch follows).
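A minimal sketch of that startDate behavior, assuming a simple Python helper (not the solution's actual code):

```python
# A minimal sketch of the startDate rules described above: an empty value
# falls back to the current datetime; otherwise only ISO 8601 timestamps
# such as 2022-05-22T14:20:20 are accepted.
from datetime import datetime

def resolve_start_date(start_date: str) -> datetime:
    if not start_date:                         # empty -> current datetime
        return datetime.utcnow()
    return datetime.fromisoformat(start_date)  # raises ValueError if not ISO 8601

print(resolve_start_date("2022-05-22T14:20:20"))
print(resolve_start_date(""))
```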
Select fields to replace
Users can overwrite data from the sample file using this option.
By using Azure Monitor's brand-new API, the solution lets users modify values before ingestion occurs.
The user must choose a column in the replace list for ingestion to begin; to leave a column unchanged, select it but do not modify its value.
For example, select the Account column and change the replacement value to a brand-new account name.
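Under the hood, such a replacement amounts to an ingestion-time transformation, i.e., a KQL statement in the DCR's transformKql property. A hedged sketch of what replacing the Account column could look like (the stream and destination names are illustrative, and the DCR itself is created by the solution):

```python
# A hedged sketch of a field replacement as an ingestion-time transformation:
# the KQL in transformKql rewrites Account before the data lands in the table.
transform_kql = "source | extend Account = 'demo-account@contoso.com'"

data_flow = {
    "streams": ["Custom-Sample_CL"],  # illustrative stream name
    "destinations": ["la-dest"],      # illustrative destination name
    "transformKql": transform_kql,    # rewrites the Account column on ingestion
    "outputStream": "Custom-Sample_CL",
}
```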
Ingestion scheduling:
Users can define one-time or recurring ingestion:
- One-time ingestion – the user presses the Ingest button.
- Scheduling – the user selects the recurring time range and presses the Schedule button.
Now it's time to simulate sample data and create great demos in Microsoft Sentinel!