Overview:
This Sentinel notebook helps identify anomalous processes that have made successful network connections in your environment. It uses the IsolationForest algorithm from scikit-learn to identify these anomalies. The notebook authenticates to your Sentinel instance using msticpyconfig.yaml and environment variables, so make sure both are configured before running it. After the anomalies are identified, further threat hunting steps help determine whether they could be malicious.
Microsoft Azure GitHub: https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/machine-learning-notebooks/Guided%20Hunting%20-%20Anomalous%20Process%20Network%20Connections.ipynb
Prerequisites:
- Python Version: Python 3.8 (including Python 3.8 - AzureML)
- Required Packages: msticpy, pandas, numpy, matplotlib, plotly, ipywidgets, ipython, scikit-learn (imported as sklearn)
- Data Sources Required: Log Analytics – DeviceNetworkEvents
- msticpyconfig.yaml has been properly configured
- A registered application has been created with API permissions for the Log Analytics API
- A key vault has been set up with a secret for the registered application
How to Run the Notebook:
Setup Environment Variables
Once all the prerequisites are in place, the Guided Hunting – Anomalous Process Network Connections notebook can be run and used for threat hunting. First, fill out the following environment variables in the “Setup Environment Variables” section:
- AZURE_TENANT_ID
- AZURE_CLIENT_ID
- key_vault_name
- key_vault_url
- secret_name
Verify Environment Variables
Select “Run Cell” on the left-hand side. If the cell runs correctly, you will see “Environment Variables have been set successfully” at the bottom of the page.
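As a rough sketch of what this step involves, the snippet below sets the five variables with os.environ and reports any that are missing or empty. The placeholder values and the helper name missing_env_vars are illustrative assumptions, not the notebook's actual code.

```python
import os

# Variable names come from the "Setup Environment Variables" list above.
REQUIRED_VARS = [
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",
    "key_vault_name",
    "key_vault_url",
    "secret_name",
]

# Placeholder values for illustration only; substitute your own.
os.environ["AZURE_TENANT_ID"] = "<your-tenant-id>"
os.environ["AZURE_CLIENT_ID"] = "<your-client-id>"
os.environ["key_vault_name"] = "<your-key-vault-name>"
os.environ["key_vault_url"] = "<your-key-vault-url>"
os.environ["secret_name"] = "<your-secret-name>"


def missing_env_vars(names=REQUIRED_VARS):
    """Return the names that are unset or empty (hypothetical helper)."""
    return [name for name in names if not os.environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Environment Variables have been set successfully")
```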
Setup msticpyconfig.yaml
The next code block verifies that your msticpyconfig.yaml is set up correctly. Run the cell and the verification results for the YAML file will appear below the code block. Make sure msticpyconfig.yaml is in the same directory as this notebook. If it is in a different directory, add the absolute path to msticpyconfig.yaml in the “mp_conf” variable.
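The path handling described here could look something like the sketch below. It resolves the “mp_conf” value to an absolute path, fails early if the file is missing, and exports the path through the MSTICPYCONFIG environment variable, which msticpy reads to locate its configuration. The function name resolve_msticpy_config is a hypothetical helper, and the notebook's own cell may do this differently.

```python
import os
from pathlib import Path


def resolve_msticpy_config(mp_conf="msticpyconfig.yaml"):
    """Return the absolute config path and export it via MSTICPYCONFIG."""
    path = Path(mp_conf).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"msticpyconfig.yaml not found at {path}")
    # msticpy reads this environment variable to find its configuration file.
    os.environ["MSTICPYCONFIG"] = str(path)
    return path
```

With the file next to the notebook, calling resolve_msticpy_config() with no arguments is enough; otherwise pass the absolute path as mp_conf.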
Setup QueryProvider
This code block sets up the variable “qry_prov” as the QueryProvider for your Sentinel instance. There is nothing to add here; just run the code block.
Connect to Sentinel
This code block connects to Sentinel so that data can be queried. Run the code block and you should see “connected” in the output.
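For orientation, the two steps above might be sketched as follows. This assumes msticpy 2.x, where the Sentinel data environment is named "MSSentinel" (older releases used "AzureSentinel"), and it relies on the default workspace in msticpyconfig.yaml. The notebook itself may authenticate differently, for example with the registered application's secret from the key vault. The wrapper name connect_to_sentinel is hypothetical.

```python
def connect_to_sentinel():
    """Create a QueryProvider and connect it to the default Sentinel workspace.

    Hypothetical wrapper; msticpy is imported lazily so the function can be
    defined even where the package is not yet installed.
    """
    from msticpy.common.wsconfig import WorkspaceConfig
    from msticpy.data import QueryProvider

    # Assumption: msticpy 2.x driver name for Microsoft Sentinel.
    qry_prov = QueryProvider("MSSentinel")
    # WorkspaceConfig() reads the default workspace from msticpyconfig.yaml.
    qry_prov.connect(WorkspaceConfig())
    return qry_prov
```

After a successful call, qry_prov.exec_query("DeviceNetworkEvents | take 5") should return a small pandas dataframe, which is a quick way to confirm the connection works.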
Run Anomaly Detection Script – Anomalous Processes
This code block analyzes process data to find anomalous processes in your environment. Run the code block first; 4 text widgets and 3 buttons will then appear below it. The “Query” text widget is pre-populated with the KQL needed to run your search. You can customize the query, but keep its formatting the same. The “Field” text widget names the field that the IsolationForest algorithm runs against. It is pre-populated with “InitiatingProcessFileName”, which does not need to be changed. Then select the “Analyze” button.
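To give a feel for what running IsolationForest against a single text field can involve, here is a minimal sketch. It assumes the field is frequency-encoded (each row gets the count of its process name) before fitting; the notebook's actual feature preparation is not described here and may differ. The function name flag_anomalies, the FieldCount column, the contamination value, and the sample data are all illustrative.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest


def flag_anomalies(df, field="InitiatingProcessFileName", contamination=0.05, seed=0):
    """Add 'Anomaly' (-1 or 1) and 'Anomaly Score' columns to a copy of df."""
    result = df.copy()
    # Assumption: frequency-encode the field so rarely seen values stand out.
    counts = result[field].value_counts()
    result["FieldCount"] = result[field].map(counts)
    features = result[["FieldCount"]]

    model = IsolationForest(contamination=contamination, random_state=seed)
    # fit_predict returns -1 for anomalies and 1 for normal rows.
    result["Anomaly"] = model.fit_predict(features)
    # decision_function: lower values are more anomalous.
    result["Anomaly Score"] = model.decision_function(features)
    return result


# Made-up sample: two common processes and one rare one.
events = pd.DataFrame(
    {
        "InitiatingProcessFileName": ["chrome.exe"] * 60
        + ["svchost.exe"] * 35
        + ["process1.exe"] * 2
    }
)
scored = flag_anomalies(events)
print(scored[scored["Anomaly"] == -1])
```

The contamination parameter sets the share of rows the model is expected to flag, so it is worth tuning against the volume of data the query returns.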
Once you press the “Analyze” button, a dataframe with your results populates below. An “Anomaly” column is added on the far right-hand side of the dataframe. A value of “-1” in this column means the algorithm deemed the row an anomaly. To make the dataframe easier to search, text widgets are built in. In the “Column” text widget, enter “Anomaly”, and in the “Value” text widget, enter “-1”. Both entries are case sensitive. Then select the “Search” button to populate a dataframe below with all of your anomalies.
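The search step amounts to an exact, case-sensitive match on one column. A minimal pandas equivalent is sketched below; search_dataframe is a hypothetical name and the sample rows are made up, so treat it as an illustration of the idea rather than the widget's real implementation.

```python
import pandas as pd


def search_dataframe(df, column, value):
    """Return rows where column equals value, compared as case-sensitive text."""
    if column not in df.columns:
        raise KeyError(f"Column not found (names are case sensitive): {column}")
    return df[df[column].astype(str) == str(value)]


# Made-up results resembling the output of the analysis step.
results = pd.DataFrame(
    {
        "InitiatingProcessFileName": ["chrome.exe", "process1.exe", "svchost.exe"],
        "Anomaly": [1, -1, 1],
    }
)
anomalies = search_dataframe(results, "Anomaly", "-1")
print(anomalies)
```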
You can also graph your most significant anomalies. Selecting “Graph Results” graphs them based on the “Anomaly Score”. Hover over a result to view more information about the occurrence.
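Picking the most significant anomalies can be thought of as sorting the flagged rows by score. The sketch below assumes the score follows scikit-learn's decision_function convention, where lower values are more anomalous; if the notebook's “Anomaly Score” is oriented the other way, the sort would need to be reversed. The helper name top_anomalies and the sample values are illustrative.

```python
import pandas as pd


def top_anomalies(df, n=10, score_col="Anomaly Score"):
    """Return the n flagged rows with the lowest (most anomalous) scores."""
    flagged = df[df["Anomaly"] == -1]
    return flagged.nsmallest(n, score_col)


# Made-up scored results for illustration.
scored = pd.DataFrame(
    {
        "InitiatingProcessFileName": ["a.exe", "b.exe", "c.exe", "d.exe"],
        "Anomaly": [-1, -1, 1, -1],
        "Anomaly Score": [-0.20, -0.05, 0.10, -0.12],
    }
)
top_two = top_anomalies(scored, n=2)
print(top_two)
```

The resulting frame is a natural input for a bar or scatter chart in a plotting library such as plotly.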
What to do with this Information
Once you have found some anomalous processes in your network, it is time to investigate them further. Follow the next steps here to do that. I would recommend choosing 2-3 of the most anomalous processes from the graph above to start your investigation.
Verify Parent Process
It is common to see a malicious process spawn from a normal process. Check the anomalous processes that were identified to see whether anything is unusual about their parent processes. Replace process1.exe, process2.exe, and process3.exe with the names of the anomalous processes.
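One way to avoid editing the process names by hand is to build the KQL from a Python list. The sketch below is an assumption about what such a query could look like, not the notebook's own KQL. It uses the InitiatingProcessFileName and InitiatingProcessParentFileName columns of DeviceNetworkEvents, and build_parent_process_query is a hypothetical helper.

```python
def build_parent_process_query(process_names):
    """Build KQL that counts connections per parent/child process pair."""
    # Quote each name for a case-insensitive KQL in~ list.
    quoted = ", ".join(f'"{name}"' for name in process_names)
    return (
        "DeviceNetworkEvents\n"
        f"| where InitiatingProcessFileName in~ ({quoted})\n"
        "| summarize Connections = count() "
        "by InitiatingProcessParentFileName, InitiatingProcessFileName\n"
        "| order by Connections desc"
    )


query = build_parent_process_query(["process1.exe", "process2.exe", "process3.exe"])
print(query)
```

The printed string can be pasted into Log Analytics or passed to the notebook's query provider.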
Check if Process Spawned out of Temp File Path
Attackers commonly use a TEMP folder to spawn malicious processes. Ensure the anomalous process did not spawn out of this directory. Replace process1.exe, process2.exe, and process3.exe with the names of the anomalous processes.
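If the query results are already in Python, the same check can be approximated locally. The sketch below tests whether any folder in a Windows path is named temp or tmp, matching whole folder names so that a folder like Templates is not flagged. The helper is_temp_path and the set of folder names are assumptions; the input would typically be a value such as InitiatingProcessFolderPath.

```python
from pathlib import PureWindowsPath

# Assumption: folder names treated as temporary locations.
TEMP_FOLDER_NAMES = {"temp", "tmp"}


def is_temp_path(folder_path):
    """Return True if any component of the Windows path is a temp folder."""
    parts = PureWindowsPath(folder_path).parts
    return any(part.lower() in TEMP_FOLDER_NAMES for part in parts)


print(is_temp_path(r"C:\Users\bob\AppData\Local\Temp\process1.exe"))
print(is_temp_path(r"C:\Program Files\Templates\app.exe"))
```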
Check if cmd.exe or PowerShell was used
Actors will sometimes use remote code execution with cmd.exe or PowerShell in coordination with other processes. The following KQL will verify this. Replace process1.exe, process2.exe, and process3.exe with the names of the anomalous processes.
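For results already loaded into Python, a comparable check can be sketched with a regular expression. The function below looks for cmd or powershell, with or without the .exe suffix, in a parent process name or command line. The helper involves_shell is hypothetical, and mapping its arguments to InitiatingProcessParentFileName and InitiatingProcessCommandLine is an assumption about which columns are most useful.

```python
import re

# Matches cmd, cmd.exe, powershell, or powershell.exe as whole words.
SHELL_PATTERN = re.compile(r"\b(cmd|powershell)(\.exe)?\b", re.IGNORECASE)


def involves_shell(parent_file_name, command_line):
    """Return True if either value mentions cmd.exe or PowerShell."""
    return any(
        SHELL_PATTERN.search(text or "") is not None
        for text in (parent_file_name, command_line)
    )


print(involves_shell("explorer.exe", "powershell -NoProfile -File script.ps1"))
print(involves_shell("explorer.exe", "process1.exe --run"))
```

A match is only a lead: many legitimate tools launch cmd.exe or PowerShell, so the surrounding command line still needs a manual review.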
Conclusion
This notebook is designed to help you find anomalous processes in your network. These processes are not always malicious but could be worth investigating. The threat hunting steps above do not confirm whether something is malicious, but they cover techniques commonly used by threat actors and can aid your investigation.