Automating IOC hunts in Microsoft Sentinel data lake

Microsoft

Nov 05, 2025

Security operations are undergoing significant transformation driven by the introduction of AI and a rapidly evolving threat landscape. With Microsoft Sentinel data lake now generally available, organizations can centralize all their security data in a purpose-built security data lake. This helps optimize costs, simplify data management, and accelerate the adoption of AI in security operations, so defenders can move beyond legacy security controls and adopt advanced analytics and automation for more dynamic and effective protection.

A key advantage of the Sentinel data lake is its cost-efficiency, making it ideal for ingesting and retaining large volumes of security logs, such as network logs, without incurring high expenses or compromising coverage. By storing all security data in a unified, cost-effective data lake, organizations gain comprehensive, long-term visibility for historical threat hunting and TI matching, enabling investigations across extended timelines without the prohibitive costs of traditional analytics solutions. In this blog, we explore how security teams can use KQL jobs in Sentinel data lake to automate threat hunting and threat intelligence matching across network logs, enabling scalable, cost-effective, and continuous threat detection. By doing so, SOCs can process large volumes of data and transform raw logs into actionable insights with minimal manual intervention.

What are KQL jobs?

KQL jobs in Sentinel data lake are automated one-time or scheduled jobs that run Kusto Query Language (KQL) queries on data in the data lake. These jobs help security teams investigate and hunt for threats more easily by automating processes like checking logs against known threat data. By automating tasks such as IOC matching with historical or high-volume data, analysts are able to concentrate on higher-value activities. This results in more effective threat detection and response. The next section demonstrates how to use the data lake for Threat Intelligence (TI) matching across network logs.

IOC matching on network logs in the data lake

Network logs, such as firewall and proxy data, are essential for uncovering advanced threats and supporting investigations. However, storing all this data in the analytics tier is often expensive, leading to reduced retention and potential blind spots. With Sentinel data lake, SOCs can store all their raw telemetry at a fraction of the cost, making it possible to hunt for threats across a much broader timeline without financial constraints. However, simply storing data isn’t enough. To turn raw logs into actionable insights, SOC teams need to automate both summarization and threat intelligence (TI) matching. Scheduled KQL jobs make this possible by scanning new data on a schedule as it arrives in the data lake, surfacing suspicious activity for analyst review.

Schedule KQL job for TI matching on network logs

Here’s a practical example of how a SOC can use a scheduled KQL job to summarize network activity and correlate it with threat intelligence indicators. In this scenario, a KQL job identifies network log entries from Palo Alto firewalls that match known malicious IPs from the ThreatIntelIndicators table.

The output provides the complete network log row, enriched with relevant threat intelligence fields for further investigation and response.

Create your query:

let IPRegex = '[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}';
  let dt_lookBack = 75m; // Look back 75 minutes for CommonSecurityLog events (hourly schedule plus a 15-minute overlap)
  let ioc_lookBack = 14d; // Look back 14 days for threat intelligence indicators
  // Fetch threat intelligence indicators related to IP addresses
  let IP_Indicators = ThreatIntelIndicators
  //extract key part of kv pair
       | extend IndicatorType = replace(@"\[|\]|\""", "", tostring(split(ObservableKey, ":", 0)))
       | where IndicatorType in ("ipv4-addr", "ipv6-addr", "network-traffic")
       | extend NetworkSourceIP = toupper(ObservableValue)
       | extend TrafficLightProtocolLevel = tostring(parse_json(AdditionalFields).TLPLevel)
    | where TimeGenerated >= ago(ioc_lookBack)
    // All IP observables are normalized into NetworkSourceIP above, so use it directly as the match entity
    | extend TI_ipEntity = NetworkSourceIP
    | where ipv4_is_private(TI_ipEntity) == false and  TI_ipEntity !startswith "fe80" and TI_ipEntity !startswith "::" and TI_ipEntity !startswith "127."
    | summarize LatestIndicatorTime = arg_max(TimeGenerated, *) by Id, ObservableValue
    | where IsActive and (ValidUntil > now() or isempty(ValidUntil));
  // Perform a join between IP indicators and CommonSecurityLog events
  IP_Indicators
     | project-reorder *, Tags, TrafficLightProtocolLevel, NetworkSourceIP, TI_ipEntity
    // Use innerunique to keep performance fast and result set low, as we only need one match to indicate potential malicious activity that needs investigation
    | join kind=innerunique (
        CommonSecurityLog
        | where TimeGenerated >= ago(dt_lookBack)
        | extend MessageIP = extract(IPRegex, 0, Message)
        | extend CS_ipEntity = iff((not(ipv4_is_private(SourceIP)) and isnotempty(SourceIP)), SourceIP, DestinationIP)
        | extend CS_ipEntity = iff(isempty(CS_ipEntity) and isnotempty(MessageIP), MessageIP, CS_ipEntity)
        | extend CommonSecurityLog_TimeGenerated = TimeGenerated
    )
    on $left.TI_ipEntity == $right.CS_ipEntity
    // Filter out logs that occurred after the expiration of the corresponding indicator; indicators without an expiration are kept
    | where isempty(ValidUntil) or CommonSecurityLog_TimeGenerated < ValidUntil
    // Group the results by indicator Id and CS_ipEntity, and keep the log entry with the latest timestamp
    | summarize CommonSecurityLog_TimeGenerated = arg_max(CommonSecurityLog_TimeGenerated, *) by Id, CS_ipEntity
    // Select the desired output fields
    | project timestamp = CommonSecurityLog_TimeGenerated, SourceIP, DestinationIP, MessageIP, Message, DeviceVendor, DeviceProduct, Id, ValidUntil, Confidence, TI_ipEntity, CS_ipEntity, LogSeverity, DeviceAction

Source: Microsoft Sentinel GitHub repo
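To make the matching logic of the query above easier to follow, here is a minimal Python sketch of the same idea on in-memory rows. It is not how the data lake executes the job. The helper names (`is_private_v4`, `log_entity`, `match_iocs`) and the sample rows are hypothetical, the private-range check only approximates `ipv4_is_private` using the RFC 1918 ranges, and the indicator filters (active, not expired) are left out for brevity.

```python
import ipaddress
import re

# Same loose IPv4 pattern as IPRegex in the KQL query
IP_REGEX = re.compile(r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}")

# Assumption: treat only the RFC 1918 ranges as private
RFC1918 = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_private_v4(value):
    """Return True only for a valid IPv4 address inside an RFC 1918 range."""
    try:
        addr = ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return any(addr in net for net in RFC1918)


def log_entity(row):
    """Pick the IP to match on: public SourceIP, else DestinationIP, else an IP found in Message."""
    src = row.get("SourceIP", "")
    if src and not is_private_v4(src):
        return src
    dst = row.get("DestinationIP", "")
    if dst:
        return dst
    found = IP_REGEX.search(row.get("Message", ""))
    return found.group(0) if found else ""


def match_iocs(indicators, logs):
    """Return log rows whose entity IP equals an indicator value, enriched with indicator fields."""
    by_ip = {ioc["ObservableValue"]: ioc for ioc in indicators}
    hits = []
    for row in logs:
        ioc = by_ip.get(log_entity(row))
        if ioc:
            hits.append({**row, "IndicatorId": ioc["Id"], "Confidence": ioc["Confidence"]})
    return hits


# Hypothetical sample data using documentation IP ranges
indicators = [{"Id": "ioc-1", "ObservableValue": "203.0.113.10", "Confidence": 90}]
logs = [
    {"SourceIP": "10.0.0.5", "DestinationIP": "203.0.113.10", "Message": ""},
    {"SourceIP": "198.51.100.7", "DestinationIP": "10.0.0.9", "Message": ""},
    {"SourceIP": "", "DestinationIP": "", "Message": "blocked 203.0.113.10 on port 443"},
]
hits = match_iocs(indicators, logs)
print(len(hits))  # 2
```

One consequence of this entity selection, which mirrors the query: when SourceIP is public, DestinationIP is not compared against the indicators for that row.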

Before submitting a KQL job, you may want to test your query interactively using the KQL queries page.

Create a KQL job:

To match against new logs periodically, we would like to schedule this job to run every hour to summarize network logs and match them against the latest IOCs in ThreatIntelIndicators.

To avoid missing any logs, I suggest adding an overlap between the lookback window and the schedule interval, so that every log is scanned at least once. For example, you can set a lookback of 75 minutes and run the job every 60 minutes.
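The effect of the overlap can be checked with a small Python sketch. It only models the time windows each run would scan; the function names and the start time are hypothetical, and it assumes runs start exactly on schedule.

```python
from datetime import datetime, timedelta


def run_windows(first_run, interval, lookback, runs):
    """Return the (start, end) time window scanned by each scheduled run."""
    return [
        (first_run + i * interval - lookback, first_run + i * interval)
        for i in range(runs)
    ]


def has_gap(windows):
    """True if any window starts after the previous window ended."""
    return any(
        cur_start > prev_end
        for (_, prev_end), (cur_start, _) in zip(windows, windows[1:])
    )


first_run = datetime(2025, 11, 5, 9, 0)
hourly = timedelta(minutes=60)

# 75-minute lookback on an hourly schedule: consecutive windows overlap by 15 minutes
overlapping = run_windows(first_run, hourly, timedelta(minutes=75), runs=4)
# 45-minute lookback on an hourly schedule: 15 minutes are never scanned
too_short = run_windows(first_run, hourly, timedelta(minutes=45), runs=4)

print(has_gap(overlapping))  # False
print(has_gap(too_short))    # True
```

Because the overlapping 15 minutes are scanned by two consecutive runs, the same match can appear in both results, so downstream rules may need to tolerate or deduplicate repeats.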

KQL jobs can run ad-hoc or be scheduled based on your preferred frequency (by minutes, hourly, daily, weekly or monthly), automatically summarizing new network activity and highlighting matches with known malicious indicators. Analysts can then focus on the most relevant events, accelerating investigations and reducing noise.

Results are automatically available in the analytics tier and can be used to set up an automated detection using Analytics rules.

The cost of running KQL jobs in Sentinel data lake depends on the volume of data scanned and how frequently the jobs run. Data lake KQL queries and jobs are priced at $0.005 per GB scanned. For example, if a KQL job scans 1 TB of data daily, the monthly cost would be around $150 (1,000 GB × $0.005 × 30 days). This pricing model allows organizations to perform large-scale threat hunting and intelligence matching without the high expenses typically associated with traditional SIEMs.
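The arithmetic behind that estimate can be sketched in Python. The per-GB rate is the one quoted in this post; everything else is an assumption made for illustration: decimal terabytes (1 TB = 1,000 GB), a 30-day month, a uniform ingestion rate, and scan volume proportional to the lookback window (the scan of the indicator table is ignored). The function names are hypothetical.

```python
PRICE_PER_GB_SCANNED = 0.005  # USD per GB scanned, the rate quoted in this post
GB_PER_TB = 1000              # assumption: decimal terabytes


def monthly_scan_cost(gb_scanned_per_day, days=30, price_per_gb=PRICE_PER_GB_SCANNED):
    """Estimated monthly cost in USD for a given daily scan volume."""
    return gb_scanned_per_day * days * price_per_gb


def daily_gb_scanned(daily_ingest_gb, lookback_minutes, interval_minutes):
    """Daily scan volume when every run scans `lookback_minutes` of uniformly ingested data."""
    runs_per_day = 24 * 60 / interval_minutes
    gb_per_minute = daily_ingest_gb / (24 * 60)
    return runs_per_day * lookback_minutes * gb_per_minute


# One job scanning 1 TB per day
once_daily = monthly_scan_cost(1 * GB_PER_TB)
# Hourly job with a 75-minute lookback over the same 1 TB/day of logs
hourly_with_overlap = monthly_scan_cost(daily_gb_scanned(1 * GB_PER_TB, 75, 60))

print(f"{once_daily:.2f}")           # 150.00
print(f"{hourly_with_overlap:.2f}")  # 187.50
```

Under these assumptions, the 15-minute overlap adds 25% to the scanned volume, which is worth factoring in when choosing the lookback.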

For more details around Microsoft Sentinel data lake costs for KQL queries and jobs, see https://azure.microsoft.com/en-us/pricing/calculator.

Summary and next steps

Threat hunting at scale within Sentinel data lake is simplified with KQL jobs. SOC teams can use this method for various hunting or anomaly detection scenarios, such as aggregating network logs and correlating them with threat intelligence. This enhances visibility, agility, and assurance, and transforms raw telemetry into actionable security insights.

KQL jobs provide several benefits:

Continuous threat coverage: Scheduled KQL jobs automatically correlate high-volume logs located directly in the data lake with up-to-date threat intelligence. This process helps minimize detection gaps and blind spots.
Efficient use of resources: Automating TI matching saves analysts from repetitive queries, allowing them to focus on investigating validated alerts rather than sifting through raw logs.
Faster response times: Suspicious connections flagged within minutes or every hour enable quicker triage and containment before threats escalate.
Historical context: Matches are retained against long-term or high-volume logs, enabling analysts to trace back patterns of malicious activity and support deeper investigations.

Get started with Microsoft Sentinel data lake today.

What's next?

Join us at Microsoft Ignite in San Francisco on November 17–21, or online, November 18–20, for deep dives and practical labs to help you maximize your Microsoft Defender investments and to get more from the Microsoft capabilities you already use. Security is a core focus at Ignite this year, with the Security Forum on November 17th, deep dive technical sessions, theater talks, and hands-on labs designed for security leaders and practitioners.

Featured sessions

BRK237 – Identity Under Siege: Modern ITDR from Microsoft
Join experts in Identity and Security to hear how Microsoft is streamlining collaboration across teams and helping customers better protect, detect, and respond to threats targeting your identity fabric.
BRK240 – Endpoint security in the AI era: What's new in Defender
Discover how Microsoft Defender’s AI-powered endpoint security empowers you to do more, better, faster.
BRK236 – Your SOC’s ally against cyber threats, Microsoft Defender Experts
See how Defender Experts detect, halt, and manage threats for you, with real-world outcomes and demos.
LAB541 – Defend against threats with Microsoft Defender
Get hands-on with Defender for Office 365 and Defender for Endpoint, from onboarding devices to advanced attack mitigation.

Explore and filter the full security catalog by topic, format, and role: aka.ms/SessionCatalogSecurity.

Why attend?
Ignite is the place to learn about the latest Defender capabilities, including new agentic AI integrations and unified threat protection. We will also share future-facing innovations in Defender, as part of our ongoing commitment to autonomous defense.

Security Forum—Make day 0 count (November 17)
Kick off with an immersive, in-person pre-day focused on strategic security discussions and real-world guidance from Microsoft leaders and industry experts. Select Security Forum during registration.

Updated Nov 05, 2025

Version 2.0

microsoft sentinel

Zeinab Mokhtarian Koorabbasloo
