SOC analysts face complex challenges during investigations. They often need to investigate incidents that span weeks or even months, not just hours or days, which requires correlating multiple high-volume data sources such as sign-ins, network traffic, endpoint activity, and logs. Analysts frequently need to revisit query results throughout an active investigation and must decide whether findings are temporary insights or should evolve into long-term detections. Hunting and anomaly detection, in particular, demand access to months of historical data to uncover patterns and validate threats.
With Microsoft Sentinel data lake, you can run KQL directly on data in the lake, unlocking flexibility for investigations and hunting. The key is choosing the right execution model for your scenario. Quick, interactive queries are great for initial exploration but may time out on complex queries or large datasets. Deep investigations require a more robust approach, while operationalized security outcomes call for scheduled jobs.
“How should I run my KQL queries on the data lake to balance efficiency and speed?”
If you’ve ever run a long query, waited… waited some more… and then hit a timeout, you’ve already met the limits of interactive KQL queries. The good news is that Sentinel data lake gives you multiple ways to query large volumes of security data, each designed for a different purpose.
In this post, we’ll walk through:
- When to use interactive KQL queries
- When to run Async KQL queries
- When to consider KQL jobs
We'll explain how these options work in real SOC workflows and guide you in selecting the right KQL execution mode, so you can avoid unnecessary frustration and re-runs.
Understanding the three KQL options for Sentinel data lake
When working with Sentinel data lake, you’ll typically choose between three query execution modes:
- Interactive KQL queries
- Async KQL queries
- KQL jobs
They all use Kusto Query Language (KQL), but they differ in how long they run, how results are stored, and who they’re best for.
Let’s break them down.
1. Interactive KQL queries: Ad hoc exploration, fast access, and temporary results
Analysts often begin with interactive queries because they are simple, fast, and ideal for exploring data.
When to use interactive queries
Use interactive queries when:
- You’re running queries for ad-hoc investigations
- You want to validate a query before putting it in a KQL job for scheduling
- The dataset is small to moderate
- You only need to view results once
- You want immediate feedback
- The query completes quickly (ideally within 2–3 minutes, so an analyst can wait for the results interactively)
Common use cases
- Checking recent network logs or sign-in failures, e.g., over the last 24 hours (see the sample query after this list)
- Exploring a suspicious IP over a short time window, e.g., the last 24 hours
- Verifying a hypothesis during triage
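For instance, a quick triage query over recent failed sign-ins might look like the following (a minimal sketch using the standard SigninLogs schema; the time window is illustrative):

SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType != "0"          // non-zero ResultType indicates a failed sign-in
| summarize FailedAttempts = count() by UserPrincipalName, IPAddress
| top 20 by FailedAttempts desc

This is the kind of query an analyst can afford to wait on interactively: a short lookback, a small result set, and immediate feedback.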
What to expect
- Designed for quick execution
- Best for short lookback periods
- Queries may time out after roughly 7–8 minutes
- Results do not persist beyond the session
Interactive queries are ideal for exploratory analysis, but not for heavy lifting across large datasets or long lookbacks of data in the lake.
2. Async queries: Long-running investigations with a reduced risk of timeouts
Async queries are a new data lake capability, and this is where things get interesting, especially for incident investigations involving larger datasets.
Instead of waiting for results in real time, Async queries run in the background, and you can check their progress in the Async queries tab. Results are stored in the data lake hot cache for quick retrieval on demand for up to 24 hours after execution.
When to use Async queries
Async queries are a great fit when:
- You’re querying larger datasets, or you need a longer lookback window
- You suspect your query would time out interactively (Async KQL queries can run for up to one hour)
- A small group of analysts needs fast access to the same results
- You don’t want to hydrate data into the Analytics tier or create a custom table for it
- You don't need to use the results in your detection rules
Common use cases
- Exploring a suspicious IP over a longer time window
- Pulling data from the lake for an incident investigation that multiple analysts need to access for a short period
Key benefits
- Queries can run for up to one hour
- Results are stored in data lake hot cache
- Results remain available for up to 24 hours
- Multiple users can fetch results
- Fast re-access without re-running the query on cold storage
- No need to permanently move data into Analytics tier
This makes Async queries especially useful during active incidents, where one or two SOC analysts may need to revisit results multiple times while pivoting their investigation.
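As an illustration, an Async-friendly hunt might look back 60 days at where an account has signed in from (a minimal sketch; the account name is illustrative and the columns follow the standard SigninLogs schema):

SigninLogs
| where TimeGenerated > ago(60d)
| where UserPrincipalName == "jdoe@contoso.com"
| summarize SignIns = count(), FailedAttempts = countif(ResultType != "0"), FirstSeen = min(TimeGenerated)
    by Location, IPAddress
| order by FirstSeen asc

A query like this over 60 days of lake data is a natural candidate for Async execution: it runs in the background, and the cached results can be revisited as the investigation pivots.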
3. KQL jobs: For persistent results or custom detection rules
KQL jobs are designed for persistence and reuse.
Instead of just querying data, a KQL job hydrates results from the data lake into the Analytics tier, either as a one-time job or on a schedule.
When to use KQL jobs
Use KQL jobs when:
- You need the results long-term
- Data should be available for detections or dashboards
- You want to schedule recurring queries
- Multiple teams or detections depend on the output
- You want to summarize logs from the data lake into the Analytics tier
Common use cases
- One-time: Incident investigations spanning larger datasets from the lake
- Scheduled: High-volume log summarization, anomaly detection, IoC matching, and similar use cases
- Building dashboards with Microsoft Sentinel workbooks on top of Analytics-tier data
- Produce enriched datasets for ongoing monitoring
Important considerations
- KQL jobs can run for up to one hour in the background
- Results are stored permanently (unless the custom table is deleted)
- Best when query output becomes part of your proactive hunting process
Think of KQL jobs as turning raw lake data into a reusable security asset.
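For instance, a daily summarization job over high-volume firewall logs could run a query along these lines (a minimal sketch assuming CommonSecurityLog is stored in the lake tier; columns follow the standard schema):

CommonSecurityLog
| where TimeGenerated > ago(1d)
| summarize
    EventCount = count(),
    DistinctSources = dcount(SourceIP),
    DistinctDestinations = dcount(DestinationIP)
    by DeviceVendor, DeviceProduct, DeviceAction, bin(TimeGenerated, 1h)

Scheduled daily, a job like this hydrates a compact hourly summary into a custom Analytics-tier table that workbooks and detection rules can query cheaply.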
Putting it all together: A sample investigation scenario
Let’s walk through a realistic SOC scenario to see how these query types work together.
Scenario: Suspicious IP activity over 90 days
An analyst is investigating a potentially malicious IP address reported by threat intelligence. The IP may have been active over the past 90 days.
Step 1: Start Interactive
Let’s say you decided to store SecurityEvent logs in lake-only mode.
An analyst in your team begins with an interactive KQL query on data lake to quickly check recent activity:
SecurityEvent
| where TimeGenerated > ago(24h)
| where IpAddress == "203.0.113.45"
| summarize count() by EventLevelName
With a short lookback such as the last 24 hours, this query typically runs quickly and confirms whether there is suspicious activity.
Step 2: Switch to Async for scale
To understand the full scope, the analyst expands the lookback to 90 days and joins multiple data sources for the same IoC. The query may be too slow for interactive execution.
So they run it as an Async query:
union SigninLogs, SecurityEvent, CommonSecurityLog
| where TimeGenerated > ago(90d)
// the source IP column is named differently in each table
| extend MatchedIP = coalesce(IPAddress, IpAddress, SourceIP)
| where MatchedIP == "203.0.113.45"
| summarize
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated),
    EventCount = count()
    by MatchedIP, Type
After providing a name for the query, execution begins. The query runs in the background and may take a few minutes; you can check its progress at any time in the Async queries tab.
Once the query completes successfully, the results are cached in the data lake. Over the next few hours, the analyst revisits the results multiple times while pivoting to related entities, without having to wait for query execution against cold storage.
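A typical pivot at this point, once FirstSeen and LastSeen are known, is to check which accounts the IP interacted with. A sketch of such a follow-up query (again run against the lake, with the same illustrative IP):

SigninLogs
| where TimeGenerated > ago(90d)
| where IPAddress == "203.0.113.45"
| summarize Attempts = count(), FailedAttempts = countif(ResultType != "0")
    by UserPrincipalName, AppDisplayName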
Step 3: Operationalize with a KQL Job
The investigation reveals a recurring attack pattern that leadership wants monitored continuously.
A KQL job is created to:
- Run once or on a schedule (minutes, daily, weekly, or monthly)
- Hydrate results into a custom Analytics-tier table
- Power custom detection rules and dashboards with the results in the Analytics tier (a sketch of such a job query follows)
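As a sketch, the job’s query could track daily activity from a small watchlist of IoC IPs (the IP list and column handling here are illustrative; the results land in the custom table you name when creating the job):

let SuspiciousIPs = dynamic(["203.0.113.45", "198.51.100.23"]);
union SigninLogs, CommonSecurityLog
| where TimeGenerated > ago(1d)
| extend MatchedIP = coalesce(IPAddress, SourceIP)   // column names differ per table
| where MatchedIP in (SuspiciousIPs)
| summarize HitCount = count(), FirstSeen = min(TimeGenerated), LastSeen = max(TimeGenerated)
    by MatchedIP, Type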
Now the insights move from investigation to ongoing defense. Read our previous blog posts on how to run KQL jobs on the data lake.
How to choose the right option
Here’s a simple way to decide:
- Need quick answers now? → Use Interactive queries
- Query is big, slow, or spans long timeframes? → Use Async queries
- Results must be reused, scheduled, or used in custom detection rules? → Use KQL jobs
Each option exists to reduce friction at a specific stage of analysis, from curiosity, to investigation, to operationalization.
Final thoughts
Microsoft Sentinel data lake gives security teams flexibility at scale when the right query mode is used at the right time.
- Interactive queries keep investigations fast and exploratory.
- Async queries unlock deep, long-running analysis with higher timeout limits.
- KQL jobs turn insights into durable security capabilities.
Still need to run a query on massive datasets or a longer lookback? Try the notebook capabilities on the data lake.
Once you choose the right option for your scenario, querying data in the lake becomes less about limits and more about possibilities.
Happy hunting!
Resources
Get started with Microsoft Sentinel data lake and KQL today.
Microsoft Sentinel is a cloud-native SIEM, enriched with AI and automation to provide expansive visibility across your digital environment.