Kusto: summarize consecutive segments of the same value per batch

Question

Problem statement (Azure Data Explorer / Kusto)

I have a table in Azure Data Explorer (Kusto) with the following columns:

batch_number (string)
team (string)
datetime (datetime)

Each row represents work done by a team on a batch at a given time.

The data is chronological per batch, but teams can change over time and may come back later.

Example data (ordered by batch_number, then datetime):

batch_number | datetime | team
001 | 2024-01-01 08:00:00 | A
001 | 2024-01-01 08:01:00 | A
001 | 2024-01-01 08:02:00 | A
001 | 2024-01-01 08:03:00 | B
001 | 2024-01-01 08:04:00 | B
001 | 2024-01-01 08:05:00 | A
001 | 2024-01-01 08:06:00 | A
002 | 2024-01-01 08:00:00 | A
002 | 2024-01-01 08:01:00 | A
002 | 2024-01-01 08:02:00 | B
002 | 2024-01-01 08:03:00 | C

Goal

For each batch, I want to group consecutive rows where the team stays the same and compute:

start_time = min(datetime)
end_time = max(datetime)

The result must preserve team sequences, not just distinct teams.

Expected result:

batch_number | team | start_time | end_time
001 | A | 2024-01-01 08:00:00 | 2024-01-01 08:02:00
001 | B | 2024-01-01 08:03:00 | 2024-01-01 08:04:00
001 | A | 2024-01-01 08:05:00 | 2024-01-01 08:06:00
002 | A | 2024-01-01 08:00:00 | 2024-01-01 08:01:00
002 | B | 2024-01-01 08:02:00 | 2024-01-01 08:02:00
002 | C | 2024-01-01 08:03:00 | 2024-01-01 08:03:00

Notes

Team A appears twice for batch 001 because it appears in two separate consecutive segments.
The grouping must be done per batch, not across batches.

Important constraints

A simple summarize by batch_number and team is not correct because it merges non-consecutive segments.
I can easily implement this logic in pandas using a cumulative sequence identifier, but I have not found a reliable equivalent in Kusto.

Question

What is the correct and reliable Kusto query to compute these consecutive team segments per batch?

lilleyb · Answer

Here’s a reliable and efficient Kusto query that solves the "gaps and islands" problem to group consecutive rows with the same team per batch, without merging non-consecutive segments:kustoYourTable
| order by batch_number asc, datetime asc
| serialize
| extend 
    prev_batch = prev(batch_number),
    prev_team = prev(team)
| extend 
    is_new_segment = 
        isnull(prev_team) 
        or (batch_number != prev_batch) 
        or (team != prev_team)
| extend 
    segment_id = row_cumsum(iff(is_new_segment, 1, 0))
| summarize 
    start_time = min(datetime),
    end_time = max(datetime),
    team = take_any(team)
    by batch_number, segment_id
| project batch_number, team, start_time, end_time
| order by batch_number asc, start_time ascExplanation:order by + serialize ensures the data is properly ordered and allows safe use of prev().We detect the start of a new segment whenever the batch changes or the team is different from the previous row.row_cumsum creates a unique segment ID for each consecutive group.summarize aggregates the min/max datetime per segment, keeping batch_number in the by clause to group correctly per batch.Final project and order by give the clean expected output.This produces exactly the desired result for your sample data, preserving separate segments for the same team when they are not consecutive (e.g., team A appears twice in batch 001)..............................................................................................https://jjewellery.vn/blogs/news/jjewellery-cua-hang-nhan-cuoi-uy-tin-tai-quan-1-tp-hcm

Forum Discussion

Kusto: summarize consecutive segments of the same value per batch

1 Reply