Forum Discussion

Silver Contributor

Jan 17, 2026

Solved

Excel Challenge - Pivoting poorly structured data

This is from an ExcelBI challenge. I thought it may be worth while posting my solution here as a demonstration of modern Excel methods.

Challenge

Like many of such challenges, the natural solution approach is to use BYROW but that creates the usual 'array of arrays' error.

Solution:

Gradually I am moving to a point at which I have no formulas showing in the workbook other than calls to Lambda functions. In this case, the worksheet formuloa is

= PIVOTBYCATEGORYλ(OrderTbl)

The function works row by row apportioning the amounts against the listed categories

PIVOTBYCATEGORYλ 
//  Groups and pivots table by category
= LAMBDA(table,
    LET(
        normalised, BYROWλ(table, APPORTIONλ),
    //  Identify fields from normalised table
        dimension,  TAKE(DROP(normalised,,1),,2),
        category,   TAKE(normalised,,1),
        partCost,   TAKE(normalised,,-1),
    //  Pivot by category
        return,     PIVOTBY(dimension, category, partCost, SUM,,0,,0),
        return
    )
);

The function APPORTIONλ divides the amount between categories so each record within the source data returns a number of rows

APPORTIONλ
// Splits by category and assigns costs
= LAMBDA(record,
    LET(
        category,   TOCOL(REGEXEXTRACT(INDEX(record,4),"\w+",1)),
        amount,     INDEX(record,3) / COUNTA(category),
        year,       YEAR(INDEX(record,1)),
        region,     IF(LEN(INDEX(record, 2)), INDEX(record, 2), "Unknown"),
        broadcast,  B∕CASTλ(HSTACK(region, year, amount), category),
        return,     HSTACK(category, broadcast),
        return
    )
);

/* FUNCTION NAME:  B∕CASTλ
     DESCRIPTION:  Broadcasts the terms of a vector over the shape of a second array */
B∕CASTλ = LAMBDA(vector, array, IF({1}, vector, array));

The key to making the formula work is the function BYROWλ that I wrote to generalise the inbuilt but over-restrictive BYROW function.

The PIVOTBY function returned the required crosstab from the normalised data array

BI & Data Analysis

excel

Formulas and Functions

office 365

Patrick2788

Jan 18, 2026

This won't win the award for shortest function. I didn't set out to make a generalized function, but a modular approach using existing functions in my library allowed me to pull this together quickly.

//-----------------------------------------------------------------------------------
//PivotbyCatλ
//-----------------------------------------------------------------------------------
// Author:  Patrick H
// Date:    1/18/2026
// Description:
//   Pivot by row fields (one of which is a date), categories, and values.
//   It's presumed the categories array must be split prior to aggregation so a
//   delimiter must be specified. With optional parameters for aggregation type (function)
//   and date display (date_format).
// Parameters:
//   row_detail        - 1D Array of non-dates.
//   row_dates         - 1D Array of dates.
//   categories        - 1D Array containing category information strung together with a delimiter.
//   values            - 1D Array of numbers.
//   delimiter         - Delimiter used in the categories array.
//
// Optional Parameters:
//   [date_format]     - Format code for dates array. If omitted, defaults to "yyy".
//   [function]        - ETA or Lambda function for aggregation. If omitted, defaults to SUM.
//
// Lambdas called:      Streakλ, RepeatRowsλ


PivotbyCatλ=
LAMBDA(
    row_detail,row_dates,categories,values,delimiter,[date_format],[function],

//Validate inputs
LET(
    //Dimensions
    r₁,            ROWS(row_detail),  r₂,     ROWS(row_dates),
    r₃,            ROWS(categories),  r₄,     ROWS(values),

    //Halting scenarios
    ShapeMismatch, OR(r₁ <> r₂,r₁ <> r₃,r₁ <> r₄),

    InvalidDates,  OR(AND(ISERROR(DATEVALUE(TEXT(row_dates,date_format))),
                          ISERROR(VALUE(TEXT(row_dates,date_format)))),
                      NOT(AND(ISNUMBER(row_dates)))),

    InvalidFn,     AND(NOT(ISOMITTED(function)),
                       ISERROR(BYROW({1,1;2,2},function))),

    scenarios,     VSTACK(ShapeMismatch,InvalidDates,InvalidFn),
    
    errors,        {"#DIMENSIONS!";"#DATES!";"#INVALID-FUNCTION!"},

    //Logic gate
    FILTER(errors,scenarios,

//Proceed
LET(
    //Optional parameters
    fnλ,        IF(ISOMITTED(function),SUM,function),
    date_format,IF(ISOMITTED(date_format),"yyy",date_format),

    //Local functions
    CountCatλ,  LAMBDA(row,COUNTA(TEXTSPLIT(row,delimiter))),
    Parseλ,     LAMBDA(array,instance,
                       TEXTBEFORE(TEXTAFTER(delimiter&array&delimiter,
                       delimiter,instance,,,array),delimiter,,,,categories)),

    //Categories in each row
    k,          BYROW(categories,CountCatλ),
    
    //Indices
    i,          RepeatRowsλ(SEQUENCE(r₁),k),
    j,          Streakλ(i),
    
    //Row fields
    a,          RepeatRowsλ(row_detail,k),
    detail,     IF(a= 0, "Unknown",a),
    dates,      RepeatRowsλ(TEXT(row_dates,date_format),k),
    sort_key,   IFERROR(DATEVALUE(dates),VALUE(dates)),
    row_fields, HSTACK(sort_key,detail,dates),

    //Column fields
    unpack,     RepeatRowsλ(categories,k),
    categories, Parseλ(unpack,j),

    //Value field
    values,     RepeatRowsλ(values / k,k),

    //Aggregation
    agg,        DROP(PIVOTBY(row_fields,categories,values,fnλ,,0,,0),,1),
    agg
    ))));



//-----------------------------------------------------------------------------------
//Streakλ
//-----------------------------------------------------------------------------------
// Author:  Patrick H
// Date:    12/22/25
// Description:
//   Calculate streaks of consecutive values within 1D or 2D arrays.
//
// Parameters:
//   array             - Array (1D or 2D).
//
// Optional Parameters:
//   [Horizontal?]     - For 1D arrays, calculate streaks column‑wise instead of row‑wise.
//   [Include_Symbol?] - TRUE to include the symbol with streak counts (e.g., W·1, W·2, L·1).

Streakλ=
LAMBDA(
    array,[Horizontal?],[Include_Symbol?],

// Handle scalar input
IF(AND(ROWS(array)=1,COLUMNS(array)=1),array,
LET(
    //Dimensions
    i,      ROWS(array),     j,COLUMNS(array),
    Is2D?,  AND(i>1,j>1),

    //Flatten if 2D array
    array,  IF(Is2D?,TOCOL(array),array),

    //Direction for stacking
    NoDir?, ISOMITTED(Horizontal?),
    fn,     IF(NoDir?,VSTACK,HSTACK),

    //Build true/false array for streak detection
    kept,   IF(NoDir?,DROP(array,-1),DROP(array,,-1)),
    bool,   array=fn("omit",kept),

    //Accumulate streak counts
    acc,    SCAN(0,bool,LAMBDA(a,v,IF(a=0,1,IF(v,a+1,1)))),

    //Optionally include symbol with streak counts
    result, IF(ISOMITTED(Include_Symbol?),acc,
                                          array&UNICHAR(183)&acc),
    
    //Wrap result back to 2D if original array was 2D
    deliver,IF(Is2D?,WRAPROWS(result,j),result),

    deliver
)));

//-----------------------------------------------------------------------------------
//RepeatRowsλ
//-----------------------------------------------------------------------------------
//Author:   Patrick H.
//Date:   	11/7/2025
//Version:	1.0

//Description:
//Repeat each row in the supplied array by a corresponding repeat count.
//Arrays and scalars are supported.
//-----------------------------------------------------------------------------------
//Parameters:
//array    -    array to be repeated
//repeat   -    1D array of repeat integers

RepeatRowsλ =
LAMBDA(
    array,
    repeat,

//Check inputs
IF(OR(ISTEXT(repeat),repeat<=0),#VALUE!,

LET(
    //Flatten input
    repeat,  TOCOL(repeat),

    //Dimensions and row indexing
    A↕,      ROWS(array),  R↕,ROWS(repeat),

    r,       IF(A↕<>R↕,EXPAND(repeat,A↕,,@TAKE(repeat,-1)),
                 repeat),

    i,       SEQUENCE(ROWS(r)),

    m,       MAX(r),

    idx,     LAMBDA(TOCOL(IF(SIGN(r-SEQUENCE(,m,0))=1,i,NA()),2)),

    //Unwrap idx but defer delivery until function invocation
    deliver, LAMBDA(CHOOSEROWS(array,idx())),

    deliver
))());

Pivot by row fields (one of which is a date), categories, and values.

PH - PQ_Challenge_357.xlsx45 KB

7 Replies

PeterBartholomew1
Silver Contributor
Jan 21, 2026
Some amazing solutions!
All of which serve to demonstrate that Excel has moved lightyears from where is used to be.
How long before such solutions are overtaken by AI, I have no idea. Mind you, that may be where I delete 365 from my computer and find other interests. You never know though. It is strange to think that mathematical notation was developed to overcome to ambiguity of natural language and here we are hell-bent on reversing the process.

Patrick2788

Silver Contributor

Jan 18, 2026

//-----------------------------------------------------------------------------------
//PivotbyCatλ
//-----------------------------------------------------------------------------------
// Author:  Patrick H
// Date:    1/18/2026
// Description:
//   Pivot by row fields (one of which is a date), categories, and values.
//   It's presumed the categories array must be split prior to aggregation so a
//   delimiter must be specified. With optional parameters for aggregation type (function)
//   and date display (date_format).
// Parameters:
//   row_detail        - 1D Array of non-dates.
//   row_dates         - 1D Array of dates.
//   categories        - 1D Array containing category information strung together with a delimiter.
//   values            - 1D Array of numbers.
//   delimiter         - Delimiter used in the categories array.
//
// Optional Parameters:
//   [date_format]     - Format code for dates array. If omitted, defaults to "yyy".
//   [function]        - ETA or Lambda function for aggregation. If omitted, defaults to SUM.
//
// Lambdas called:      Streakλ, RepeatRowsλ


PivotbyCatλ=
LAMBDA(
    row_detail,row_dates,categories,values,delimiter,[date_format],[function],

//Validate inputs
LET(
    //Dimensions
    r₁,            ROWS(row_detail),  r₂,     ROWS(row_dates),
    r₃,            ROWS(categories),  r₄,     ROWS(values),

    //Halting scenarios
    ShapeMismatch, OR(r₁ <> r₂,r₁ <> r₃,r₁ <> r₄),

    InvalidDates,  OR(AND(ISERROR(DATEVALUE(TEXT(row_dates,date_format))),
                          ISERROR(VALUE(TEXT(row_dates,date_format)))),
                      NOT(AND(ISNUMBER(row_dates)))),

    InvalidFn,     AND(NOT(ISOMITTED(function)),
                       ISERROR(BYROW({1,1;2,2},function))),

    scenarios,     VSTACK(ShapeMismatch,InvalidDates,InvalidFn),
    
    errors,        {"#DIMENSIONS!";"#DATES!";"#INVALID-FUNCTION!"},

    //Logic gate
    FILTER(errors,scenarios,

//Proceed
LET(
    //Optional parameters
    fnλ,        IF(ISOMITTED(function),SUM,function),
    date_format,IF(ISOMITTED(date_format),"yyy",date_format),

    //Local functions
    CountCatλ,  LAMBDA(row,COUNTA(TEXTSPLIT(row,delimiter))),
    Parseλ,     LAMBDA(array,instance,
                       TEXTBEFORE(TEXTAFTER(delimiter&array&delimiter,
                       delimiter,instance,,,array),delimiter,,,,categories)),

    //Categories in each row
    k,          BYROW(categories,CountCatλ),
    
    //Indices
    i,          RepeatRowsλ(SEQUENCE(r₁),k),
    j,          Streakλ(i),
    
    //Row fields
    a,          RepeatRowsλ(row_detail,k),
    detail,     IF(a= 0, "Unknown",a),
    dates,      RepeatRowsλ(TEXT(row_dates,date_format),k),
    sort_key,   IFERROR(DATEVALUE(dates),VALUE(dates)),
    row_fields, HSTACK(sort_key,detail,dates),

    //Column fields
    unpack,     RepeatRowsλ(categories,k),
    categories, Parseλ(unpack,j),

    //Value field
    values,     RepeatRowsλ(values / k,k),

    //Aggregation
    agg,        DROP(PIVOTBY(row_fields,categories,values,fnλ,,0,,0),,1),
    agg
    ))));



//-----------------------------------------------------------------------------------
//Streakλ
//-----------------------------------------------------------------------------------
// Author:  Patrick H
// Date:    12/22/25
// Description:
//   Calculate streaks of consecutive values within 1D or 2D arrays.
//
// Parameters:
//   array             - Array (1D or 2D).
//
// Optional Parameters:
//   [Horizontal?]     - For 1D arrays, calculate streaks column‑wise instead of row‑wise.
//   [Include_Symbol?] - TRUE to include the symbol with streak counts (e.g., W·1, W·2, L·1).

Streakλ=
LAMBDA(
    array,[Horizontal?],[Include_Symbol?],

// Handle scalar input
IF(AND(ROWS(array)=1,COLUMNS(array)=1),array,
LET(
    //Dimensions
    i,      ROWS(array),     j,COLUMNS(array),
    Is2D?,  AND(i>1,j>1),

    //Flatten if 2D array
    array,  IF(Is2D?,TOCOL(array),array),

    //Direction for stacking
    NoDir?, ISOMITTED(Horizontal?),
    fn,     IF(NoDir?,VSTACK,HSTACK),

    //Build true/false array for streak detection
    kept,   IF(NoDir?,DROP(array,-1),DROP(array,,-1)),
    bool,   array=fn("omit",kept),

    //Accumulate streak counts
    acc,    SCAN(0,bool,LAMBDA(a,v,IF(a=0,1,IF(v,a+1,1)))),

    //Optionally include symbol with streak counts
    result, IF(ISOMITTED(Include_Symbol?),acc,
                                          array&UNICHAR(183)&acc),
    
    //Wrap result back to 2D if original array was 2D
    deliver,IF(Is2D?,WRAPROWS(result,j),result),

    deliver
)));

//-----------------------------------------------------------------------------------
//RepeatRowsλ
//-----------------------------------------------------------------------------------
//Author:   Patrick H.
//Date:   	11/7/2025
//Version:	1.0

//Description:
//Repeat each row in the supplied array by a corresponding repeat count.
//Arrays and scalars are supported.
//-----------------------------------------------------------------------------------
//Parameters:
//array    -    array to be repeated
//repeat   -    1D array of repeat integers

RepeatRowsλ =
LAMBDA(
    array,
    repeat,

//Check inputs
IF(OR(ISTEXT(repeat),repeat<=0),#VALUE!,

LET(
    //Flatten input
    repeat,  TOCOL(repeat),

    //Dimensions and row indexing
    A↕,      ROWS(array),  R↕,ROWS(repeat),

    r,       IF(A↕<>R↕,EXPAND(repeat,A↕,,@TAKE(repeat,-1)),
                 repeat),

    i,       SEQUENCE(ROWS(r)),

    m,       MAX(r),

    idx,     LAMBDA(TOCOL(IF(SIGN(r-SEQUENCE(,m,0))=1,i,NA()),2)),

    //Unwrap idx but defer delivery until function invocation
    deliver, LAMBDA(CHOOSEROWS(array,idx())),

    deliver
))());

Pivot by row fields (one of which is a date), categories, and values.

PH - PQ_Challenge_357.xlsx45 KB

Lorenzo

Silver Contributor

Jan 19, 2026

PQ option

Abuses of Table.ReplaceValue (can cause perf. issue on large table)
Doesn't use Table.Pivot

// TableSource
let
    Source = Excel.CurrentWorkbook(){[Name="TableSource"]}[Content],
    ChangedType = Table.TransformColumnTypes( Source,
        { {"Amount", type number} }
    )
in
    ChangedType


// TablePivoted
let
    Source = TableSource,
    ReplacedOrderDateAndRegion = Table.ReplaceValue( Source, each [OrderDate], "Unknown",
        // Works with the given source table. Otherwise would also replace null with "Unknown" in [OrderDate]
        (x,y,z) => if x is null then z else Date.Year( DateTime.Date(x) ),
        {"OrderDate", "Region"}
    ),
    FlaggedMultiCategories = Table.AddColumn( ReplacedOrderDateAndRegion, "Flag",
        each Text.Contains( [Categories], ";" ), type logical
    ),
    SplittedCategoriesAndAmount = Table.Combine(
        {
            fxFilterTable( FlaggedMultiCategories, "Flag", false ),

            Table.ExpandListColumn(
                Table.ReplaceValue(
                    fxFilterTable( FlaggedMultiCategories, "Flag", true ),
                    each [Amount], each Text.Split( [Categories], ";" ),
                    (x,y,z) => if Value.Type(x) = type number then x / List.Count(z) else z,
                    {"Amount", "Categories"}
                ),
                "Categories"
            )
        }
    ),
    TotalAmounts = Table.Group( SplittedCategoriesAndAmount,
        {"OrderDate", "Region", "Categories"},
        {"Amount", each List.Sum( [Amount] ), type number}
    ),
    DistinctCategories = List.Buffer(
        List.Sort( List.Distinct( Table.Column( TotalAmounts, "Categories") ) )
    ),
    PivotedCategories = Table.Group( TotalAmounts,
        {"Region", "OrderDate"},
        { "Pivoted",
            (nested_table) => fxPivotTable( nested_table, "Categories", "Amount"),
            fxTypeForTable( DistinctCategories, type nullable number)
        }
    ),
    SortedRegionAndOrderDate = Table.Sort( PivotedCategories,
        { {"Region", Order.Ascending}, {"OrderDate", Order.Ascending} }
    ),
    ExpandedPivotedTables = Table.ExpandTableColumn( SortedRegionAndOrderDate, "Pivoted",
        DistinctCategories
    ),
    ReplacedNullAmounts = Table.ReplaceValue( ExpandedPivotedTables, null, 0,
        Replacer.ReplaceValue, DistinctCategories
    ),
    TypedOtherColumns = Table.TransformColumnTypes( ReplacedNullAmounts,
        { {"Region", type text}, {"OrderDate", Int64.Type} }
    )
in
    TypedOtherColumns


// fxFilterTable
(table as table, filter_field as text, filter_value as any) as table =>
Table.RemoveColumns(
    Table.SelectRows( table, each Record.Field( _, filter_field ) = filter_value ),
    filter_field
)


// fxPivotTable
(
    table as table,
    optional attributeColumn as nullable text,
    optional valueColumn as nullable text
) as table =>
    Table.Transpose(
        Table.SelectColumns( table, if valueColumn is null then "Value" else valueColumn ),
        Table.Column( table, if attributeColumn is null then "Attribute" else attributeColumn )
    )


// fxTypeForTable
(List as list, Type as type) as type =>
    type table Type.ForRecord(
        List.Accumulate( List, [], 
            (state,current) =>
                Record.AddField( state, current, [Type = Type, Optional=false] )
        ),
        false
    )

PQ_Challenge_Pivoting_poorly_structured_data.xlsx41 KB

djclements
Silver Contributor
Jan 17, 2026
Good stuff, Peter! Nice demonstration of the power of your custom function collection for evaluating nested arrays.
I see the majority of the other dynamic array solutions offered up in the comments section of that LinkedIn post use some variation of DROP-REDUCE-VSTACK to perform each step on a row-by-row basis. In this particular situation, however, it is possible to conduct a TOCOL-broadcasting method (unpivot) over the entire table and pass the normalized fields directly to PIVOTBY:
=LET( num, LEN(Categories)-LEN(SUBSTITUTE(Categories,";",))+1, arr, TEXTBEFORE(TEXTAFTER(";"&Categories&";",";",SEQUENCE(,MAX(num))),";"), fnλ, LAMBDA(err,LAMBDA(vec,TOCOL(IF(err,arr,vec),2)))(ISERROR(arr)), PIVOTBY(HSTACK(fnλ(IF(ISBLANK(Region),"Unknown",Region)),fnλ(YEAR(OrderDate))),TOCOL(arr,2),fnλ(Amount/num),SUM,0,0,,0) )
Or with CHOOSEROWS and a single broadcast of the table row IDs:
=LET( num, LEN(Categories)-LEN(SUBSTITUTE(Categories,";",))+1, arr, TEXTBEFORE(TEXTAFTER(";"&Categories&";",";",SEQUENCE(,MAX(num))),";"), fnλ, LAMBDA(rId,LAMBDA(vec,CHOOSEROWS(vec,rId)))(TOCOL(IF(ISERROR(arr),arr,SEQUENCE(ROWS(arr))),2)), PIVOTBY(HSTACK(fnλ(IF(ISBLANK(Region),"Unknown",Region)),fnλ(YEAR(OrderDate))),TOCOL(arr,2),fnλ(Amount/num),SUM,0,0,,0) )
Kind regards.
- PeterBartholomew1
  Silver Contributor
  Jan 18, 2026
  David, Interesting solutions! I think that, where possible, obtaining 2D arrays by broadcasting vectors is more efficient than resorting to arrays of functions. Against that, I have been trying to avoid idiosyncratic solutions that are specific to a problem and also to increase my use of arrays of functions (so called 1st class citizens of the new environment). The trouble is that I am so focussed on arrays of functions as the base element of all calculation that I sometimes overlook broadcasting where it is applicable.
NikolinoDE
Platinum Contributor
Jan 17, 2026
This is a very solid—and very modern—Excel solution. You’ve essentially built a small, well-thought-out transformation pipeline using LAMBDA functions.
The way you normalize the data first and only then pivot is especially elegant, and your custom BYROW approach neatly sidesteps one of Excel’s more frustrating limitations.
A great example of how far “formula-only” Excel has come, and a very instructive solution to the challenge.
Thank you for sharing it with us.
PeterBartholomew1
Silver Contributor
Jan 17, 2026
The interface will not let me back to correct typos and has stripped out the workbook. Sorry about that!
PQ_Challenge_357.xlsx35 KB