Forum Discussion

PeterBartholomew1's avatar
PeterBartholomew1
Silver Contributor
Jan 17, 2026
Solved

Excel Challenge - Pivoting poorly structured data

This is from an ExcelBI challenge.  I thought it may be worth while posting my solution here as a demonstration of modern Excel methods.

Challenge

Like many of such challenges, the natural solution approach is to use BYROW but that creates the usual 'array of arrays' error.

Solution:

Gradually I am moving to a point at which I have no formulas showing in the workbook other than calls to Lambda functions.  In this case, the worksheet formuloa is

= PIVOTBYCATEGORYλ(OrderTbl)

The function works row by row apportioning the amounts against the listed categories 

PIVOTBYCATEGORYλ 
//  Groups and pivots table by category
= LAMBDA(table,
    LET(
        normalised, BYROWλ(table, APPORTIONλ),
    //  Identify fields from normalised table
        dimension,  TAKE(DROP(normalised,,1),,2),
        category,   TAKE(normalised,,1),
        partCost,   TAKE(normalised,,-1),
    //  Pivot by category
        return,     PIVOTBY(dimension, category, partCost, SUM,,0,,0),
        return
    )
);

The function APPORTIONλ divides the amount between categories so each record within the source data returns a number of rows

APPORTIONλ
// Splits by category and assigns costs
= LAMBDA(record,
    LET(
        category,   TOCOL(REGEXEXTRACT(INDEX(record,4),"\w+",1)),
        amount,     INDEX(record,3) / COUNTA(category),
        year,       YEAR(INDEX(record,1)),
        region,     IF(LEN(INDEX(record, 2)), INDEX(record, 2), "Unknown"),
        broadcast,  B∕CASTλ(HSTACK(region, year, amount), category),
        return,     HSTACK(category, broadcast),
        return
    )
);
/* FUNCTION NAME:  B∕CASTλ
     DESCRIPTION:  Broadcasts the terms of a vector over the shape of a second array */
B∕CASTλ = LAMBDA(vector, array, IF({1}, vector, array));

The key to making the formula work is the function BYROWλ that I wrote to generalise the inbuilt but over-restrictive BYROW function.

The PIVOTBY function returned the required crosstab from the  normalised data array

 

  • This won't win the award for shortest function.  I didn't set out to make a generalized function, but a modular approach using existing functions in my library allowed me to pull this together quickly.

    //-----------------------------------------------------------------------------------
    //PivotbyCatλ
    //-----------------------------------------------------------------------------------
    // Author:  Patrick H
    // Date:    1/18/2026
    // Description:
    //   Pivot by row fields (one of which is a date), categories, and values.
    //   It's presumed the categories array must be split prior to aggregation so a
    //   delimiter must be specified. With optional parameters for aggregation type (function)
    //   and date display (date_format).
    // Parameters:
    //   row_detail        - 1D Array of non-dates.
    //   row_dates         - 1D Array of dates.
    //   categories        - 1D Array containing category information strung together with a delimiter.
    //   values            - 1D Array of numbers.
    //   delimiter         - Delimiter used in the categories array.
    //
    // Optional Parameters:
    //   [date_format]     - Format code for dates array. If omitted, defaults to "yyy".
    //   [function]        - ETA or Lambda function for aggregation. If omitted, defaults to SUM.
    //
    // Lambdas called:      Streakλ, RepeatRowsλ
    
    
    PivotbyCatλ=
    LAMBDA(
        row_detail,row_dates,categories,values,delimiter,[date_format],[function],
    
    //Validate inputs
    LET(
        //Dimensions
        r₁,            ROWS(row_detail),  r₂,     ROWS(row_dates),
        r₃,            ROWS(categories),  r₄,     ROWS(values),
    
        //Halting scenarios
        ShapeMismatch, OR(r₁ <> r₂,r₁ <> r₃,r₁ <> r₄),
    
        InvalidDates,  OR(AND(ISERROR(DATEVALUE(TEXT(row_dates,date_format))),
                              ISERROR(VALUE(TEXT(row_dates,date_format)))),
                          NOT(AND(ISNUMBER(row_dates)))),
    
        InvalidFn,     AND(NOT(ISOMITTED(function)),
                           ISERROR(BYROW({1,1;2,2},function))),
    
        scenarios,     VSTACK(ShapeMismatch,InvalidDates,InvalidFn),
        
        errors,        {"#DIMENSIONS!";"#DATES!";"#INVALID-FUNCTION!"},
    
        //Logic gate
        FILTER(errors,scenarios,
    
    //Proceed
    LET(
        //Optional parameters
        fnλ,        IF(ISOMITTED(function),SUM,function),
        date_format,IF(ISOMITTED(date_format),"yyy",date_format),
    
        //Local functions
        CountCatλ,  LAMBDA(row,COUNTA(TEXTSPLIT(row,delimiter))),
        Parseλ,     LAMBDA(array,instance,
                           TEXTBEFORE(TEXTAFTER(delimiter&array&delimiter,
                           delimiter,instance,,,array),delimiter,,,,categories)),
    
        //Categories in each row
        k,          BYROW(categories,CountCatλ),
        
        //Indices
        i,          RepeatRowsλ(SEQUENCE(r₁),k),
        j,          Streakλ(i),
        
        //Row fields
        a,          RepeatRowsλ(row_detail,k),
        detail,     IF(a= 0, "Unknown",a),
        dates,      RepeatRowsλ(TEXT(row_dates,date_format),k),
        sort_key,   IFERROR(DATEVALUE(dates),VALUE(dates)),
        row_fields, HSTACK(sort_key,detail,dates),
    
        //Column fields
        unpack,     RepeatRowsλ(categories,k),
        categories, Parseλ(unpack,j),
    
        //Value field
        values,     RepeatRowsλ(values / k,k),
    
        //Aggregation
        agg,        DROP(PIVOTBY(row_fields,categories,values,fnλ,,0,,0),,1),
        agg
        ))));
    
    
    
    //-----------------------------------------------------------------------------------
    //Streakλ
    //-----------------------------------------------------------------------------------
    // Author:  Patrick H
    // Date:    12/22/25
    // Description:
    //   Calculate streaks of consecutive values within 1D or 2D arrays.
    //
    // Parameters:
    //   array             - Array (1D or 2D).
    //
    // Optional Parameters:
    //   [Horizontal?]     - For 1D arrays, calculate streaks column‑wise instead of row‑wise.
    //   [Include_Symbol?] - TRUE to include the symbol with streak counts (e.g., W·1, W·2, L·1).
    
    Streakλ=
    LAMBDA(
        array,[Horizontal?],[Include_Symbol?],
    
    // Handle scalar input
    IF(AND(ROWS(array)=1,COLUMNS(array)=1),array,
    LET(
        //Dimensions
        i,      ROWS(array),     j,COLUMNS(array),
        Is2D?,  AND(i>1,j>1),
    
        //Flatten if 2D array
        array,  IF(Is2D?,TOCOL(array),array),
    
        //Direction for stacking
        NoDir?, ISOMITTED(Horizontal?),
        fn,     IF(NoDir?,VSTACK,HSTACK),
    
        //Build true/false array for streak detection
        kept,   IF(NoDir?,DROP(array,-1),DROP(array,,-1)),
        bool,   array=fn("omit",kept),
    
        //Accumulate streak counts
        acc,    SCAN(0,bool,LAMBDA(a,v,IF(a=0,1,IF(v,a+1,1)))),
    
        //Optionally include symbol with streak counts
        result, IF(ISOMITTED(Include_Symbol?),acc,
                                              array&UNICHAR(183)&acc),
        
        //Wrap result back to 2D if original array was 2D
        deliver,IF(Is2D?,WRAPROWS(result,j),result),
    
        deliver
    )));
    
    //-----------------------------------------------------------------------------------
    //RepeatRowsλ
    //-----------------------------------------------------------------------------------
    //Author:   Patrick H.
    //Date:   	11/7/2025
    //Version:	1.0
    
    //Description:
    //Repeat each row in the supplied array by a corresponding repeat count.
    //Arrays and scalars are supported.
    //-----------------------------------------------------------------------------------
    //Parameters:
    //array    -    array to be repeated
    //repeat   -    1D array of repeat integers
    
    RepeatRowsλ =
    LAMBDA(
        array,
        repeat,
    
    //Check inputs
    IF(OR(ISTEXT(repeat),repeat<=0),#VALUE!,
    
    LET(
        //Flatten input
        repeat,  TOCOL(repeat),
    
        //Dimensions and row indexing
        A↕,      ROWS(array),  R↕,ROWS(repeat),
    
        r,       IF(A↕<>R↕,EXPAND(repeat,A↕,,@TAKE(repeat,-1)),
                     repeat),
    
        i,       SEQUENCE(ROWS(r)),
    
        m,       MAX(r),
    
        idx,     LAMBDA(TOCOL(IF(SIGN(r-SEQUENCE(,m,0))=1,i,NA()),2)),
    
        //Unwrap idx but defer delivery until function invocation
        deliver, LAMBDA(CHOOSEROWS(array,idx())),
    
        deliver
    ))());

    Pivot by row fields (one of which is a date), categories, and values.

     

7 Replies

  • Some amazing solutions!

    All of which serve to demonstrate that Excel has moved lightyears from where is used to be.

    How long before such solutions are overtaken by AI, I have no idea.  Mind you, that may be where I delete 365 from my computer and find other interests.  You never know though.  It is strange to think that mathematical notation was developed to overcome to ambiguity of natural language and here we are hell-bent on reversing the process.

  • Patrick2788's avatar
    Patrick2788
    Silver Contributor

    This won't win the award for shortest function.  I didn't set out to make a generalized function, but a modular approach using existing functions in my library allowed me to pull this together quickly.

    //-----------------------------------------------------------------------------------
    //PivotbyCatλ
    //-----------------------------------------------------------------------------------
    // Author:  Patrick H
    // Date:    1/18/2026
    // Description:
    //   Pivot by row fields (one of which is a date), categories, and values.
    //   It's presumed the categories array must be split prior to aggregation so a
    //   delimiter must be specified. With optional parameters for aggregation type (function)
    //   and date display (date_format).
    // Parameters:
    //   row_detail        - 1D Array of non-dates.
    //   row_dates         - 1D Array of dates.
    //   categories        - 1D Array containing category information strung together with a delimiter.
    //   values            - 1D Array of numbers.
    //   delimiter         - Delimiter used in the categories array.
    //
    // Optional Parameters:
    //   [date_format]     - Format code for dates array. If omitted, defaults to "yyy".
    //   [function]        - ETA or Lambda function for aggregation. If omitted, defaults to SUM.
    //
    // Lambdas called:      Streakλ, RepeatRowsλ
    
    
    PivotbyCatλ=
    LAMBDA(
        row_detail,row_dates,categories,values,delimiter,[date_format],[function],
    
    //Validate inputs
    LET(
        //Dimensions
        r₁,            ROWS(row_detail),  r₂,     ROWS(row_dates),
        r₃,            ROWS(categories),  r₄,     ROWS(values),
    
        //Halting scenarios
        ShapeMismatch, OR(r₁ <> r₂,r₁ <> r₃,r₁ <> r₄),
    
        InvalidDates,  OR(AND(ISERROR(DATEVALUE(TEXT(row_dates,date_format))),
                              ISERROR(VALUE(TEXT(row_dates,date_format)))),
                          NOT(AND(ISNUMBER(row_dates)))),
    
        InvalidFn,     AND(NOT(ISOMITTED(function)),
                           ISERROR(BYROW({1,1;2,2},function))),
    
        scenarios,     VSTACK(ShapeMismatch,InvalidDates,InvalidFn),
        
        errors,        {"#DIMENSIONS!";"#DATES!";"#INVALID-FUNCTION!"},
    
        //Logic gate
        FILTER(errors,scenarios,
    
    //Proceed
    LET(
        //Optional parameters
        fnλ,        IF(ISOMITTED(function),SUM,function),
        date_format,IF(ISOMITTED(date_format),"yyy",date_format),
    
        //Local functions
        CountCatλ,  LAMBDA(row,COUNTA(TEXTSPLIT(row,delimiter))),
        Parseλ,     LAMBDA(array,instance,
                           TEXTBEFORE(TEXTAFTER(delimiter&array&delimiter,
                           delimiter,instance,,,array),delimiter,,,,categories)),
    
        //Categories in each row
        k,          BYROW(categories,CountCatλ),
        
        //Indices
        i,          RepeatRowsλ(SEQUENCE(r₁),k),
        j,          Streakλ(i),
        
        //Row fields
        a,          RepeatRowsλ(row_detail,k),
        detail,     IF(a= 0, "Unknown",a),
        dates,      RepeatRowsλ(TEXT(row_dates,date_format),k),
        sort_key,   IFERROR(DATEVALUE(dates),VALUE(dates)),
        row_fields, HSTACK(sort_key,detail,dates),
    
        //Column fields
        unpack,     RepeatRowsλ(categories,k),
        categories, Parseλ(unpack,j),
    
        //Value field
        values,     RepeatRowsλ(values / k,k),
    
        //Aggregation
        agg,        DROP(PIVOTBY(row_fields,categories,values,fnλ,,0,,0),,1),
        agg
        ))));
    
    
    
    //-----------------------------------------------------------------------------------
    //Streakλ
    //-----------------------------------------------------------------------------------
    // Author:  Patrick H
    // Date:    12/22/25
    // Description:
    //   Calculate streaks of consecutive values within 1D or 2D arrays.
    //
    // Parameters:
    //   array             - Array (1D or 2D).
    //
    // Optional Parameters:
    //   [Horizontal?]     - For 1D arrays, calculate streaks column‑wise instead of row‑wise.
    //   [Include_Symbol?] - TRUE to include the symbol with streak counts (e.g., W·1, W·2, L·1).
    
    Streakλ=
    LAMBDA(
        array,[Horizontal?],[Include_Symbol?],
    
    // Handle scalar input
    IF(AND(ROWS(array)=1,COLUMNS(array)=1),array,
    LET(
        //Dimensions
        i,      ROWS(array),     j,COLUMNS(array),
        Is2D?,  AND(i>1,j>1),
    
        //Flatten if 2D array
        array,  IF(Is2D?,TOCOL(array),array),
    
        //Direction for stacking
        NoDir?, ISOMITTED(Horizontal?),
        fn,     IF(NoDir?,VSTACK,HSTACK),
    
        //Build true/false array for streak detection
        kept,   IF(NoDir?,DROP(array,-1),DROP(array,,-1)),
        bool,   array=fn("omit",kept),
    
        //Accumulate streak counts
        acc,    SCAN(0,bool,LAMBDA(a,v,IF(a=0,1,IF(v,a+1,1)))),
    
        //Optionally include symbol with streak counts
        result, IF(ISOMITTED(Include_Symbol?),acc,
                                              array&UNICHAR(183)&acc),
        
        //Wrap result back to 2D if original array was 2D
        deliver,IF(Is2D?,WRAPROWS(result,j),result),
    
        deliver
    )));
    
    //-----------------------------------------------------------------------------------
    //RepeatRowsλ
    //-----------------------------------------------------------------------------------
    //Author:   Patrick H.
    //Date:   	11/7/2025
    //Version:	1.0
    
    //Description:
    //Repeat each row in the supplied array by a corresponding repeat count.
    //Arrays and scalars are supported.
    //-----------------------------------------------------------------------------------
    //Parameters:
    //array    -    array to be repeated
    //repeat   -    1D array of repeat integers
    
    RepeatRowsλ =
    LAMBDA(
        array,
        repeat,
    
    //Check inputs
    IF(OR(ISTEXT(repeat),repeat<=0),#VALUE!,
    
    LET(
        //Flatten input
        repeat,  TOCOL(repeat),
    
        //Dimensions and row indexing
        A↕,      ROWS(array),  R↕,ROWS(repeat),
    
        r,       IF(A↕<>R↕,EXPAND(repeat,A↕,,@TAKE(repeat,-1)),
                     repeat),
    
        i,       SEQUENCE(ROWS(r)),
    
        m,       MAX(r),
    
        idx,     LAMBDA(TOCOL(IF(SIGN(r-SEQUENCE(,m,0))=1,i,NA()),2)),
    
        //Unwrap idx but defer delivery until function invocation
        deliver, LAMBDA(CHOOSEROWS(array,idx())),
    
        deliver
    ))());

    Pivot by row fields (one of which is a date), categories, and values.

     

    • Lorenzo's avatar
      Lorenzo
      Silver Contributor

      PQ option

      // TableSource
      let
          Source = Excel.CurrentWorkbook(){[Name="TableSource"]}[Content],
          ChangedType = Table.TransformColumnTypes( Source,
              { {"Amount", type number} }
          )
      in
          ChangedType
      
      
      // TablePivoted
      let
          Source = TableSource,
          ReplacedOrderDateAndRegion = Table.ReplaceValue( Source, each [OrderDate], "Unknown",
              // Works with the given source table. Otherwise would also replace null with "Unknown" in [OrderDate]
              (x,y,z) => if x is null then z else Date.Year( DateTime.Date(x) ),
              {"OrderDate", "Region"}
          ),
          FlaggedMultiCategories = Table.AddColumn( ReplacedOrderDateAndRegion, "Flag",
              each Text.Contains( [Categories], ";" ), type logical
          ),
          SplittedCategoriesAndAmount = Table.Combine(
              {
                  fxFilterTable( FlaggedMultiCategories, "Flag", false ),
      
                  Table.ExpandListColumn(
                      Table.ReplaceValue(
                          fxFilterTable( FlaggedMultiCategories, "Flag", true ),
                          each [Amount], each Text.Split( [Categories], ";" ),
                          (x,y,z) => if Value.Type(x) = type number then x / List.Count(z) else z,
                          {"Amount", "Categories"}
                      ),
                      "Categories"
                  )
              }
          ),
          TotalAmounts = Table.Group( SplittedCategoriesAndAmount,
              {"OrderDate", "Region", "Categories"},
              {"Amount", each List.Sum( [Amount] ), type number}
          ),
          DistinctCategories = List.Buffer(
              List.Sort( List.Distinct( Table.Column( TotalAmounts, "Categories") ) )
          ),
          PivotedCategories = Table.Group( TotalAmounts,
              {"Region", "OrderDate"},
              { "Pivoted",
                  (nested_table) => fxPivotTable( nested_table, "Categories", "Amount"),
                  fxTypeForTable( DistinctCategories, type nullable number)
              }
          ),
          SortedRegionAndOrderDate = Table.Sort( PivotedCategories,
              { {"Region", Order.Ascending}, {"OrderDate", Order.Ascending} }
          ),
          ExpandedPivotedTables = Table.ExpandTableColumn( SortedRegionAndOrderDate, "Pivoted",
              DistinctCategories
          ),
          ReplacedNullAmounts = Table.ReplaceValue( ExpandedPivotedTables, null, 0,
              Replacer.ReplaceValue, DistinctCategories
          ),
          TypedOtherColumns = Table.TransformColumnTypes( ReplacedNullAmounts,
              { {"Region", type text}, {"OrderDate", Int64.Type} }
          )
      in
          TypedOtherColumns
      
      
      // fxFilterTable
      (table as table, filter_field as text, filter_value as any) as table =>
      Table.RemoveColumns(
          Table.SelectRows( table, each Record.Field( _, filter_field ) = filter_value ),
          filter_field
      )
      
      
      // fxPivotTable
      (
          table as table,
          optional attributeColumn as nullable text,
          optional valueColumn as nullable text
      ) as table =>
          Table.Transpose(
              Table.SelectColumns( table, if valueColumn is null then "Value" else valueColumn ),
              Table.Column( table, if attributeColumn is null then "Attribute" else attributeColumn )
          )
      
      
      // fxTypeForTable
      (List as list, Type as type) as type =>
          type table Type.ForRecord(
              List.Accumulate( List, [], 
                  (state,current) =>
                      Record.AddField( state, current, [Type = Type, Optional=false] )
              ),
              false
          )

       

  • djclements's avatar
    djclements
    Silver Contributor

    Good stuff, Peter! Nice demonstration of the power of your custom function collection for evaluating nested arrays.

    I see the majority of the other dynamic array solutions offered up in the comments section of that LinkedIn post use some variation of DROP-REDUCE-VSTACK to perform each step on a row-by-row basis. In this particular situation, however, it is possible to conduct a TOCOL-broadcasting method (unpivot) over the entire table and pass the normalized fields directly to PIVOTBY:

    =LET(
        num, LEN(Categories)-LEN(SUBSTITUTE(Categories,";",))+1,
        arr, TEXTBEFORE(TEXTAFTER(";"&Categories&";",";",SEQUENCE(,MAX(num))),";"),
        fnλ, LAMBDA(err,LAMBDA(vec,TOCOL(IF(err,arr,vec),2)))(ISERROR(arr)),
        PIVOTBY(HSTACK(fnλ(IF(ISBLANK(Region),"Unknown",Region)),fnλ(YEAR(OrderDate))),TOCOL(arr,2),fnλ(Amount/num),SUM,0,0,,0)
    )

    Or with CHOOSEROWS and a single broadcast of the table row IDs:

    =LET(
        num, LEN(Categories)-LEN(SUBSTITUTE(Categories,";",))+1,
        arr, TEXTBEFORE(TEXTAFTER(";"&Categories&";",";",SEQUENCE(,MAX(num))),";"),
        fnλ, LAMBDA(rId,LAMBDA(vec,CHOOSEROWS(vec,rId)))(TOCOL(IF(ISERROR(arr),arr,SEQUENCE(ROWS(arr))),2)),
        PIVOTBY(HSTACK(fnλ(IF(ISBLANK(Region),"Unknown",Region)),fnλ(YEAR(OrderDate))),TOCOL(arr,2),fnλ(Amount/num),SUM,0,0,,0)
    )

    Kind regards.

    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      David, Interesting solutions!  I think that, where possible, obtaining 2D arrays by broadcasting vectors is more efficient than resorting to arrays of functions.  Against that, I have been trying to avoid idiosyncratic solutions that are specific to a problem and also to increase my use of arrays of functions (so called 1st class citizens of the new environment).  The trouble is that I am so focussed on arrays of functions as the base element of all calculation that I sometimes overlook broadcasting where it is applicable.

  • NikolinoDE's avatar
    NikolinoDE
    Platinum Contributor

    This is a very solid—and very modern—Excel solution. You’ve essentially built a small, well-thought-out transformation pipeline using LAMBDA functions.

    The way you normalize the data first and only then pivot is especially elegant, and your custom BYROW approach neatly sidesteps one of Excel’s more frustrating limitations.

    A great example of how far “formula-only” Excel has come, and a very instructive solution to the challenge.

    Thank you for sharing it with us.

Resources