Forum Discussion

PeterBartholomew1's avatar
PeterBartholomew1
Silver Contributor
Sep 02, 2024
Solved

A generalisation of the MAP lambda helper function

Discussion topic.  Your thoughts are welcome.

 

On Saturday I finally bit the bullet and completed a MAPλ Lambda function that generalises the in-built MAP Lambda helper function.  As examples, I tried problems of generating the Kronecker product of two matrices and then one of generating variants of an amortisation table.

 

The original amortisation schedule uses SCAN to calculate closing balances step by step from opening balances.  Having returned the closing balances as an array, the principal is inserted at the first element to give opening balances.  An array calculation based on the same code is used to return other values of interest using HSTACK.

Following that, I created the array of loan terms {10, 15, 20} (yrs) and used the formula

 

 

= MAPλ(variousTerms, AmortisationTableλ(principal, rate, startYear))

 

 

to generate

as a single spilt range.

 

I have posted a copy of MAPλ on GitHub 

A version of Excel MAP helper function that will return an array of arrays (github.com)

The intention is that the function can be used without knowing how it works but you are, of course, welcome to try to pick through it.

  • This is a demonstration of the proposed MAP function used to solve a challenge provided by Oz_Sunray and Wyn Hopkins on data cleansing and vote counting.  The key element of my solution is to build a table of validated votes for each voter ID and each candidate (i.e. an array of arrays - much the same as any other problem of interest).  The formula is modularised using Lambda functions.  They are not that pretty, but neither are they impossible to follow.

    "Worksheet formula"
    = AggregateVotesλ(
          tblVOTES[VoterID],
          tblVOTES[[Vote1]:[Vote3]],
          weight,
          tblAvailable[AVAILABLE]
      )

    which calls

    /*  "FUNCTION NAME:  AggregateVotesλ"
        "DESCRIPTION:    Aggregates all votes to candidate colors" */
    /*  "REVISIONS:      Date            Developer           Description"
        "                08 Sep 2024     Peter Bartholomew   Original Development"
    */
    AggregateVotesλ
    = LAMBDA(voterID, votesCast, weight, available,
         LET(
            person, TOROW(UNIQUE(voterID)),
            votes,  MAPλ(person, AssignVotesλ(voterID, votesCast, weight, available)),
            count,  HSTACK(available, BYROW(votes, SUM)),
            return, TAKE(SORT(count, 2, -1), 4),
            return
         )
      );

    to build and aggregate the table of votes.  This, in turn, calls another Lambda function to clean and validate the votes corresponding to a single voterID

    /*  "FUNCTION NAME:  AssignVotesλ"
        "DESCRIPTION:   Validates and assigns votes to candidate colors for a single voterID " */
    /*  "REVISIONS:      Date            Developer           Description"
        "                08 Sep 2024     Peter Bartholomew   Original Development"
    */
    AssignVotesλ
    = LAMBDA(voterID, ballots, weight, available,
        LAMBDA(p,
            LET(
                // "In the case of duplicate ballots return first"
                votesCast, XLOOKUP(p, voterID, ballots),
                // "Remove invalid and repeated votes"
                valid,     COUNTIFS(available, votesCast),
                distinct,  EXPAND(UNIQUE(FILTER(votesCast, valid), TRUE), , 3, ""),
                // "Assign votes to candidate colours"
                assigned,  BYROW((distinct = available) * weight, SUM),
                assigned
            )
        )
    );

    I took the problem from Wyn's recording

    https://www.youtube.com/watch?v=FIcxyLqzWcE

     

25 Replies

  • heyarray's avatar
    heyarray
    Copper Contributor

    Has anyone figured out how to do a variation that has a different shape?

     

  • Patrick2788's avatar
    Patrick2788
    Silver Contributor

    PeterBartholomew1 

    I hope the Excel team takes note of your terrific work here in working around the array of arrays limitations (and doing it very efficiently calculation wise).  I'm still thinking of all the possibilities of where I might use Mapλ.  It certainly broadens the horizons!

     

    Re: Kronecker product

    It seems silly we must jump through hoops to obtain a Kronecker product when Python and numpy can do it out of the box:

    a = xl("firstArray")
    b = xl("secondArray")
    
    result = np.kron(a,b)

       

  • Patrick2788's avatar
    Patrick2788
    Silver Contributor

    PeterBartholomew1 

    I'm having some fun with this function! A simple text to columns split with a formula. I know it's easy to do with the feature text to columns or PQ but it's more fun with formulas where I don't need to cobble together a solution with MAKEARRAY or resort to mischief with REDUCE. It's also much faster than a recursive function I built a while ago.

     

    =LET(
        split, LAMBDA(texts, TEXTSPLIT(texts, ", ")),
        MAPλ(attributes, split)
    )

     

     

    • SergeiBaklan's avatar
      SergeiBaklan
      Diamond Contributor

      Patrick2788 

      Sorry for off topic, tried your sample with Office Script

      function main(workbook: ExcelScript.Workbook) {
      
          let width = 0; 
          const attributes = workbook.getActiveWorksheet()
              .getRange("attributes")
          const texts = attributes.getValues()
          const targetCell = attributes.getCell(0,1)
      
          texts.map(x => {
              let i = x.toString().match(/,/g).length;
              width = i > width ? i : width
          })
      
          const textsItems = texts.map( x => {
              let y = x.toString().split(",");
              y[width+1] = "";
              return y
          }) 
      
          const target = targetCell
              .getResizedRange(textsItems.length - 1, width + 1)
          const format: ExcelScript.RangeFormat = target.getFormat()
      
          target.setValues(textsItems)
          format.autofitColumns()
          format.setVerticalAlignment(ExcelScript.VerticalAlignment.center)
      
      }
    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      Patrick2788 

      Thanks for the demonstration!   It achieves some of the goals I had in mind brilliantly.  It simultaneously demonstrates both the power of the function against a sizable problem and confirms its usability in that all the 'clever' machinations are discretely out of sight.

       

      By using a defined name to hold your Lambda function, the worksheet formula can be made to appear even simpler

      "Worksheet formula"
      = MAPλ(attributes, SPLITCSV);
      
      "where"
      SPLITCSV
      = LAMBDA(text, TEXTSPLIT(text, ", ", CHAR(10)))

       If only it worked that way with the native Excel function!

  • PeterBartholomew1 

    I have extended the function MAPλ I posted on GitHub 

    A version of Excel MAP helper function that will return an array of arrays (github.com)

    to accept up to 3 arrays before the user's Lambda function.  Before I had only really used it with one array.  I show an application with the 3 arrays in

    Loop through table column with spill formula - Microsoft Community Hub

    My worksheet formula was

    = MAPλ(Table2[Begin], Table2[Eind], Table2[Hours], UurPerDagλ)

    Other posts demonstrated that REDUCE/VSTACK works perfectly satisfactorily for such problems but they require the use to have some understanding of Thunks whereas MAPλ hides such complexities from the user.

     

  • This is a demonstration of the proposed MAP function used to solve a challenge provided by Oz_Sunray and Wyn Hopkins on data cleansing and vote counting.  The key element of my solution is to build a table of validated votes for each voter ID and each candidate (i.e. an array of arrays - much the same as any other problem of interest).  The formula is modularised using Lambda functions.  They are not that pretty, but neither are they impossible to follow.

    "Worksheet formula"
    = AggregateVotesλ(
          tblVOTES[VoterID],
          tblVOTES[[Vote1]:[Vote3]],
          weight,
          tblAvailable[AVAILABLE]
      )

    which calls

    /*  "FUNCTION NAME:  AggregateVotesλ"
        "DESCRIPTION:    Aggregates all votes to candidate colors" */
    /*  "REVISIONS:      Date            Developer           Description"
        "                08 Sep 2024     Peter Bartholomew   Original Development"
    */
    AggregateVotesλ
    = LAMBDA(voterID, votesCast, weight, available,
         LET(
            person, TOROW(UNIQUE(voterID)),
            votes,  MAPλ(person, AssignVotesλ(voterID, votesCast, weight, available)),
            count,  HSTACK(available, BYROW(votes, SUM)),
            return, TAKE(SORT(count, 2, -1), 4),
            return
         )
      );

    to build and aggregate the table of votes.  This, in turn, calls another Lambda function to clean and validate the votes corresponding to a single voterID

    /*  "FUNCTION NAME:  AssignVotesλ"
        "DESCRIPTION:   Validates and assigns votes to candidate colors for a single voterID " */
    /*  "REVISIONS:      Date            Developer           Description"
        "                08 Sep 2024     Peter Bartholomew   Original Development"
    */
    AssignVotesλ
    = LAMBDA(voterID, ballots, weight, available,
        LAMBDA(p,
            LET(
                // "In the case of duplicate ballots return first"
                votesCast, XLOOKUP(p, voterID, ballots),
                // "Remove invalid and repeated votes"
                valid,     COUNTIFS(available, votesCast),
                distinct,  EXPAND(UNIQUE(FILTER(votesCast, valid), TRUE), , 3, ""),
                // "Assign votes to candidate colours"
                assigned,  BYROW((distinct = available) * weight, SUM),
                assigned
            )
        )
    );

    I took the problem from Wyn's recording

    https://www.youtube.com/watch?v=FIcxyLqzWcE

     

    • Patrick2788's avatar
      Patrick2788
      Silver Contributor

      Finally getting to this one!

      In re: Wyn's challenge.

      Here's my take:

      =LET(
          flat, UnPivotMλ(
              tblVOTES[VoterID],
              weight,
              tblVOTES[[Vote1]:[Vote3]]
          ),
          colors, TAKE(UNIQUE(flat), , -1),
          agg, SORT(GROUPBY(colors, colors, COUNTA, 0, 0), 2, -1),
          TAKE(agg, 4)
      )
      
      //UnPivotMλ being:
      
      UnPivotMλ = LAMBDA(rows, columns, values,
              LET(
                  i, ROWS(rows),
                  j, COLUMNS(rows),
                  c, COLUMNS(columns),
                  s, TOCOL(SEQUENCE(, j) * SEQUENCE(c, , 1, 0)),
                  row_labels, WRAPROWS(TOCOL(CHOOSECOLS(rows, s)), j),
                  attribute, TOCOL(CHOOSEROWS(columns, SEQUENCE(i, , 1, 0))),
                  v, TOCOL(values),
                  HSTACK(row_labels, attribute, v)
              )
          )

       

    • djclements's avatar
      djclements
      Bronze Contributor

      PeterBartholomew1 Once again, great job on producing a generalized solution to the nested arrays issue. Hopefully Microsoft will take note of your efforts and come up with a built-in solution in the future.

       

      Your reference to Oz_Sunray's live challenge caught my attention, as I too had come across that video in my YouTube feed. A lambda-free variant to avoid the array of arrays issue for that particular challenge could be:

       

      =LET(
          names, UNIQUE(Votes[Ballot]),
          colors, INDEX(Votes, XMATCH(names, Votes[Ballot]), {2,3,4}),
          names_colors, TOCOL(names & "|" & colors),
          keys, UNIQUE(names_colors),
          weight, XLOOKUP(keys, names_colors, TOCOL(IF({1}, {3,2,1}, colors))),
          TAKE(SORT(HSTACK(Available, MMULT(N(Available = TOROW(TEXTAFTER(keys, "|"))), weight)), 2, -1), 4)
      )

       

      Note: the weight array can be adjusted as desired (e.g. {1,1,1} instead of {3,2,1}).

       

      This is one situation where INDEX / XMATCH is preferred over XLOOKUP, as it is perfectly capable of returning multiple columns of data for multiple lookup values, whereas XLOOKUP is not.

       

      While I typically seek out alternative methods to overcome the nested arrays issue, complex scenarios can quickly become convoluted, making it difficult for the average user to follow. Your MAPλ function seems to take care of that nicely. Thank you for sharing!

      • PeterBartholomew1's avatar
        PeterBartholomew1
        Silver Contributor

        djclements 

        Whilst we are chatting, you mention a 'Lambda free' variant.  In general terms I am, at present' travelling in the opposite direction; that is trying to turn every formula into a Lambda.  The reasoning goes, a basic formula exposes the functions it uses and, scattered through, the references it makes to data.  Conversely, a named Lambda conceals the 'how' but describes the task it is performing and lists its precedents.  The 'how' is concealed but it is available by listing the function.

  • Patrick2788's avatar
    Patrick2788
    Silver Contributor
    I've added a like and a note that I may have more to say once I've had time to look this one over. This looks like a big meal!
    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      I have posted a further application of the MAPλ using the OP's (joelb95) workbook.  In the workbook I have applied the home-rolled helper function both to the task of maintaining multiple running totals and to a FIFO calculation of the same 'Lots'.

      Re: Your thoughts serializing arrays to manage state in excel (by way of inventory tracking) - Microsoft Community Hub

      The FIFO calculation filters the event table by each 'Lot' in turn but MAPλ stack the tables as an array.

      As an alternative to the array of tables the user can select an array of grouped tables that focus on the outputs along with associated input costs.

       

      A key element of both MAPλ and the FIFOλ functions is that they are much easier to use than they were to write.  But then, that is the whole idea of using modern methods and LAMBDA in particular!

    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      Patrick2788 

      Thanks for the preliminary feedback.  You are correct; it would be a large and somewhat indigestible meal taken at one sitting!  Its saving grace might be that it is possible to pick the small appetiser off the top without getting involved with anything too rich.

       

      The Lambda helper function itself is intended to hide all its complexity from the user.  I wrote it as a stop-gap until such time as Microsoft get their act together and write versions of their helper functions that will generate arrays of arrays rather than error messages.  I understand Google Sheets does not have this limitation.

       

      My idea in publishing the MAPλ helper function at this stage is to get others to try it.  All that is required is to write a function that generates an array and then use MAP to specify a repeated structure that will fail with the nested array error.  Then bring the MAPλ helper function into the AFE from GitHub and use IntelliSense autocomplete to select MAPλ in place of MAP.  Everything should just work with no further effort on behalf of the user.

       

      These days most of the formulas I write today generate arrays of arrays in order to make best use of the spreadsheet grid.  Sometimes it is possible by using broadcasting but, as soon as a helper function is involved, things start to fall apart.  REDUCE/_STACK offers an understandable solution for smaller problems but it runs out of control as the problem size increases.

       

      If you to try the function, let me know how you get on.  Did it deliver what you require and did it appear to be efficient?

      • Patrick2788's avatar
        Patrick2788
        Silver Contributor

        PeterBartholomew1 

        I'm having fun with this function.

         

         

        =LET(
            first, SEQUENCE(3, 3),
            second, SEQUENCE(10000, 12),
            MAPλ(first, LAMBDA(v, v * second))
        )

         

        For the amount of lifting I'm asking it to do the calculation speed is impressive. I'm using 32 bit Excel at work and the average speed of a fullcalc is about 1.9 seconds. Not bad at all!

         

Resources