Forum Discussion
A generalisation of the MAP lambda helper function
Discussion topic. Your thoughts are welcome.
On Saturday I finally bit the bullet and completed a MAPλ Lambda function that generalises the in-built MAP Lambda helper function. As examples, I tried problems of generating the Kronecker product of two matrices and then one of generating variants of an amortisation table.
The original amortisation schedule uses SCAN to calculate closing balances step by step from opening balances. Having returned the closing balances as an array, the principal is inserted at the first element to give opening balances. An array calculation based on the same code is used to return other values of interest using HSTACK.
Following that, I created the array of loan terms {10, 15, 20} (yrs) and used the formula
= MAPλ(variousTerms, AmortisationTableλ(principal, rate, startYear))
to generate
as a single spilt range.
I have posted a copy of MAPλ on GitHub
A version of Excel MAP helper function that will return an array of arrays (github.com)
The intention is that the function can be used without knowing how it works but you are, of course, welcome to try to pick through it.
This is a demonstration of the proposed MAP function used to solve a challenge provided by Oz_Sunray and Wyn Hopkins on data cleansing and vote counting. The key element of my solution is to build a table of validated votes for each voter ID and each candidate (i.e. an array of arrays - much the same as any other problem of interest). The formula is modularised using Lambda functions. They are not that pretty, but neither are they impossible to follow.
"Worksheet formula" = AggregateVotesλ( tblVOTES[VoterID], tblVOTES[[Vote1]:[Vote3]], weight, tblAvailable[AVAILABLE] )
which calls
/* "FUNCTION NAME: AggregateVotesλ" "DESCRIPTION: Aggregates all votes to candidate colors" */ /* "REVISIONS: Date Developer Description" " 08 Sep 2024 Peter Bartholomew Original Development" */ AggregateVotesλ = LAMBDA(voterID, votesCast, weight, available, LET( person, TOROW(UNIQUE(voterID)), votes, MAPλ(person, AssignVotesλ(voterID, votesCast, weight, available)), count, HSTACK(available, BYROW(votes, SUM)), return, TAKE(SORT(count, 2, -1), 4), return ) );
to build and aggregate the table of votes. This, in turn, calls another Lambda function to clean and validate the votes corresponding to a single voterID
/* "FUNCTION NAME: AssignVotesλ" "DESCRIPTION: Validates and assigns votes to candidate colors for a single voterID " */ /* "REVISIONS: Date Developer Description" " 08 Sep 2024 Peter Bartholomew Original Development" */ AssignVotesλ = LAMBDA(voterID, ballots, weight, available, LAMBDA(p, LET( // "In the case of duplicate ballots return first" votesCast, XLOOKUP(p, voterID, ballots), // "Remove invalid and repeated votes" valid, COUNTIFS(available, votesCast), distinct, EXPAND(UNIQUE(FILTER(votesCast, valid), TRUE), , 3, ""), // "Assign votes to candidate colours" assigned, BYROW((distinct = available) * weight, SUM), assigned ) ) );
I took the problem from Wyn's recording
https://www.youtube.com/watch?v=FIcxyLqzWcE
25 Replies
- heyarrayCopper Contributor
Has anyone figured out how to do a variation that has a different shape?
- Patrick2788Silver Contributor
I hope the Excel team takes note of your terrific work here in working around the array of arrays limitations (and doing it very efficiently calculation wise). I'm still thinking of all the possibilities of where I might use Mapλ. It certainly broadens the horizons!
Re: Kronecker product
It seems silly we must jump through hoops to obtain a Kronecker product when Python and numpy can do it out of the box:
a = xl("firstArray") b = xl("secondArray") result = np.kron(a,b)
- Patrick2788Silver Contributor
I'm having some fun with this function! A simple text to columns split with a formula. I know it's easy to do with the feature text to columns or PQ but it's more fun with formulas where I don't need to cobble together a solution with MAKEARRAY or resort to mischief with REDUCE. It's also much faster than a recursive function I built a while ago.
=LET( split, LAMBDA(texts, TEXTSPLIT(texts, ", ")), MAPλ(attributes, split) )
- SergeiBaklanDiamond Contributor
Sorry for off topic, tried your sample with Office Script
function main(workbook: ExcelScript.Workbook) { let width = 0; const attributes = workbook.getActiveWorksheet() .getRange("attributes") const texts = attributes.getValues() const targetCell = attributes.getCell(0,1) texts.map(x => { let i = x.toString().match(/,/g).length; width = i > width ? i : width }) const textsItems = texts.map( x => { let y = x.toString().split(","); y[width+1] = ""; return y }) const target = targetCell .getResizedRange(textsItems.length - 1, width + 1) const format: ExcelScript.RangeFormat = target.getFormat() target.setValues(textsItems) format.autofitColumns() format.setVerticalAlignment(ExcelScript.VerticalAlignment.center) }
- PeterBartholomew1Silver Contributor
Thanks for the demonstration! It achieves some of the goals I had in mind brilliantly. It simultaneously demonstrates both the power of the function against a sizable problem and confirms its usability in that all the 'clever' machinations are discretely out of sight.
By using a defined name to hold your Lambda function, the worksheet formula can be made to appear even simpler
"Worksheet formula" = MAPλ(attributes, SPLITCSV); "where" SPLITCSV = LAMBDA(text, TEXTSPLIT(text, ", ", CHAR(10)))
If only it worked that way with the native Excel function!
- PeterBartholomew1Silver Contributor
I have extended the function MAPλ I posted on GitHub
A version of Excel MAP helper function that will return an array of arrays (github.com)
to accept up to 3 arrays before the user's Lambda function. Before I had only really used it with one array. I show an application with the 3 arrays in
Loop through table column with spill formula - Microsoft Community Hub
My worksheet formula was
= MAPλ(Table2[Begin], Table2[Eind], Table2[Hours], UurPerDagλ)
Other posts demonstrated that REDUCE/VSTACK works perfectly satisfactorily for such problems but they require the use to have some understanding of Thunks whereas MAPλ hides such complexities from the user.
- PeterBartholomew1Silver Contributor
This is a demonstration of the proposed MAP function used to solve a challenge provided by Oz_Sunray and Wyn Hopkins on data cleansing and vote counting. The key element of my solution is to build a table of validated votes for each voter ID and each candidate (i.e. an array of arrays - much the same as any other problem of interest). The formula is modularised using Lambda functions. They are not that pretty, but neither are they impossible to follow.
"Worksheet formula" = AggregateVotesλ( tblVOTES[VoterID], tblVOTES[[Vote1]:[Vote3]], weight, tblAvailable[AVAILABLE] )
which calls
/* "FUNCTION NAME: AggregateVotesλ" "DESCRIPTION: Aggregates all votes to candidate colors" */ /* "REVISIONS: Date Developer Description" " 08 Sep 2024 Peter Bartholomew Original Development" */ AggregateVotesλ = LAMBDA(voterID, votesCast, weight, available, LET( person, TOROW(UNIQUE(voterID)), votes, MAPλ(person, AssignVotesλ(voterID, votesCast, weight, available)), count, HSTACK(available, BYROW(votes, SUM)), return, TAKE(SORT(count, 2, -1), 4), return ) );
to build and aggregate the table of votes. This, in turn, calls another Lambda function to clean and validate the votes corresponding to a single voterID
/* "FUNCTION NAME: AssignVotesλ" "DESCRIPTION: Validates and assigns votes to candidate colors for a single voterID " */ /* "REVISIONS: Date Developer Description" " 08 Sep 2024 Peter Bartholomew Original Development" */ AssignVotesλ = LAMBDA(voterID, ballots, weight, available, LAMBDA(p, LET( // "In the case of duplicate ballots return first" votesCast, XLOOKUP(p, voterID, ballots), // "Remove invalid and repeated votes" valid, COUNTIFS(available, votesCast), distinct, EXPAND(UNIQUE(FILTER(votesCast, valid), TRUE), , 3, ""), // "Assign votes to candidate colours" assigned, BYROW((distinct = available) * weight, SUM), assigned ) ) );
I took the problem from Wyn's recording
https://www.youtube.com/watch?v=FIcxyLqzWcE
- Patrick2788Silver Contributor
Finally getting to this one!
In re: Wyn's challenge.
Here's my take:
=LET( flat, UnPivotMλ( tblVOTES[VoterID], weight, tblVOTES[[Vote1]:[Vote3]] ), colors, TAKE(UNIQUE(flat), , -1), agg, SORT(GROUPBY(colors, colors, COUNTA, 0, 0), 2, -1), TAKE(agg, 4) ) //UnPivotMλ being: UnPivotMλ = LAMBDA(rows, columns, values, LET( i, ROWS(rows), j, COLUMNS(rows), c, COLUMNS(columns), s, TOCOL(SEQUENCE(, j) * SEQUENCE(c, , 1, 0)), row_labels, WRAPROWS(TOCOL(CHOOSECOLS(rows, s)), j), attribute, TOCOL(CHOOSEROWS(columns, SEQUENCE(i, , 1, 0))), v, TOCOL(values), HSTACK(row_labels, attribute, v) ) )
- djclementsBronze Contributor
PeterBartholomew1 Once again, great job on producing a generalized solution to the nested arrays issue. Hopefully Microsoft will take note of your efforts and come up with a built-in solution in the future.
Your reference to Oz_Sunray's live challenge caught my attention, as I too had come across that video in my YouTube feed. A lambda-free variant to avoid the array of arrays issue for that particular challenge could be:
=LET( names, UNIQUE(Votes[Ballot]), colors, INDEX(Votes, XMATCH(names, Votes[Ballot]), {2,3,4}), names_colors, TOCOL(names & "|" & colors), keys, UNIQUE(names_colors), weight, XLOOKUP(keys, names_colors, TOCOL(IF({1}, {3,2,1}, colors))), TAKE(SORT(HSTACK(Available, MMULT(N(Available = TOROW(TEXTAFTER(keys, "|"))), weight)), 2, -1), 4) )
Note: the weight array can be adjusted as desired (e.g. {1,1,1} instead of {3,2,1}).
This is one situation where INDEX / XMATCH is preferred over XLOOKUP, as it is perfectly capable of returning multiple columns of data for multiple lookup values, whereas XLOOKUP is not.
While I typically seek out alternative methods to overcome the nested arrays issue, complex scenarios can quickly become convoluted, making it difficult for the average user to follow. Your MAPλ function seems to take care of that nicely. Thank you for sharing!
- PeterBartholomew1Silver Contributor
Whilst we are chatting, you mention a 'Lambda free' variant. In general terms I am, at present' travelling in the opposite direction; that is trying to turn every formula into a Lambda. The reasoning goes, a basic formula exposes the functions it uses and, scattered through, the references it makes to data. Conversely, a named Lambda conceals the 'how' but describes the task it is performing and lists its precedents. The 'how' is concealed but it is available by listing the function.
- Very cool! 👏🏽👏🏽👏🏽
- Patrick2788Silver ContributorI've added a like and a note that I may have more to say once I've had time to look this one over. This looks like a big meal!
- PeterBartholomew1Silver Contributor
I have posted a further application of the MAPλ using the OP's (joelb95) workbook. In the workbook I have applied the home-rolled helper function both to the task of maintaining multiple running totals and to a FIFO calculation of the same 'Lots'.
The FIFO calculation filters the event table by each 'Lot' in turn but MAPλ stack the tables as an array.
As an alternative to the array of tables the user can select an array of grouped tables that focus on the outputs along with associated input costs.
A key element of both MAPλ and the FIFOλ functions is that they are much easier to use than they were to write. But then, that is the whole idea of using modern methods and LAMBDA in particular!
- PeterBartholomew1Silver Contributor
Thanks for the preliminary feedback. You are correct; it would be a large and somewhat indigestible meal taken at one sitting! Its saving grace might be that it is possible to pick the small appetiser off the top without getting involved with anything too rich.
The Lambda helper function itself is intended to hide all its complexity from the user. I wrote it as a stop-gap until such time as Microsoft get their act together and write versions of their helper functions that will generate arrays of arrays rather than error messages. I understand Google Sheets does not have this limitation.
My idea in publishing the MAPλ helper function at this stage is to get others to try it. All that is required is to write a function that generates an array and then use MAP to specify a repeated structure that will fail with the nested array error. Then bring the MAPλ helper function into the AFE from GitHub and use IntelliSense autocomplete to select MAPλ in place of MAP. Everything should just work with no further effort on behalf of the user.
These days most of the formulas I write today generate arrays of arrays in order to make best use of the spreadsheet grid. Sometimes it is possible by using broadcasting but, as soon as a helper function is involved, things start to fall apart. REDUCE/_STACK offers an understandable solution for smaller problems but it runs out of control as the problem size increases.
If you to try the function, let me know how you get on. Did it deliver what you require and did it appear to be efficient?
- Patrick2788Silver Contributor
I'm having fun with this function.
=LET( first, SEQUENCE(3, 3), second, SEQUENCE(10000, 12), MAPλ(first, LAMBDA(v, v * second)) )
For the amount of lifting I'm asking it to do the calculation speed is impressive. I'm using 32 bit Excel at work and the average speed of a fullcalc is about 1.9 seconds. Not bad at all!