Forum Discussion
A generalisation of the MAP lambda helper function
- Sep 08, 2024
This is a demonstration of the proposed MAP function used to solve a challenge provided by Oz_Sunray and Wyn Hopkins on data cleansing and vote counting. The key element of my solution is to build a table of validated votes for each voter ID and each candidate (i.e. an array of arrays - much the same as any other problem of interest). The formula is modularised using Lambda functions. They are not that pretty, but neither are they impossible to follow.
"Worksheet formula" = AggregateVotesλ( tblVOTES[VoterID], tblVOTES[[Vote1]:[Vote3]], weight, tblAvailable[AVAILABLE] )
which calls
/* "FUNCTION NAME: AggregateVotesλ" "DESCRIPTION: Aggregates all votes to candidate colors" */ /* "REVISIONS: Date Developer Description" " 08 Sep 2024 Peter Bartholomew Original Development" */ AggregateVotesλ = LAMBDA(voterID, votesCast, weight, available, LET( person, TOROW(UNIQUE(voterID)), votes, MAPλ(person, AssignVotesλ(voterID, votesCast, weight, available)), count, HSTACK(available, BYROW(votes, SUM)), return, TAKE(SORT(count, 2, -1), 4), return ) );
to build and aggregate the table of votes. This, in turn, calls another Lambda function to clean and validate the votes corresponding to a single voterID
/* "FUNCTION NAME: AssignVotesλ" "DESCRIPTION: Validates and assigns votes to candidate colors for a single voterID " */ /* "REVISIONS: Date Developer Description" " 08 Sep 2024 Peter Bartholomew Original Development" */ AssignVotesλ = LAMBDA(voterID, ballots, weight, available, LAMBDA(p, LET( // "In the case of duplicate ballots return first" votesCast, XLOOKUP(p, voterID, ballots), // "Remove invalid and repeated votes" valid, COUNTIFS(available, votesCast), distinct, EXPAND(UNIQUE(FILTER(votesCast, valid), TRUE), , 3, ""), // "Assign votes to candidate colours" assigned, BYROW((distinct = available) * weight, SUM), assigned ) ) );
I took the problem from Wyn's recording
https://www.youtube.com/watch?v=FIcxyLqzWcE
I've been experimenting with thunks over the past month or so and have a pretty good handle on it now. However, the binary tree method for unpacking a thunked array of arrays is still a bit of a mind twist for me. The examples that both you and lori_m have shared are some of the best resources I've found on this topic, so thank you for that.
By a "lambda-free" variant, what I really meant to say was a solution that spills naturally via lifting or broadcasting, without having to rely on any of the lambda helper functions (BYROW, MAP, REDUCE, etc.). MMULT was a function that I always struggled with but have a better grasp of now as I've been using it with increased frequency. I'm all for defining custom lambda functions, but typically only do so if they will be called more than once.
On a side note, I'm happy to report that I now have access to both GROUPBY and PIVOTBY (after applying the September 10th update on the Current Channel), as well as eta-lambda for all of the lambda helper functions. Still waiting on the new regex functions and Python, but they're not too high on my wish list. I'm more excited about the upcoming TRIMRANGE function and Trim Refs, which will make it much easier to dynamically reference a data input range without having to use structured tables.
As always, your contributions to the community are invaluable, and I look forward to seeing what you come up with next. Cheers!
What I have been working on is a two way version of SCAN that accumulates across from the left as well as downwards from the top (initialisation) row. I show the function being used to evaluate combinations but that is not its intended use.
It is continually switching arrays into functions (thunks) and back. It is basically a top-to-bottom array SCAN but the function evaluated at each step is itself a SCAN across with a two cell range.
I have also used the function SCANARRλ to calculate the Levenshtein distance between two strings with the user provided function
I will see what further progress I can make.
I have attached a file to show the work-in-progress. 25/09/2024
- djclementsSep 21, 2024Bronze Contributor
PeterBartholomew1 Additional feedback for MAPλ:
An interesting little tidbit I noticed when trying a basic Kronecker product is that it doesn't seem to work when using a range reference directly within the LAMBDA function. For example, this returns all zeros...
=MAPλ(A1:C2, LAMBDA(v, v * E1:H4))
MAPλ with direct range reference
However, this works as expected...
=LET( rng, E1:H4, MAPλ(A1:C2, LAMBDA(v, v * rng)) )
MAPλ with defined range reference
Was this intentional? Or is it a bug?
- PeterBartholomew1Sep 23, 2024Silver Contributor
Its not intentional. It's not something I would have noticed because it is 8 years since I last used a direct cell reference (other than to define a name). The error doesn't even seem to be consistent; sometimes it works, sometimes it doesn't.
- SergeiBaklanSep 23, 2024Diamond Contributor
It works correctly if we use absolute reference
=MAPλ(A1:C3, LAMBDA(v, v * $E$1:$H$4 ) )
If in function is relative address reference, even partly, result very depends on at which place formula is entered and are there any data around. With each iteration function shifts address for the range within lambda.