Forum Discussion
Double Thunking Works Wonders!
This is an excellent function and a very efficient use of dynamic programming. It's interesting that, even though our approaches to this task differ, there are still similarities in how we each deliver a solution. Your solution is impressively efficient.
At present, when I approach a task in Excel, I go through the options available to solve the problem: dynamic programming, recursion, thunks, and a combination of the above. Dynamic programming still rules the day.
Excel's weakness is still its lack of support for nested arrays, underscored by the inflexibility of the Lambda helper functions. The two options for working around this are MAKEARRAY and REDUCE. The downside of the former is needing to know the dimensions going in and having to 'check' each element in the return matrix; the downside of reduction, of course, is the need to repeatedly stack the accumulated results, as in the sketch below.
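A minimal sketch of that REDUCE pattern (the seed value and the ragged SEQUENCE rows are mine, purely for illustration; note VSTACK pads the short rows with #N/A):

=LET(
    stacked, REDUCE(0, SEQUENCE(4),
        LAMBDA(acc, n, VSTACK(acc, SEQUENCE(1, n, n, 0)))),
    DROP(stacked, 1)    // discard the seed row once accumulation is done
)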
This is another 3-course meal! This will keep me busy for a while. I like how this builds on the existing functions in your module. The average user needn't see any of this programming but can do some pretty slick things, with the complexity hidden away.
The process is reasonably efficient, but I would prefer that Microsoft implement a STACK function that expands any 2D array of thunks; no calculation I could perform would match a native function in compiled code.
This is a fascinating idea and probably the next best thing to full support of nested arrays. Microsoft would need to officially recognize thunks for this to happen. Might as well also make THUNK a native function!
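For anyone following the thread who hasn't met the idiom: a thunk in Excel is just a zero-argument LAMBDA closed over a value, and calling it with () dereferences it. A minimal sketch (the name Thunkλ is mine):

// Wrap any value, arrays included, so it can sit inside another array.
Thunkλ = LAMBDA(x, LAMBDA(x));

// Dereference by calling with ():
= LET(t, Thunkλ(SEQUENCE(3, 3)), t())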
re: flattening a 'list' or 'nested arrays'
Python does this very simply through NumPy's vstack (and quite a few other functions, from what I gather):
Code written by a total novice!
import itertools as it
import numpy as np

# Set the list of items to choose from and how many to take.
# xl() reads values from the grid (Python in Excel).
items = np.array(xl("Items")).tolist()
k = xl("pick")

# Generate a list of all possible combinations (with replacement).
combinations = list(it.combinations_with_replacement(items, k))

# Stack the tuples into a 2D array, one k-wide row per combination, to present in the sheet.
np.vstack(combinations).reshape(-1, k)
LAMBDA(
    outer_address,
    row_of_thunks,
    LAMBDA(
        inner_address,
        INDEX(
            INDEX(
                row_of_thunks,
                1,
                outer_address
            )(),
            1,
            inner_address
        )
    )
)
In this way, thunks just become 2D buckets that could hold anything at all, and arrays are just the scaffolding for a basic thunk arrangement. Depending on how you choose to write your functions, you can have each thunk in an array represent a different state you want to persist and pass it easily from one name scope to another.
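A hypothetical usage sketch of that accessor (the MAP-generated row of thunks and its contents are mine, just to show the double dereference):

=LET(
    row_of_thunks, MAP(SEQUENCE(1, 3), LAMBDA(n, LAMBDA(SEQUENCE(1, 4, n, n)))),
    getλ, LAMBDA(outer_address, row_of_thunks, LAMBDA(inner_address,
        INDEX(INDEX(row_of_thunks, 1, outer_address)(), 1, inner_address))),
    getλ(2, row_of_thunks)(3)
)
// Thunk 2 holds {2,4,6,8}; element 3 of that is 6.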
I've also been thinking a bit about getting around that downside by just adding everything to a single 2D array, with more explicit contained-array tracking in the first few columns. That way you can give a particular array a variable name and have it extracted from the store as you require (see the sketch below). Outside of the inherent limits of accessing an array and deconstructing it using TAKE/DROP/CHOOSECOLS, etc., there is no overhead to maintaining the array and no stacking of functions and dependency chains for the data.
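A hypothetical sketch of that single-array approach (the layout and names are mine): column 1 tags which logical array each row belongs to, and a small accessor peels one array back out.

=LET(
    store, VSTACK(
        HSTACK(SEQUENCE(3, 1, 1, 0), SEQUENCE(3, 2)),       // 3x2 array tagged id 1
        HSTACK(SEQUENCE(2, 1, 2, 0), SEQUENCE(2, 2, 100))   // 2x2 array tagged id 2
    ),
    getλ, LAMBDA(id, DROP(FILTER(store, CHOOSECOLS(store, 1) = id), , 1)),
    getλ(2)
)
// Returns the array tagged id 2, i.e. {100,101;102,103}.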
- Patrick2788 · Dec 03, 2024 · Silver Contributor
After some more testing it appears 16,380 is indeed the new operand stack limit.
To be precise:
16,380 / (Total λ param + Total LET param + 1)
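Plugging in the three-parameter TestStack from the Nov 23 test below, with no LET variables: 16,380 / (3 + 0 + 1) = 4,095, which sits within one call of the n = 4,094 ceiling that test found (the difference presumably being the initial invocation).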
- Patrick2788 · Nov 23, 2024 · Silver Contributor
As I was testing each function and going beyond 8 million units, I noticed the recursive solution was hanging in there with Mapλ. Remember this informative article from a few years ago concerning recursion limits?
LAMBDA: Recursive Lists and Understanding Recursion Limits of LAMBDA | Microsoft Community Hub
The limit at the time was: 1,024 / (parameter count + 1)
I understand the Excel team quietly increased this limit (the numerator portion in the formula above) to a little over 2000 a few years ago.
I created a very rudimentary recursive function to test the current limit, gradually adding a parameter and seeing how high it would go:

//Recurse until n reaches 4,094, carrying dummy parameters m and o.
TestStack = LAMBDA(n, m, o, IF(n = 4094, n, TestStack(n + 1, m, o)));
The results of the test:
- PeterBartholomew1 · Nov 23, 2024 · Silver Contributor
Those are very neat formulas. My recursion thinking largely stopped when the Lambda helper functions were released. Recursion is now the last-chance saloon before giving up, reserved for when I really do not know whether the number of recursive loops (is that an oxymoron?) could be 1 or 1,000.
The danger with recursive stacking can be the amount of memory temporarily committed to the calculation, all of which needs to be managed by the system. Naturally I followed your speculation and tried MAPλ. I needed to Curry the EXPAND function to allow it to accept the value to be expanded from MAP.
What I finished up with was:

= MAPλ(Letters, Expandλ(25, 25))

// Expands a cell value to cover an (n₁ x n₂) array
Expandλ = LAMBDA(n₁, n₂, LAMBDA(x, EXPAND(x, n₁, n₂, x)));
- Patrick2788 · Nov 22, 2024 · Silver Contributor
That's a fascinating approach for legacy Excel, using CSE for everything! It's interesting how it moved you away from the fill handle; I never thought of that. I can recall spending way too much time with FREQUENCY/MATCH counting unique items and relying heavily on INDEX/SMALL-IF. I don't miss those at all!
In re: pooling layers. I stumbled on an "opposite" operation, interpolation, as used in image processing. In terms of matrices:
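(To make the matrix picture concrete, here is a hypothetical 2x2 example in array-constant notation: with a magnification of 2, nearest-neighbor interpolation maps

{1, 2;      {1, 1, 2, 2;
 3, 4}  →    1, 1, 2, 2;
             3, 3, 4, 4;
             3, 3, 4, 4}

each value becoming a 2x2 block.)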
Seems to be a good opportunity to create a function to do this for text and/or numbers. Mapλ is likely much faster than my method with numbers. The challenge was to use recursion:
//Create square matrices from each value in a vertical array.
SquareMλ = LAMBDA(arr, x,
    LET(
        k, ROWS(arr),
        next, TAKE(arr, 1),
        resized, EXPAND(next, @x, @x, next),
        acc, VSTACK(x, resized),
        discard, DROP(arr, 1),
        final, DROP(acc, 1),
        IF(k = 1, final, SquareMλ(discard, acc))
    )
);

//Nearest-Neighbor Interpolation
Magnifyλ = LAMBDA(matrix, magnification,
    LET(
        next, TAKE(matrix, , 1),
        resized, SquareMλ(next, @magnification),
        acc, HSTACK(magnification, resized),
        discard, DROP(matrix, , 1),
        k, COLUMNS(matrix),
        final, DROP(acc, , 1),
        IF(k = 1, final, Magnifyλ(discard, acc))
    )
);
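A quick check call (my own, using the hypothetical 2x2 example above): =Magnifyλ({1,2;3,4}, 2) returns {1,1,2,2;1,1,2,2;3,3,4,4;3,3,4,4}.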
- PeterBartholomew1 · Nov 22, 2024 · Silver Contributor
On your point concerning CSE formulas, and speaking as a freak user who had schooled himself to enter all formulas using CSE, my opinion is that the prior experience was a help rather than a hindrance in that case. I had broken the 'enter and send down' mentality that approaches lists and arrays as collections of scalar formulas. CSE was turgid though, and dynamic arrays came as a huge relief!
The other trick I used, especially with array formulas that return a single output, was to place formulas within defined names to avoid the implicit intersection that made such a mess of grid-based formulas.
The new methods bring with them a requirement for fresh learning. It is sometimes harder to identify the familiar knowledge that can safely be consigned to the trash-can of history! Anyone want to know how to divide weights expressed as st-lb-oz by 17 or perhaps extract square roots by hand? 😱
- Patrick2788 · Nov 04, 2024 · Silver Contributor
This is interesting because I've been considering doing something like this for a function I'm working on. The function would essentially be XLOOKUP with the capability of returning a matrix. I've previously created a function with some basic recursion to do this, but the new function would take things further, with the goal of optimizing calculation time as much as possible to handle very large sets of data. For this reason the function would default to a binary search, avoiding the "slowness" of a linear search (see the sketch below). There's also the matter of sorting the arrays to accommodate the binary lookup. The function would utilize some recursion and maybe some thunks. A lot of testing would need to be done.
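Not Patrick's planned function, but a minimal recursive binary-search sketch to show the bare mechanics (the name and signature are mine; note that native XLOOKUP already exposes binary modes through its search_mode argument):

// Returns the position of 'key' in the ascending-sorted column 'sorted',
// or #N/A if absent. Depth is log2(n), so the operand stack limit is no concern.
BSearchλ = LAMBDA(sorted, key, [lo], [hi],
    LET(
        l, IF(ISOMITTED(lo), 1, lo),
        h, IF(ISOMITTED(hi), ROWS(sorted), hi),
        IF(l > h, NA(),
            LET(
                m, INT((l + h) / 2),
                v, INDEX(sorted, m),
                IF(v = key, m,
                    IF(v < key,
                        BSearchλ(sorted, key, m + 1, h),
                        BSearchλ(sorted, key, l, m - 1)))))
    )
);

// Example: =BSearchλ(SEQUENCE(100, 1, 2, 2), 42) returns 21.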
I'll need to set aside some time to study your contributions, too. There's a nice variety in authorship styles on this forum and it's interesting to see how others think.
Thank you for sharing those links. I had not heard of Wilkinson before checking them out. He's in rare company among those who have won the Turing Award and/or the von Neumann Prize. Wilkinson's advantage made me wonder whether there is a parallel with Excel: do those who studied CSE arrays and worked with arrays even before the advent of SUMIFS, COUNTIFS, AVERAGEIFS, etc. have an advantage when working with Excel 365? Three years ago I would have said yes, but some of the older methods have become obsolete. Someone new to Excel can jump in with dynamic arrays and not waste a moment on three-finger formulas! I'm beginning to wonder whether even COUNTIF/SUMIF, etc. are still relevant. GROUPBY can handle those tasks, and I find deploying COUNTIF/SUMIF through the Lambda helpers is not usually efficient.