Forum Discussion
A generalised Lambda helper function that return arrays of arrays using bisection.
The same challenge but a somewhat different solution. Instead of selecting thunks to combine using INDEX with arguments generated by the binary tree I used array shaping functions WRAPROWS and TAKE to pair up thunks for combining. It still uses a binary tree but REDUCE is used to work through the levels until the entire solution is held as a single array.
This is the main lambda function BYROWλ (made a bit more verbose by the change control block and runtime documentation)
/* FUNCTION NAME: BYROWλ
DESCRIPTION: Implements a version of BYROW that will return an array of arrays */
/* REVISIONS: Date Developer Description
09 May 2024 Peter Bartholomew Original Developemnt
*/
BYROWλ = LAMBDA(
// Parameter Declarations
array,fnλ,
// Procedure
LET(help, TRIM(TEXTSPLIT(
"DESCRIPTION: →Implements a version of BYROW that will return an array of arrays.¶" &
"VERSION: →09 May 2004 ¶" &
"PARAMETERS: →¶" &
" array →(Required) A two dimensional array of values. ¶" &
" fnλ →(Required) A lambda function that accepts a row array as input.",
"→", "¶" )
),
// Check inputs
array, IF(OR(ISOMITTED(array), array=""), #Value!, array),
fnλ, IF(OR(ISOMITTED(fnλ), TYPE(fnλ)<>128), #Value!, fnλ),
// Procedure
n, ROWS(array),
// Implements BYROW returning thunked results at each step
thunkArrayϑ, BYROW(array, LAMBDA(row, THUNK(fnλ(row)))),
// Recombine pairs of thunks as a binary tree until the root node is reached
k, SEQUENCE(LEN(BASE(n - 1, 2))),
recombinedϑ, @REDUCE(thunkArrayϑ, k, JOINPAIRSλ),
result, TAKE(recombinedϑ(), n),
// Handle Error
error, MAX(IsError(result)+1),
// Return result
CHOOSE(error, result, Help)
)
);
It calls a further function to perform the pairwise stacking of the contents of thunks.
/* FUNCTION NAME: JOINPAIRSλ
DESCRIPTION: Called by BYROW to stack the contents of thunks pairwise */
/* REVISIONS: Date Developer Description
09 May 2024 Peter Bartholomew Original Developemnt
*/
JOINPAIRSλ
= LAMBDA(thunkArray, [k],
LET(
alternate, WRAPROWS(thunkArray, 2, thunk("")),
firstpart, TAKE(alternate, , 1),
finalpart, TAKE(alternate, , -1),
MAP(
firstpart,
finalpart,
LAMBDA(ϑ₁, ϑ₂, LAMBDA(VSTACK((@ϑ₁)(), (@ϑ₂)())))
)
)
);
The tricky part was ensuring that the single thunks did not return type 64 (array) but were coerced to type 128 (lambda function).
I have tested the functions up to 8192 rows and, although the performance wasn't sparkling, it only took a few seconds.
Given we now have a few different implementations of BYROW available that will support 2D-arrays of results, I thought it might be of interest to compare some of these approaches by consolidating into a gist: byrow · GitHub
In some basic tests using an input array SEQUENCE(10000,10) and sorting columns in descending order, results suggested recursive bisection was fastest (0.3s) followed by 3-level nested REDUCE/VSTACK (0.5s), and the BYROW/Thunk method - which I think is quite similar to your approach above - was not far behind (1.1s).
As an alternative to thunks, one might also consider 'records' or 'tuples' that are essentially indexed lambdas that may contain any type of data including arrays and lambdas. For example a Fibonacci sequence implementation might be,
=BYROW(
0,
SCAN(
LAMBDA(i, i),
SEQUENCE(1000),
LAMBDA(a, i, LET(x, a(0), y, a(1), LAMBDA(i, x * i + y)))
)
)
I believe building an indexed lambda as accumulator in this way could be one way to extend the REDUCE/VSTACK beyond three levels - though admittedly this has been quite a struggle so far!
- joelb95May 22, 2024Brass Contributor
And in case anyone wants a bit more detail on the background tasks - here is an implementation of the main stuffing/encapsulation logic and a demonstration of stacking arrays of arrays with padding.
// stuffing for dynamic number of elements data passing stuff = lambda(array, lambda(x, choose(x, array))); unstuff = lambda(stuffed_array, stuffed_array(1)); //encap for static number of elements parameter passing encap = lambda( num_elements, p_1, [p_2], [p_3], [p_4], [p_5], [p_6], [p_7], [p_8], [p_9], [p_10], [p_11], [p_12], [p_13], [p_14], [p_15], [p_16], [p_17], [p_18], [p_19], [p_20], [p_21], [p_22], [p_23], [p_24], [p_25], [p_26], [p_27], [p_28], [p_29], [p_30], [p_31], [p_32], [p_33], [p_34], [p_35], [p_36], [p_37], [p_38], [p_39], [p_40], lambda( choice, choose( choice+1, num_elements, stuff(p_1), stuff(p_2), stuff(p_3), stuff(p_4), stuff(p_5), stuff(p_6), stuff(p_7), stuff(p_8), stuff(p_9), stuff(p_10), stuff(p_11), stuff(p_12), stuff(p_13), stuff(p_14), stuff(p_15), stuff(p_16), stuff(p_17), stuff(p_18), stuff(p_19), stuff(p_20), stuff(p_21), stuff(p_22), stuff(p_23), stuff(p_24), stuff(p_25), stuff(p_26), stuff(p_27), stuff(p_28), stuff(p_29), stuff(p_30), stuff(p_31), stuff(p_32), stuff(p_33), stuff(p_34), stuff(p_35), stuff(p_36), stuff(p_37), stuff(p_38), stuff(p_39), stuff(p_40) ) ) ); // for vstacking or hstacking arrays that are contained in a unidimensional array (unitary cell, single row, or single column). Variations for encapsulated rows/columns also exist. colStuffedElements.unstuffAndStack = lambda( encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse, let( col_stuffed_elements, encap.unstuffFirstElement(encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse), padding_col_or_row, encap.unstuffSecondElement(encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse), hstack_bool, encap.unstuffThirdElement(encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse), if( hstack_bool, colStuffedElements.unstuffAndHSTACK(col_stuffed_elements,padding_col_or_row), colStuffedElements.unstuffAndVSTACK(col_stuffed_elements,padding_col_or_row) ) ) ); colStuffedElements.unstuffAndHSTACK = lambda( colOfStuffedElements, [seperator_cols], let( seperator_pad, EXPAND("", , seperator_cols, ""), pad_original, IF( or(isomitted(seperator_cols), seperator_cols = 0), LAMBDA(padded_orig, expandTo, padded_orig), LAMBDA(padded_orig, expandTo, HSTACK(padded_orig, EXPAND(seperator_pad, expandTo, , ""))) ), result, DROP( REDUCE( "", SEQUENCE(rows(colOfStuffedElements)), LAMBDA(acc, cur, LET( original, acc, new, unstuff(index(colOfStuffedElements,cur,1)), expand_to, MAX(ROWS(original), ROWS(new)), expanded_original, EXPAND(original, expand_to, , ""), padded_orig, pad_original(expanded_original, expand_to), HSTACK(padded_orig, EXPAND(new, expand_to, , "")) ) ) ), 0,(1+seperator_cols) ), result ) ); colStuffedElements.unstuffAndVSTACK = lambda( colOfStuffedElements, [seperator_rows], let( seperator_pad, EXPAND("", seperator_rows, , ""), pad_original, IF( or(isomitted(seperator_rows), seperator_rows = 0), LAMBDA(padded_orig, expandTo, padded_orig), LAMBDA(padded_orig, expandTo, VSTACK(padded_orig, EXPAND(seperator_pad, , expandTo, ""))) ), result, DROP( REDUCE( "", SEQUENCE(ROWS(colOfStuffedElements)), LAMBDA(acc, cur, LET( original, acc, new, unstuff(index(colOfStuffedElements,cur,1)), expand_to, MAX(COLUMNS(original), COLUMNS(new)), expanded_original, EXPAND(original, , expand_to, ""), padded_orig, pad_original(expanded_original, expand_to), VSTACK(padded_orig, EXPAND(new, , expand_to, "")) ) ) ), (1 + seperator_rows) ), result ) );
- joelb95May 22, 2024Brass Contributor
I am working on a different problem at the moment, but here is my general approach to some array manipulation via loops. There are some background functions needed to make this work, but the overall theory/structure should be there to follow. The parameterization of arguments isn't necessary, it is just a consequence of my overall design pattern.
/ loops forList = lambda( forList_args, let( stack_bool, encap.unstuffSelected(forList_args,3), stuffed, forList.stuff(forList_args), result, if( stack_bool, colStuffedElements.unstuffAndStack( encapthree( stuffed, encap.unstuffSelected(forList_args,5), encap.unstuffSelected(forList_args,4) ) ), stuffed ), result ) ); forList.args = lambda( array, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], let( adj_array, take(ifs(arr.isColLike(array), array, arr.isSingleRow(array), transpose(array),1,index("Invalid array provided to forList.args.",1,1)),,1), stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, adj_array, function, stack_bool, hstack_bool, padding_rows_cols) ) ); forList.stuff = lambda( forList_args, let( array, encap.unstuffSelected(forList_args,1), function, encap.unstuffSelected(forList_args,2), result, map(array, lambda(ele, stuff(function(ele)))), result ) ); forRange = lambda( forRange_args, forList(forRange_args) ); forRange.args = lambda( no_of_loops, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], [start_no_df_1], let( stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), start_no, if(isomitted(start_no_df_1), 1, start_no_df_1), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, sequence(no_of_loops,1,start_no), function, stack_bool, hstack_bool, padding_rows_cols) ) ); forRows = lambda( forRows_args, let( stack_bool, encap.unstuffSelected(forRows_args,3), stuffed, forRows.stuff(forRows_args), result, if( stack_bool, colStuffedElements.unstuffAndStack( encapthree( stuffed, encap.unstuffSelected(forRows_args,5), encap.unstuffSelected(forRows_args,4) ) ), stuffed ), result ) ); forCols = lambda( forCols_args, let( stack_bool, encap.unstuffSelected(forCols_args,3), stuffed, transpose(forCols.stuff(forCols_args)), result, if( stack_bool, colStuffedElements.unstuffAndStack( encapthree( stuffed, encap.unstuffSelected(forCols_args,5), encap.unstuffSelected(forCols_args,4) ) ), stuffed ), result ) ); forRows.args = lambda( array, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], let( stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, array, function, stack_bool, hstack_bool, padding_rows_cols) ) ); forRows.stuff = lambda( forRows_args, let( array, encap.unstuffSelected(forRows_args,1), function, encap.unstuffSelected(forRows_args,2), result, byrow(array, lambda(row, stuff(function(row)))), result ) ); forCols.args = lambda( array, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], let( stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, array, function, stack_bool, hstack_bool, padding_rows_cols) ) ); forCols.stuff = lambda( forCols_args, let( array, encap.unstuffSelected(forCols_args,1), function, encap.unstuffSelected(forCols_args,2), result, bycol(array, lambda(col, stuff(function(col)))), result ) );
- lori_mMay 22, 2024Iron Contributor
Great news about new regex functions! I had not been aware of this development. And I like the use of shaping functions with arrays of thunks, something along these lines was included in one of the functions in the gist.
I hope to be able to revert regarding the other topics raised but briefly for now relating to How/Why of lambda constructions,
How?
Consider the following two implementations where parameters may be values, arrays, lambdas, etc.
Thunk = LAMBDA(HSTACK(a,b,c,d)) Tuple = LAMBDA(i,CHOOSE(i,a,b,c,d))
In the Fibonacci example we have a 2-tuple or pair (x, y) that maps to (y, x+y) with initial value (0,1). The mapping is represented as i*x+y (i=0,1) or equivalently IF(i, x+y, y) with initial values x=0, y=1.
Why?
Both constructions can be nested to work around array of arrays. I've got less experience with thunks than you but my impressions are both have pros/cons including,
Thunks
- Easily modified / reshaped by updating the array and wrapping in lambda
- Extendable to longer lists via stacking functions.Tuples/records:
- Arrays stored individually e.g. an accumulator can be built containing several variables.
- Efficient retrieval of data since IF/CHOOSE only evaluate selected arguments (versus indexing an arrays of thunks which can lead to inefficiencies)Perhaps one could make a similar distinction between lists (mutable) vs tuples (immutable) in some other languages. I guess that other link you mention might be related though the terminology looks non-standard - at least to me.
- SergeiBaklanMay 22, 2024Diamond Contributor
Thank you for sharing. Yes, it's interesting to play with RegEx, but that's only to add some experience. Without day-to-day practice it's bit hard to work with it. By the way, in rows 258 and 364 parsing is not correct.
Your BYROW/thunks is much more interesting.
Anyway, need some time to play with the file.
- PeterBartholomew1May 21, 2024Silver Contributor
I have been away from home today so have yet to really get to grips with the concepts you are following. Firstly 'how?' and then 'why?'
Similarly there are joelb95 's ideas on 'stuffing' which I need to get to grips with.
Thunking vs. Stuffing - Is there a difference? - Microsoft Community Hub
Meanwhile I have continued to apply the Lambda functions that extend standard helper functions by automatically modifying the enclosed Lambda functions to return thunks rather than arrays. Once one has the array of thunks, the derivation matters little and I expand it using a binary tree using WRAPCOLS and MAP to bring thunks together pairwise. The idea of using array manipulation functions to operate upon arrays of thunks still seems to be pushing one's luck, but so far, so good!
Anyway JoeMcDaid has recently announced RegEx functionality within Excel so I set out to test it by modifying a version based upon VBA UDFs to implement regular expressions. I had already used thunks to return the entire dataset of Danish street addresses as a single array, so the new version of the workbook served to test the new functionality and the use of thunks [with permission from Hans Knudsen ].
The result is a 6401 x 6 array of text values which looks pretty unremarkable (as it should) until one sees the RegEx function feeding capture groups into a thunk array.
SergeiBaklan The file contains the thunk variant of BYROW but the RegEx might be equally relevant within your broad range of interests.
- PeterBartholomew1May 20, 2024Silver Contributor
Wow! That is pretty contorted! SCAN appears to be building Lambda functions, initialised as a function that returns its argument as its value. Where it goes from there might take me a while, though I can see the Fibonacci recurrence formula.
I too have been playing with the Fibonacci sequence but with the objective of creating a generic wrapper function SCANVλ that will turn an ordinary Lambda function that performs a single step of the calculation into one which will thunk the array result. Since the thunked result initialises the next step, init, too has to be thunked before being passed to the embedded SCAN helper function.
Once one has an array of thunks it no longer matters how it was generated. To work the binary tree recombination step I use WRAPROWS and MAP on the thunk array/s.
Worksheet formula = SCANVλ(init, seq, FIBONACCIλ)
The pairwise combining of thunks is based upon
/* FUNCTION NAME: JOINPAIRSλ DESCRIPTION: Called by BYROW to stack the contents of thunks pairwise */ /* REVISIONS: Date Developer Description 09 May 2024 Peter Bartholomew Original Development 16 May 2024 Peter Bartholomew Test for unpaired thunk */ JOINPAIRSλ = LAMBDA(thunkArray, [k], LET( alternate, WRAPROWS(thunkArray, 2, thunk("")), firstpart, TAKE(alternate, , 1), finalpart, TAKE(alternate, , -1), MAP( firstpart, finalpart, LAMBDA(ϑ₁, ϑ₂, LET( x₁, (@ϑ₁)(), x₂, (@ϑ₂)(), v, IF(@x₂="", x₁, VSTACK(x₁, x₂)), LAMBDA(v) ) ) ) ) );