Forum Discussion
A generalised Lambda helper function that return arrays of arrays using bisection.
Given we now have a few different implementations of BYROW available that will support 2D-arrays of results, I thought it might be of interest to compare some of these approaches by consolidating into a gist: byrow · GitHub
In some basic tests using an input array SEQUENCE(10000,10) and sorting columns in descending order, results suggested recursive bisection was fastest (0.3s) followed by 3-level nested REDUCE/VSTACK (0.5s), and the BYROW/Thunk method - which I think is quite similar to your approach above - was not far behind (1.1s).
As an alternative to thunks, one might also consider 'records' or 'tuples' that are essentially indexed lambdas that may contain any type of data including arrays and lambdas. For example a Fibonacci sequence implementation might be,
=BYROW(
0,
SCAN(
LAMBDA(i, i),
SEQUENCE(1000),
LAMBDA(a, i, LET(x, a(0), y, a(1), LAMBDA(i, x * i + y)))
)
)
I believe building an indexed lambda as accumulator in this way could be one way to extend the REDUCE/VSTACK beyond three levels - though admittedly this has been quite a struggle so far!
I have been away from home today so have yet to really get to grips with the concepts you are following. Firstly 'how?' and then 'why?'
Similarly there are joelb95 's ideas on 'stuffing' which I need to get to grips with.
Thunking vs. Stuffing - Is there a difference? - Microsoft Community Hub
Meanwhile I have continued to apply the Lambda functions that extend standard helper functions by automatically modifying the enclosed Lambda functions to return thunks rather than arrays. Once one has the array of thunks, the derivation matters little and I expand it using a binary tree using WRAPCOLS and MAP to bring thunks together pairwise. The idea of using array manipulation functions to operate upon arrays of thunks still seems to be pushing one's luck, but so far, so good!
Anyway JoeMcDaid has recently announced RegEx functionality within Excel so I set out to test it by modifying a version based upon VBA UDFs to implement regular expressions. I had already used thunks to return the entire dataset of Danish street addresses as a single array, so the new version of the workbook served to test the new functionality and the use of thunks [with permission from Hans Knudsen ].
The result is a 6401 x 6 array of text values which looks pretty unremarkable (as it should) until one sees the RegEx function feeding capture groups into a thunk array.
SergeiBaklan The file contains the thunk variant of BYROW but the RegEx might be equally relevant within your broad range of interests.
- joelb95May 22, 2024Brass Contributor
And in case anyone wants a bit more detail on the background tasks - here is an implementation of the main stuffing/encapsulation logic and a demonstration of stacking arrays of arrays with padding.
// stuffing for dynamic number of elements data passing stuff = lambda(array, lambda(x, choose(x, array))); unstuff = lambda(stuffed_array, stuffed_array(1)); //encap for static number of elements parameter passing encap = lambda( num_elements, p_1, [p_2], [p_3], [p_4], [p_5], [p_6], [p_7], [p_8], [p_9], [p_10], [p_11], [p_12], [p_13], [p_14], [p_15], [p_16], [p_17], [p_18], [p_19], [p_20], [p_21], [p_22], [p_23], [p_24], [p_25], [p_26], [p_27], [p_28], [p_29], [p_30], [p_31], [p_32], [p_33], [p_34], [p_35], [p_36], [p_37], [p_38], [p_39], [p_40], lambda( choice, choose( choice+1, num_elements, stuff(p_1), stuff(p_2), stuff(p_3), stuff(p_4), stuff(p_5), stuff(p_6), stuff(p_7), stuff(p_8), stuff(p_9), stuff(p_10), stuff(p_11), stuff(p_12), stuff(p_13), stuff(p_14), stuff(p_15), stuff(p_16), stuff(p_17), stuff(p_18), stuff(p_19), stuff(p_20), stuff(p_21), stuff(p_22), stuff(p_23), stuff(p_24), stuff(p_25), stuff(p_26), stuff(p_27), stuff(p_28), stuff(p_29), stuff(p_30), stuff(p_31), stuff(p_32), stuff(p_33), stuff(p_34), stuff(p_35), stuff(p_36), stuff(p_37), stuff(p_38), stuff(p_39), stuff(p_40) ) ) ); // for vstacking or hstacking arrays that are contained in a unidimensional array (unitary cell, single row, or single column). Variations for encapsulated rows/columns also exist. colStuffedElements.unstuffAndStack = lambda( encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse, let( col_stuffed_elements, encap.unstuffFirstElement(encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse), padding_col_or_row, encap.unstuffSecondElement(encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse), hstack_bool, encap.unstuffThirdElement(encapthree_colStuffedElements_paddingColOrRowInt_hstackBoolDefaultFalse), if( hstack_bool, colStuffedElements.unstuffAndHSTACK(col_stuffed_elements,padding_col_or_row), colStuffedElements.unstuffAndVSTACK(col_stuffed_elements,padding_col_or_row) ) ) ); colStuffedElements.unstuffAndHSTACK = lambda( colOfStuffedElements, [seperator_cols], let( seperator_pad, EXPAND("", , seperator_cols, ""), pad_original, IF( or(isomitted(seperator_cols), seperator_cols = 0), LAMBDA(padded_orig, expandTo, padded_orig), LAMBDA(padded_orig, expandTo, HSTACK(padded_orig, EXPAND(seperator_pad, expandTo, , ""))) ), result, DROP( REDUCE( "", SEQUENCE(rows(colOfStuffedElements)), LAMBDA(acc, cur, LET( original, acc, new, unstuff(index(colOfStuffedElements,cur,1)), expand_to, MAX(ROWS(original), ROWS(new)), expanded_original, EXPAND(original, expand_to, , ""), padded_orig, pad_original(expanded_original, expand_to), HSTACK(padded_orig, EXPAND(new, expand_to, , "")) ) ) ), 0,(1+seperator_cols) ), result ) ); colStuffedElements.unstuffAndVSTACK = lambda( colOfStuffedElements, [seperator_rows], let( seperator_pad, EXPAND("", seperator_rows, , ""), pad_original, IF( or(isomitted(seperator_rows), seperator_rows = 0), LAMBDA(padded_orig, expandTo, padded_orig), LAMBDA(padded_orig, expandTo, VSTACK(padded_orig, EXPAND(seperator_pad, , expandTo, ""))) ), result, DROP( REDUCE( "", SEQUENCE(ROWS(colOfStuffedElements)), LAMBDA(acc, cur, LET( original, acc, new, unstuff(index(colOfStuffedElements,cur,1)), expand_to, MAX(COLUMNS(original), COLUMNS(new)), expanded_original, EXPAND(original, , expand_to, ""), padded_orig, pad_original(expanded_original, expand_to), VSTACK(padded_orig, EXPAND(new, , expand_to, "")) ) ) ), (1 + seperator_rows) ), result ) );
- joelb95May 22, 2024Brass Contributor
I am working on a different problem at the moment, but here is my general approach to some array manipulation via loops. There are some background functions needed to make this work, but the overall theory/structure should be there to follow. The parameterization of arguments isn't necessary, it is just a consequence of my overall design pattern.
/ loops forList = lambda( forList_args, let( stack_bool, encap.unstuffSelected(forList_args,3), stuffed, forList.stuff(forList_args), result, if( stack_bool, colStuffedElements.unstuffAndStack( encapthree( stuffed, encap.unstuffSelected(forList_args,5), encap.unstuffSelected(forList_args,4) ) ), stuffed ), result ) ); forList.args = lambda( array, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], let( adj_array, take(ifs(arr.isColLike(array), array, arr.isSingleRow(array), transpose(array),1,index("Invalid array provided to forList.args.",1,1)),,1), stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, adj_array, function, stack_bool, hstack_bool, padding_rows_cols) ) ); forList.stuff = lambda( forList_args, let( array, encap.unstuffSelected(forList_args,1), function, encap.unstuffSelected(forList_args,2), result, map(array, lambda(ele, stuff(function(ele)))), result ) ); forRange = lambda( forRange_args, forList(forRange_args) ); forRange.args = lambda( no_of_loops, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], [start_no_df_1], let( stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), start_no, if(isomitted(start_no_df_1), 1, start_no_df_1), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, sequence(no_of_loops,1,start_no), function, stack_bool, hstack_bool, padding_rows_cols) ) ); forRows = lambda( forRows_args, let( stack_bool, encap.unstuffSelected(forRows_args,3), stuffed, forRows.stuff(forRows_args), result, if( stack_bool, colStuffedElements.unstuffAndStack( encapthree( stuffed, encap.unstuffSelected(forRows_args,5), encap.unstuffSelected(forRows_args,4) ) ), stuffed ), result ) ); forCols = lambda( forCols_args, let( stack_bool, encap.unstuffSelected(forCols_args,3), stuffed, transpose(forCols.stuff(forCols_args)), result, if( stack_bool, colStuffedElements.unstuffAndStack( encapthree( stuffed, encap.unstuffSelected(forCols_args,5), encap.unstuffSelected(forCols_args,4) ) ), stuffed ), result ) ); forRows.args = lambda( array, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], let( stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, array, function, stack_bool, hstack_bool, padding_rows_cols) ) ); forRows.stuff = lambda( forRows_args, let( array, encap.unstuffSelected(forRows_args,1), function, encap.unstuffSelected(forRows_args,2), result, byrow(array, lambda(row, stuff(function(row)))), result ) ); forCols.args = lambda( array, function, [stack_bool_df_false], [hstack_bool_df_false], [padding_rows_cols_int_df_0], let( stack_bool, if(isomitted(stack_bool_df_false),FALSE,stack_bool_df_false), hstack_bool, if(isomitted(hstack_bool_df_false), FALSE, hstack_bool_df_false), padding_rows_cols, if(isomitted(padding_rows_cols_int_df_0), 0, padding_rows_cols_int_df_0), encap(5, array, function, stack_bool, hstack_bool, padding_rows_cols) ) ); forCols.stuff = lambda( forCols_args, let( array, encap.unstuffSelected(forCols_args,1), function, encap.unstuffSelected(forCols_args,2), result, bycol(array, lambda(col, stuff(function(col)))), result ) );
- lori_mMay 22, 2024Iron Contributor
Great news about new regex functions! I had not been aware of this development. And I like the use of shaping functions with arrays of thunks, something along these lines was included in one of the functions in the gist.
I hope to be able to revert regarding the other topics raised but briefly for now relating to How/Why of lambda constructions,
How?
Consider the following two implementations where parameters may be values, arrays, lambdas, etc.
Thunk = LAMBDA(HSTACK(a,b,c,d)) Tuple = LAMBDA(i,CHOOSE(i,a,b,c,d))
In the Fibonacci example we have a 2-tuple or pair (x, y) that maps to (y, x+y) with initial value (0,1). The mapping is represented as i*x+y (i=0,1) or equivalently IF(i, x+y, y) with initial values x=0, y=1.
Why?
Both constructions can be nested to work around array of arrays. I've got less experience with thunks than you but my impressions are both have pros/cons including,
Thunks
- Easily modified / reshaped by updating the array and wrapping in lambda
- Extendable to longer lists via stacking functions.Tuples/records:
- Arrays stored individually e.g. an accumulator can be built containing several variables.
- Efficient retrieval of data since IF/CHOOSE only evaluate selected arguments (versus indexing an arrays of thunks which can lead to inefficiencies)Perhaps one could make a similar distinction between lists (mutable) vs tuples (immutable) in some other languages. I guess that other link you mention might be related though the terminology looks non-standard - at least to me.
- SergeiBaklanMay 22, 2024Diamond Contributor
Thank you for sharing. Yes, it's interesting to play with RegEx, but that's only to add some experience. Without day-to-day practice it's bit hard to work with it. By the way, in rows 258 and 364 parsing is not correct.
Your BYROW/thunks is much more interesting.
Anyway, need some time to play with the file.