A statement of truth or a wasted opportunity. Lambda helper functions reluctant to return 2D arrays.

Silver Contributor

I remember @JoeMcDaid stating that 2D arrays are the only data types held in Excel; rows columns and scalars are merely special cases with the last being a 1x1 array.

image.png

So why is it such a problem bringing related rows together the form a 2D array?  There are a number of tricks and workarounds, but should that be necessary?  By way of example I show an old data file with the individual results from a cross-country championships.

image.png

(first 20 of 300)

The exercise is to rank the teams by adding the positions of the first 4 of their 8 runners to get a point count (lowest wins).  What I require is an output with the teams in result order showing the names of the 4 scoring members of the team.

image.png

The formula looks to be a monster at first sight

 

 

= LET(
  \0, "Extract source data",
      pos,  --LEFT(rawdata,7),
      name, (MID(rawdata,8,22)),
      team, TRIM(MID(rawdata,31,26)),
      time,--("00:"&RIGHT(rawdata,5)),
      teamList, UNIQUE(team),
  \1, "Build Lambda functions",
      SCOREλ, LAMBDA(t,
         LET(
            pts, FILTER(pos, team=t),
            SUM(INDEX(pts,{1,2,3,4}))
         )),
      MEMBERSλ, LAMBDA(t,
         LET(
            m, FILTER(name, team=t),
            CONCAT(m)
         )),
  \2, "Identify scoring members of teams in order",
      teamScore, MAP(teamList, SCOREλ),
      teamOrder, SORTBY(teamList, teamScore),
      scoring, MAP(teamOrder, MEMBERSλ),
      MID(scoring,{1,23,45,67},22))

 

 

but the section following name '\0' and the text comment is simply extracting individual fields from the source data and could easily be separated off within a helper range.  Similarly the lines following '\1' defines named Lambda functions to calculate the team score given the name of a single team and to generate the array of scoring competitors.

The meat of the calculation is at the end: the teams are scored and ranked by score.  The penultimate line assemble the list of scoring members for each team in order.  My complaint is that each team cannot be treated as an array without triggering an error message.  Instead I have had to concatenate the names to product a column of results.  Only then can I split the column back to give the array I need.   

There are other approaches such as that shown by @tboulden in 

Array of arrays using Lambda helper functions - Microsoft Tech Community

which used REDUCE and a sequence of VSTACK operations.

 

Does the restriction on arrays of arrays serve any purpose given that the spreadsheet is all about the manipulation of 2D arrays?  What do you think? 

@Sergei Baklan 

@lori_m 

@Chris Gross 

69 Replies

@lori_m  Ahh, thanks for the clarification, I had never noticed the areas param in INDEX; I've confirmed both constructions, MAP across an area index and MAP across the EVALUATEd array of ranges.

 

I couldn't think of a real-world specific use-case for the REDUCE/union construction, but set up a table and unioned all rows and all columns separately, then intersected them. Was surprised to see that its non-commutative in terms of order, but also makes sense in retrospect.

Screenshot 2021-09-26 095124.png

@lori_mSorry, the prior post ate my image for the MAP constructions.

Screenshot 2021-09-26 100052.png

@tboulden 

I agree unions of ranges have limited application; arrays of ranges on the other hand can be more useful as more functions allow them, eg attachment uses formula below for referencing 4D data:

=INDEX(INDEX(SubArrays,i,j),k,l)

I guess the equivalent for thunks could be:

=INDEX(INDEX(Thunks,i,j)(),k,l)

 

Also included is a sample implementation of Peter's idea from earlier in the thread of adding names based on a LET function. I think this is quite an intuitive way of managing names in general not just LAMBDAs, I might see about distributing the code as an addin so any workbook code can be removed.

lori_m_1-1632999113453.png

 

@lori_m 

Too easy to forget attachments!

@lori_mI forgot briefly that this was an xlsm and wondered what voodoo was happening! This is pretty slick!

@tboulden 

It took a bit of time to figure out the formula parsing, it's not 100% foolproof yet, but as a proof-of-concept it works better than expected.

I think the GLOBAL/LOCAL functions that were suggested earlier have potential to further improve the UI by adding the validation automatically - which looks possible using Evaluate - I'lll have to wait for LAMBDA for that though. Anyway full credit to @Peter Bartholomew  for speccing it out!

Very interesting! I don't know about VBA Evaluate limitations, but I know I ran into the 256 char limit when using EVALUATE via Name Manager; is that char limit a concern?
String length shouldn't be a factor here. The trick is to call another VBA function indirectly from within a UDF e.g. Evaluate("AddValidation()") - also see the earlier code example. This appears to work with validation and leaves undo intact in basic testing.

@lori_m 

I have been somewhat quiet of late!  I was struggling a bit sorting out a 'recommended approach' for using thunks from the @tboulden experimentation.  I need an idiot's guide to follow!

 

As for your coding, it is so full of unfamiliar techniques that it looks like magic!  I think it was Arthur C Clark who had something to say about any sufficiently advanced technology being indistinguishable from magic.  It took me a while to recognise the UDFs, tucked away in LET since, without any arguments, they didn't look like functions; finding the subroutine call was even more of a challenge.  I had no idea that was possible to call subs like that and it seems to bypass all the constraints that would be placed upon a UDF.  I have a vague recollection of seeing hyperlinks being used to achieve a similar objective.

 

My original thought was to use the Worksheet Change event to run the VBA code but your approach seems to offer advantages.  I also like the way you have parsed the cell formula in the 'args' function, I had mentally overcomplicated the task by thinking in terms of Regular Expressions.

 

Right now, I have some conference papers to review, and hopefully approve, but I will try to put the pieces together, either following my idea of GLOBAL/LOCAL or perhaps using your 'scope' parameter to combine them as DEFINE (say).  I am hopeful that this will serve to make some of our very long formulas more compact by separating out the definition of key elements of the formula to defined names yet, at the same time, provide documentation and traceability on the worksheet rather than tucked away in Name Manager which appears to be widely despised.

@Peter Bartholomew 

A substantial amount of trial and error was required as we're relying on undocumented functionality to achieve this objective. Things like adding names via validation and calling a Sub from a name I only learned through this exercise, however there were many other commands that didn't work as expected when called this way.

 

As the basis of a 'DEFINE' function, I did have some success adding the validation automatically using =IF(COUNTA(AddValidation()),Result) where 'Result' would refer to a LAMBDA expression similar to your earlier suggestion. I hope to pick this up again later when lambda reaches the current channel if someone else doesn't pick the baton up before then.

 

Since we've veered a little off topic here, it might be worth raising a new thread for name management in general to continue discussion of this and other ideas to improve user interaction with names. I'll look at putting some notes together when time permits.