SOLVED

Array of arrays using Lambda helper functions

Silver Contributor

Am I missing something?  What I want to do is create an array of filtered lists with a single dynamic array formula.  The output should be in the form of a 2D array with the shorter lists padded with blanks to match the longest list.

 

The workbook is a demo book I used when dynamic arrays first appeared.  It lists athletes by heat and lane numbers and what I want is a list of countries represented in the event and, for each, a list of athletes that have qualified for the event.

image.png

I was hoping to use MAP to run through the countries but I was unable to get around the #CALC! errors without 'rolling my own' recursive Lambda function; a task that was too much like serious programming for my taste.  My named Lambda function TEAMLISTλ refers to

= LAMBDA(idx,
    LET(
      country,INDEX(ListCountry#,idx), 
      athletes, TRANSPOSE(
         FILTER(StartList[ATHLETE], StartList[COUNTRY]=country)),
      return, IF(idx>1, TEAMLISTλ(idx-1),""),
      IF(idx>1, VSTACKλ("",return,athletes), athletes)
    )
  )

  where VSTACKλ is called to append a new row to an existing array and refers to

= LAMBDA(null,A,v,
    MAKEARRAY(1+ROWS(A),MAX(COLUMNS(v),COLUMNS(A)),
      LAMBDA(r,c, 
        IFERROR(IF(r<1+ROWS(A), ""&INDEX(A,r,c), ""&INDEX(v,c)),"")
      )
    )
  )

Note: If I had specified that each list of athletes should be concatenated, life would be so much easier.

image.png

That would simply require

= MAP(ListCountry#,
    LAMBDA(Country,
      TEXTJOIN(", ",,
        FILTER(StartList[ATHLETE],StartList[COUNTRY]=Country)
      )
    )
  )

as a worksheet formula with no hidden code within named Lambda functions.

@Sergei Baklan@tboulden 

 

13 Replies

@Peter BartholomewNot sure that you can do it with MAP, but REDUCE + VSTACK gets close?

 

=REDUCE("",ListCountry#,LAMBDA(acc,val,VSTACKλ(,acc,TRANSPOSE(FILTER(StartList[ATHLETE],StartList[COUNTRY]=val)))))

best response confirmed by Peter Bartholomew (Silver Contributor)
Solution

@Peter BartholomewJust to get rid of that pesky blank row:

 

=REDUCE(
    "",
    ListCountry#,
    LAMBDA(acc,val,
        IF(
            TYPE(acc)<>64,TRANSPOSE(FILTER(StartList[ATHLETE],StartList[COUNTRY]=val)),
             VSTACKλ(,acc,TRANSPOSE(FILTER(StartList[ATHLETE],StartList[COUNTRY]=val)))
        )
    )
)

 

I think since MAP will iterate over an array of any shape, the results have to fit back into the same shape; I'd be interested to know if this is a mistaken assumption on my part.

@tboulden 

Thank you so much for the reply and the excellent work you present.  REDUCE was the next thing I was going to try but only after I had hit a dead end with MAP.  I decided to pause to see what ideas you and others might have (and it was coming up to midnight). 

 

I can understand that support for 'arrays of arrays' is out of scope for Excel, but should it do more with what it has.  The idea that a vertical array of row arrays/ranges should map to a 2D array (with a specified fill characters for unused cells) would not appear to be that revolutionary; after all the implementations of H- and VSTACK that I have seen go much further.

 

The problem with the blank row is probably something that should be addressed within VSTACKλ but I wrote it in a hurry to replace Charles Williams's VSTACK function.  It would be perfectly reasonable to assume that, should an array be identifiable as 'null', it may be stacked to leave the second array unaltered.  Your test has some class, though a slightly 'techie' for a spreadsheet solution.

 

@lori_m Sorry, I meant to include you within the OP but couldn't locate your user id.

@Peter Bartholomew  The FastExcel xll you provided with the ACCUMULATE example seems to give me access to the other functions as well (intended or not??), but without the user interface bells and whistles. I started trying to replicate a bunch of those functions with LAMBDA and the new helpers (with the help of the online PDF user guide), but haven't gotten to his _STACK functions just yet; working through DIFF and sent me on a tangent to be able to handle some of its nuances more cleanly. 

 

As you say, it could be handled in STACK just as well as in the REDUCE LAMBDA call, but I think having the inclusive vs. exclusive qualities of REDUCE/SCAN like were in my pre-helper fold/reduce/scan versions would've been nice, though there are probably quite goods reasons why its not set up that way.

@tboulden 

The example I used was drawn from past work and refactored to provide an excuse to exercise the new helper functions, as announced by @Chris_Gross .  There might be justification to feed some ideas back but I didn't wish to be seen as trying to hijack a blog post, so I opened the new discussion.

 

I started with SCAN, basing it upon an amortisation schedule with occasional variation in interest rate

 

image.png

using the formula

= SCAN(principal, period#,
    LAMBDA(balance,p,
      LET(
        MPR, INDEX(rate#,p),
        remaining, 12*duration + 1-p,
        flow, PMT(MPR,remaining,balance),
        (1+MPR)*balance+flow
      )
    )
   )

The other formula of interest was the aggregation of interest over one year periods to present in the summary by year.

= BYROW(INDEX(interest#,SEQUENCE(duration,12)), LAMBDA(a,SUM(a)) )

That uses INDEX/SEQUENCE to convert the list to a crosstab and then the new BYROW to sum over the months of each year.

 

What I probably should feed back is the thought that the new helper functions address just about all of the frustrations I have had concerning Excel over the past years.  Brilliant!  Probably the thing that now comes to the fore is the need to revamp the user experience to make building, evaluating and debugging formulas a good experience.  Neither the definition of names not the formula bar are really fit for purpose any more.

 

That should probably be the subject for separate discussion though.

@Peter Bartholomew  Here's a proto-CORKSCREW attempt on your example; uses REDUCE and MAKEARRAY and some hard-coded values. Not sure how resource intensive these might be, but if you do alot of cashflows with same layout, it might suffice.

 

=REDUCE(
    {0,"","","",100000},
    SEQUENCE(5*12),
    LAMBDA(acc,val,
        LET(
            get_prior,LAMBDA(x,INDEX(acc,ROWS(acc),x)),
            MAKEARRAY(
                ROWS(acc)+1,
                COLUMNS(acc),
                LAMBDA(i,j,
                    IF(
                        i<=ROWS(acc),INDEX(acc,i,j),
                        LET(
                            period,get_prior(1)+1,
                            prior_bal,get_prior(5),
                            rate_,(1+XLOOKUP(1+QUOTIENT(+period, 12), {0;4}, {0.08;0.1}, ,-1))^(1/12)-1,
                            MPR,IF(period>1,get_prior(2)),
                            flow,PMT(MPR,60+1-period,prior_bal),
                            end_bal,(1+MPR)*prior_bal+flow,
                            pay,prior_bal*(1+rate_)-end_bal,
                            CHOOSE(j,period,rate_,pay,prior_bal*rate_,end_bal)
                        )
                    )
                )
            )
        )
    )
)

@tboulden 

I am not sure that represents a direction of travel that I will be following!  After all, I passed on Charles's CORKSCREW function and requested the more limited ACCUMULATE function.  Nevertheless, it opens my eyes to the possibilities.  It strikes me as pushing back the boundaries of the Excel formula as a program.

 

1. Data input is provided by full access to the workbook names

2. Multiple calculations are taken forward simultaneously

3. This is probably out of scope for SCAN so REDUCE and MAKEARRAY have been used instead     (does this imply that the first steps are repeated for every subsequent time period?)

4. Results are output to a contiguous region, multiple arrays being represented within one block.

 

Have I got this right, or are there things I have misunderstood even at this stage?

 

Something I have been considering is 'to what extent should I use Names for a hierarchy of Lambda functions in order to modularise the code?  I guess that is something you have done with your get_prior function though at first reading I simply read it as a variable name.  I think I have some catching up to do.  I have even got to decide on the objective; am I looking for the programming equivalent of the combined harvester or will a scythe and a ball of string do the job? 

 

 

 

 

@Peter Bartholomew

 

3. This is probably out of scope for SCAN so REDUCE and MAKEARRAY have been used instead (does this imply that the first steps are repeated for every subsequent time period?)

 

Based on what I've been able to determine, SCAN won't take an array as an init such that an array will be returned of more than one-dimension; I think BYROW/BYCOL/SCAN all have this limitation, but I'm not sure I understand why (other than "that's just how it is"); I've fiddled briefly with some of the "old-fashioned" coercion techniques to get them to produce larger arrays, but with no luck. As for the iterations, I'm assuming that the calcs are only performed for each new row and that the accumulator is just being referenced; I question whether the MAKEARRAY call for each new row is efficient, but I wonder if we'll get a native LAMBDA helper to replicate STACK, which is effectively what its doing here. I wonder if @Chris_Gross or some one could provide some insight on what's going on under the hood so we can know better what is efficient or will slow a workbook down once I've got a hundred different calls in one.

 

Have I got this right, or are there things I have misunderstood even at this stage? >> Sounds right to me!

 

I have even got to decide on the objective; am I looking for the programming equivalent of the combined harvester or will a scythe and a ball of string do the job >> I think it could be both? I know having simple LAMBDA access could make data-cleaning easy, even more so with the new helpers. I think we fall back to the efficiency argument to determine the scale of the implements needed; will these type of REDUCE/MAKEARRAY combinations bog things down, and how many is too many?

@tboulden 

My guess is that this is a positive decision to avoid situations that could potentially give rise to an 'array of arrays'.  That is fine if there are active plans to support such data structures natively (we can wait) but, if not, defaulting to a padded 2D array would be a far better solution than throwing an error.  Forcing you to use REDUCE / MAKEARRAY to achieve the obvious result is not satisfactory.

 

It isn't even as if such ideas are alien to Excel.

 

image.png

@Peter Bartholomew 

 

This reminded me of an issue I took with some LAMBDA formulations I was seeing on various fora that worked with ranges, but wouldn't necessarily work with arrays. The ":" operator here is acting as a VSTACK for ranges, so if it were extended to arrays, we could avoid MAKEARRAY entirely, I think.

 

@tboulden 

I was looking for a later post but, since this one turned up, your more recent strategy using Thunks appears to be relevant for the sorted list of lists problem.

= LET(
      THUNKλ,  LAMBDA(x, LAMBDA(x)),
      EXPANDλ, LAMBDA(ϑ, LAMBDA(r,c, INDEX(INDEX(ϑ,r,1)(),c))),
      m, ROWS(ListCountry#),
      n, MAX(COUNTIFS(StartList[COUNTRY],ListCountry#)),
      resultϑ, MAP(ListCountry#,
         LAMBDA(country, THUNKλ(FILTER(StartList[ATHLETE], StartList[COUNTRY]=country)))),
      display, MAKEARRAY(m,n, EXPANDλ(resultϑ)),
      IFERROR(display,""))

I have a nasty feeling that such techniques are not for the fainthearted!  Teaching the approach could be challenging as well.

Sorry for the delay! I like this approach, had to refresh my memory on what territory we traversed in this thread.

I agree about teaching to others, have been conversing with someone on a forum asking "why use thunks?" and asking for simple examples. I may have to show them this as it may be more accessible way of demonstrating. Hope 2022 has started off well for you!

@tboulden 

It was the fact that I do not find picking up on some of our past discussions to be a trivial exercise that made me think of how one might train others.  The challenge might be almost as significant as setting out to move an Excel user to M or DAX.  Whilst one may argue that Lambda functions are simple extensions of the traditional spreadsheet methods to include user-defined functions, the process of solution development shifts from the ad-hoc to being a programming exercise (by way of contrast, traditional methods allow users to interact with their numbers whilst remaining in denial that they are actually programming).

 

I am happy that you should modify or use the example as you see fit and I would be interested to know whether you convince your forum member.

1 best response

Accepted Solutions
best response confirmed by Peter Bartholomew (Silver Contributor)
Solution

@Peter BartholomewJust to get rid of that pesky blank row:

 

=REDUCE(
    "",
    ListCountry#,
    LAMBDA(acc,val,
        IF(
            TYPE(acc)<>64,TRANSPOSE(FILTER(StartList[ATHLETE],StartList[COUNTRY]=val)),
             VSTACKλ(,acc,TRANSPOSE(FILTER(StartList[ATHLETE],StartList[COUNTRY]=val)))
        )
    )
)

 

I think since MAP will iterate over an array of any shape, the results have to fit back into the same shape; I'd be interested to know if this is a mistaken assumption on my part.

View solution in original post