SOLVED

LAMBDA Halloween Challenge

Silver Contributor

@Peter Bartholomew @Sergei Baklan @mtarler 

Goal: generate an array of all possible combinations (w/repetitions) from a list of items.

 

For example, if there are 6 total items with 4 chosen (allowing for repeats), there are 126 possible combinations as evident from:

 

 

=COMBINA(6,4)

 

 

 

The challenge is to generate an array of all these possible combinations.  My solution is in the attached workbook, but I won't explain it yet, so I don't interfere with anyone's creative process.  The most combinations I'm able to generate is 3,003. There seems to be a limit or two preventing me from scaling this setup even bigger.

 

Patrick2788_0-1666984952724.png

 

19 Replies

@Patrick2788 

This is somewhat lacking in mathematical refinement.  I generated the permutations and then sorted and removed duplicates.  I don't know what the size limitations might be.

This is a clever solution. I'm still unpacking your formula and the use of BASE to generate the numbering. I was able to generate 3,003 combinations. It seems anything beyond that is something SEQUENCE won't calculate.

@Peter Bartholomew 

Your solution is very efficient.  Both of our approaches are similar in that goal is to create the 126x4 matrix (6 items, pick 4 in the example) of numbers.

 

I like how you don't generate much excess at all array-wise, so there's no need to FILTER.  I'll need to brush up on BASE. Tip of the cap there!

 

Essentially, my approach is to generate an array of numbers from 1111 to 6666 (6 being the total number of items in the example).

=SEQUENCE(REPT(COUNTA(Items),pick)-REPT(1,pick)+1,,REPT(1,pick))

Removed any numbers containing 0, 7, 8, and 9 because there's only 6 items and 0 is never selected.

=UNIQUE(BYROW(Arr,LAMBDA(row,IF(MAX(IFERROR(SEARCH(Exclude,row),0))=0,row,""))),,1)

Converting scalar to array, sorting, joining, splitting, wrapping, etc.

=WRAPROWS(TEXTSPLIT(TEXTJOIN(",",,UNIQUE(BYROW(ExArray,LAMBDA(row,TEXTJOIN(",",,SORT(MID(row,SEQUENCE(,pick),1),,,1)))))),,","),pick)*1

 From there it was a matter of a lookup.


I'm going tinker with both solutions to see if I can get the combinations to exceed 3,003.  There seems to be a limit with SEQUENCE that's preventing expanding the item and pick # beyond about 10 items or so.

 

By itself, this is as far as I can get SEQUENCE to calculate:

=SEQUENCE(1000000,53)

@Patrick2788 

I have had a further go at improving the efficiency of the calculation.  However, the core idea is still to generate integers strings corresponding to the permutations and then filter out the repetitions.  This time, rather than sorting and using UNIQUE, I have examined each number to check whether it is already correctly sorted left to right.   That allows me to use FILTER.  The limit I hit on calculation size is that the number of permutations must not exceed 2²⁰.  I have attempted to use full array formulas rather than Lambda helper functions where possible.  An example of that is in FilterDecrλ where I compare entire columns of the array rather than working row by row.

 

It must be possible to implement a counter that that replaces any rightmost 0s by the newly incremented digit to their left (creating a jump in the series) but the logic is likely to be more contorted.

 

@Peter Bartholomew 

Re: permutation limit
Yes, it seems 9 items, pick 6 (about 500,000 permutations) is a struggle to calculate. My solution calculates half the time while yours holds up a bit more than that. There were a few instances where Excel wouldn't update the elements in the array.

 

It must be possible to implement a counter that that replaces any rightmost 0s by the newly incremented digit to their left (creating a jump in the series) but the logic is likely to be more contorted.

This seems to be the case. I've been messing around with some seldom used engineering functions like BITLSHIFT/BITRSHIFT but can only get so far with them.  Even if I could get something to work there's the issue of using older functions that weren't intended to be used with arrays, so I believe there's some "lifting" being done - not optimal.

 

best response confirmed by Patrick2788 (Silver Contributor)
Solution

@Patrick2788 

Finally, I have a solution!

This time I used a Lambda function to add 1 to the previous index (stored in base n) but before returning the value, I scanned the digits and copied the larger digit across.

WorksheetFormula
= LET(
      count, COUNTA(Items),
      initialise, REPT("0",pick),
      indices, VSTACK(
               initialise, 
               SCAN(initialise, SEQUENCE(combinations-1), NextCombinationλ(count,pick))
               ),
      INDEX(Items, 1 + Explodeλ(indices))
   )

where

NextCombinationλ
= LAMBDA(n,p, 
     LAMBDA(initial,s,
        LET(
            place,     n^SEQUENCE(1,p,p-1,-1),
            increment, 1 + SUM(place * Explodeλ(initial)),
            converted, BASE(increment,n,p),
            vector,    Explodeλ(converted),
            modified,  SCAN( , vector, Maxλ),
            CONCAT(modified)
        )
     )
  );

supported by

Explodeλ = LAMBDA(term, MID(term, SEQUENCE(1,pick), 1));

Maxλ = LAMBDA(u, v, Max(u,v));

I have just performed the calculation for 8 picks taken for 9 items in 300ms.  Ther are 12,870 combinations and the permutations would have been over 43M!

 

@Peter Bartholomew 

Impressive! I'm still studying how you pulled this off.  When I thought of this experiment several weeks ago, I began by studying the occurrences for each item in each column looking for patterns in the numbers.

Patrick2788_0-1667221805464.png

This is the distribution of items in each column.  There's some symmetry but probably not enough to go on that would top anything you've created.  It's interesting that when UNIQUE is used on the above matrix, the occurrences for each number is 2 with the exception of 2 numbers appearing 4 times.

@Patrick2788 

The key line is

     modified,  SCAN( , vector, Maxλ),

which scans a row of indices to ensure each sequence increases monotonically.  That allows the algorithm to make massive strides forward to the next possible combination rather than simply incrementing through the permutations and testing the outturn.

@Peter Bartholomew @Patrick2788  Ok so I will submit mine but I think it is very similar to Peter's (but I have some questions for him...):

=LET(b, COUNTA(Items), combN, COMBINA(b,pick), n, SEQUENCE(combN,,0),
series, SCAN(,n,LAMBDA(p,i, CONCAT(SCAN(0, MID(BASE(DECIMAL(p,b)+1,b,pick),SEQUENCE(pick),1),LAMBDA(pp,ii,IF(pp>ii,pp,ii)))))),
INDEX(Items,DECIMAL(MID(series,SEQUENCE(1,pick),1),b)+1))

Peter's is much more readable than mine but I still get lost.  In mine I create a list of possible options using base() which is the same from what I can tell.  I build my list by a manual sequential adding 1 and then make sure it isn't less then the index before it.  In looking back it appears you guy originally did a filtering but I think this is how Peter's new version works also.  My series is built on line 2 and proud to say this is the first time I found an advantage to doing a SCAN() without the initialize value to force it to use the first value in the sequence (zero in this case).

Since it is hard to read in the middle is BASE(DECIMAL()+1) which converts the text to a number, then +1 and back to the text.  then outside that is a MID() to break it into characters and those characters are sent through the SCAN with the IF(>,,).  I use the IF(>,,) instead of MAX because I don't bother converting back to decimal until the end so the IF can handle the ascii.

maybe peter can run the speed test on this and see how it compares.

@Peter Bartholomew Ok now for my questions to Peter...
you have a line:

        SCAN(initialise, SEQUENCE(), NextCombinationλ(count,pick))
then in Next Combination you have:
= LAMBDA(n,p,
        LAMBDA(initial,s,

so does the (count,pick) of the lambda caller get assigned to the outer Lambda into n,p and then the SCAN variables get passed to the inner Lambda as initial, s ????

This is a very interesting structure to me that I haven't seen/played with.

If i understand correctly your 'place' and 'increment' variables are basically reconstructing the DECIMAL value? or is there something more?

I'm really perplexed my your Max lambda using MAX() as it appears those are (potentially) ascii characters at that point.  did you try it when you exceed 9?

Thank you both, the exercise and diving into the solutions has been very educational. 

@mtarler 

I was able to extend yours to 14 items, pick 9, 20,661,046,784 permutations.  Excel wanted no part of 15 items!

 

Patrick2788_0-1667249641335.png

 

I'd be interested in knowing the timing of the solutions vs. having a 'cheat sheet' of all the item positions (as constants) and simply looking up each element to obtain the item name.

@mtarler 

Yes, you are correct that I used 'place' and 'increment' in lieu of DECIMAL.  Sadly, I had not come across DECIMAL though I assume it was introduced at the same time as BASE!  :( 

The function definition

 

NextCombinationλ
= LAMBDA(n,p, LAMBDA(initial,s,
        LET(          
            increment, 1 + DECIMAL(initial,n),
            converted, BASE(increment,n,p),
            vector,    Explodeλ(converted),
            modified,  SCAN( , vector, Maxλ),
            CONCAT(modified)
        )
  ));

 

now reads somewhat better.  Your understanding of the parameter strings for 'NextCombinationλ' is also correct.  It would be possible to take the process further and provide each of the parameters individually, expressing the function in its 'Curried' form.  Such a form takes the parameters one at a time from left to right.  The function is only evaluated when the final parameter value is provided.  In this case, I wanted to provide 'n' and 'p' explicitly within a parameter string but allow the function to accept the remaining parameters from the Lambda helper function.

 

At the time I was developing the formula, I did not think going beyond 9 items was going to be a practicality, so I did not consider the possibility that 'A' and 'B' were going to appear as 'digits'!  I guess redefining

 

Maxλ = LAMBDA(u, v, IF(u>v, u, v));

 

would be a start.  My search for 'readability' above all, has taken me down a path of long variable names, avoiding nested formulas, and placing each statement on a new line.  I even asked for an option to use ':=' as an assignment operator in place of alternate comma separators.

 

The penalty is that my formulas are somewhat idiosyncratic and sometimes the natural language readability can mask the underlying syntax.

 

ah thank you. it is good to know how the lambda callers nest that way.
Now i think our codes are essentially identical except i believe in your main lines 4-8 could be reduced to just:
indices, SCAN( ,SEQUENCE(....
as the 0000 value will be created by default (at least it appeared to work for me).

@Patrick2788 @mtarler 

It has just occurred to me that I stopped my 'Lambda-isation' one step short of a 'perfect' solution.

The principle (which might be better discussed on @Jan Karel Pieterse 's Lambda discussion LI site) is

"Every spreadsheet formula should be expressed as a Lambda function"! 

The name chosen for the function should provide a succinct statement of what the function does, and the parameters provide an inclusive list of the formula's immediate precedents.

 

There are assumptions implicit in this statement.  For example, it assumes that the user needs a clear idea of what the formula does but has no need to understand how it does it (the developer and auditor's needs are somewhat different and in the world of spreadsheets the user often adopts multiple roles).

 

WorksheetFormula
= CombinationsAλ(Items, pick)

CombinationsAλ
= LAMBDA(Items, p,
    LET(
        count,       COUNTA(Items),
        initialise,  REPT("0",p),
        counter,     SEQUENCE(COMBINA(count,pick)-1),
        indices,     VSTACK(initialise, 
                     SCAN(initialise, counter, NextCombinationλ(count,p))),
        INDEX(Items, 1 + Explodeλ(indices))
    )
 );

 

Matt.  I tried removing 'initialise', VSTACK and the use of initialise within SCAN but, this far, without achieving the perfect result!

 

Patrick. I have included a formula for your occurrence matrix, but the nested array problem makes it ugly.  I should rework it as a Scanλ(…, Countifsλ) that uses REDUCE and HSTACK to emulate the intended SCAN while passing the COUNTIFS as a Lambda variable.  I have done it before but none of this is a trivial exercise and it is annoying that such workarounds are needed.

Sorry

= MAKEARRAY(COUNTA(Items),pick,
      LAMBDA(r,c, 
          COUNTIFS(INDEX(result#,,c), INDEX(Items,r))
      )
  )

would have been better in this case.

@Patrick2788 

 

I 'cleaned up' my original solution to remove SEARCH:

 

=WRAPROWS(TEXTSPLIT(TEXTJOIN("|",,DROP(SORT(UNIQUE(MAP(Arr,LAMBDA(e,IF(OR(MIN(Burst(e))=0,MAX(Burst(e))>COUNTA(Items)),"",TEXTJOIN("|",,SORT(Burst(e),,,1))))))),1)),,"|"),pick)*1

 Burst (I like 'Explode' better but I feel I'd be stealing Peter's name) being:

=LAMBDA(scalar,1*MID(scalar,SEQUENCE(,LEN(scalar)),1))

 

It caps out at 9 items, pick 5.  It doesn't matter what I use on the final matrix to convert numbers to names: MAP/XLOOKUP, MAP/CHOOSECOLS, SCAN/CHOOSECOLS, etc. Won't go beyond 9/5 because it has to look up each element.

 

As aside to this project.  Wouldn't it be nice if I could do this ('Explode' a scalar):

Patrick2788_0-1667310800054.png

With this:

=TEXTSPLIT(D1,"")

The above produces a #VALUE! error.  Being able to explode a scalar with TEXTSPLIT would be really convenient.

 

 

@Patrick2788 

By all means use Explode!  I think I adopted from a video demonstration by Jack Williams of Microsoft Research, so I have no claim to the name!

 

I agree that the 'null separator' should be interpreted as 'split at every character boundary'.  Not that it would do much good at present due to the fact that 

= TEXTSPLIT(list,",")

only returns the first term of each member of the list.  The justification is that Excel has never handled nested arrays so, for the sake of backward compatibility, it should do the same now!  I have never subscribed to the view that it is a good idea to compromise one's future to maintain backward compatibility to a steaming pile of junk!  In any case

= MID(list,{1,3,5,7},1)

always has worked.  Although

= TEXTSPLIT(
      TEXTJOIN(";",,list),
      ",",";"
  )

works, one runs into parameter character size problems, as I believe you have demonstrated.

 

@mtarler 

@Patrick2788 

Matt, thanks to your introduction to the DECIMAL function, I was able to put together a reasonably elegant solution to a LinkedIn problem of calculating a check digit for a shipping container code!  Probably the first time I have needed base 36 too!

(2) Post | Feed | LinkedIn

You might also find the dynamic documentation sheet entertaining.

Container - check digit.jpg

@Patrick2788 You can try the following approach:

 

=LET(COMBINA_SETS, LAMBDA(x,n,LET(
 NEXT_SET, LAMBDA(ME,x,m,n,i,IF(i=0,x,LET(s,SEQUENCE(,n),y,x+1,
   idx, XMATCH(m,IF(i < n,IF(s>i,m+1,y),y),-1,-1),
   ME(ME,IF(s=i,INDEX(y,idx),x),m,n,IF(idx=i,0,i-1))))),
 y,TOCOL(x),m,ROWS(y),IF(AND(n=1,m=1),x, LET(cnts, 
  SEQUENCE(COMBINA(m,n)-1),idx,REDUCE(SEQUENCE(,n,1,0),cnts,LAMBDA(ac,i,
  VSTACK(ac,NEXT_SET(NEXT_SET,IF(i=1,ac,TAKE(ac,-1)),m,n,n)))),
  INDEX(y,idx))))), 
  COMBINA_SETS(B1:B6,D1))

 

It uses a user LAMBDA function COMBINA_SETS, which relays on a recursive user LAMBDA function NEXT_SET. To avoid creating a function in the Name Manager, I used the following trick: LAMBDA Formulaic Recursion: It’s All About ME! but you can remove the ME argument and define it in the Name Manager if you want (this a trick to use a recursive function inside of LET). I prefer this way because the sole purpose is to be used in COMBINA_SETS. If you enter a given set of values it returns the next one. The rest is just to use REDUCE to generate all the combinations based on NEXT_SET. It doesn't rely on Excel binary functions, since all of them have limitations on the number of digits of the binary number (DEC2BIN and BASE).

 

As per the shared link: The current operand stack limit in Excel is 1,024.  This should be borne in mind together with calculation times, as the current recursion limit is set as 1,024 divided by (number of lambda parameters + 1)

 

Better to reduce the input argument list in NEXT_SET, so it can be formulated like this:

 

=LET(COMBINA_SETS, LAMBDA(x,n,LET(y,TOCOL(x), m,ROWS(y),
 NEXT_SET, LAMBDA(ME,x,i,IF(i=0,x,LET(s,SEQUENCE(,n),y, x+1,
  idx,XMATCH(m,IF(i<n,IF(s>i,m+1,y),y),-1,-1),
  ME(ME,IF(s=i,INDEX(y,idx),x),IF(idx=i,0,i-1))))),
 IF(AND(n=1,m=1),x,LET(cnts,SEQUENCE(COMBINA(m,n)-1), 
  idx,REDUCE(SEQUENCE(,n,1,0),cnts,LAMBDA(ac,i,
  VSTACK(ac,NEXT_SET(NEXT_SET,IF(i=1,ac,TAKE(ac,-1)),n)))),
  INDEX(y,idx))))),
 COMBINA_SETS(B1:B6,D1))

 

so the limit would be 256 digits (max pick) in the worst-case scenario. Which is bigger than the BASE function limit of 53 digits. 

 

For example, for the input data: {1,2,3,4,5,6} to find the combinations of three elements 

 

1   1   1
1   1   2 <- NEXT_SET(NEXT_SET,A1:C1,6,3,3)
1   1   3 <- NEXT_SET(NEXT_SET,A2:C2,6,3,3)
1   1   4 <- NEXT_SET(NEXT_SET,A3:C3,6,3,3)
1   1   5 <- NEXT_SET(NEXT_SET,A4:C4,6,3,3)
1   1   6 <- NEXT_SET(NEXT_SET,A5:C5,6,3,3)
1   2   2 <- NEXT_SET(NEXT_SET,A6:C6,6,3,3)
...
6   6   6 <- NEXT_SET(NEXT_SET,A56:C56,6,3,3)

 

The algorithm goes backward, from right to left doing the corresponding increments, if the maximum value is reached, then it goes to the next digit recursively doing the same. It follows the same intuitive process a human will do to generate all combinations.

 

Here is the output:

davidleal_0-1680321591525.png

Not showing the entire list on the screen.

 

This solution doesn't generate a set that then requires removing duplicates, etc. it generates exactly the combinations required. It considers also special cases such as m=n, where we don't need a recursion process. It also works in case the input array is column-wise instead of row-wise.

 

1 best response

Accepted Solutions
best response confirmed by Patrick2788 (Silver Contributor)
Solution

@Patrick2788 

Finally, I have a solution!

This time I used a Lambda function to add 1 to the previous index (stored in base n) but before returning the value, I scanned the digits and copied the larger digit across.

WorksheetFormula
= LET(
      count, COUNTA(Items),
      initialise, REPT("0",pick),
      indices, VSTACK(
               initialise, 
               SCAN(initialise, SEQUENCE(combinations-1), NextCombinationλ(count,pick))
               ),
      INDEX(Items, 1 + Explodeλ(indices))
   )

where

NextCombinationλ
= LAMBDA(n,p, 
     LAMBDA(initial,s,
        LET(
            place,     n^SEQUENCE(1,p,p-1,-1),
            increment, 1 + SUM(place * Explodeλ(initial)),
            converted, BASE(increment,n,p),
            vector,    Explodeλ(converted),
            modified,  SCAN( , vector, Maxλ),
            CONCAT(modified)
        )
     )
  );

supported by

Explodeλ = LAMBDA(term, MID(term, SEQUENCE(1,pick), 1));

Maxλ = LAMBDA(u, v, Max(u,v));

I have just performed the calculation for 8 picks taken for 9 items in 300ms.  Ther are 12,870 combinations and the permutations would have been over 43M!

 

View solution in original post