SOLVED

Solving an 0 1 Knapsack Problem with LAMBDA

Silver Contributor

A Knapsack Problem is defined as:

Given a set of items, each with a weight and a value, determine which items to include in the collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. -Wikipedia

Patrick2788_0-1680113878980.png

 

After studying 0 1 Knapsack Problems (A given item may only be selected once) and the techniques used to solve them, I sought to build formulas to solve a standard problem consisting of 5 items with various weights and benefits.

 

I considered generating a value matrix (array of arrays limitation - MAKEARRAY could only go so far) and creating a recursive Lambda (I couldn't conceptualize one and I didn't think it would be efficient memory-wise). 

 

The solution

I went back to a previous project's technique:

LAMBDA Halloween Challenge - Microsoft Community Hub

 

That challenge was all about generating the COMBINA results.  The way I see an 0 1 Knapsack is essentially generating COMBIN results and then determining which combination is the greatest benefit without going over the capacity.

 

First, I created the 'Shuffle' function which takes a value produced from SEQUENCE and excludes any numbers >5 (total number of items) and removes values where a number repeats (e.g. 122 and 123 vs 321).

 

Shuffle
=LAMBDA(scalar,LET(
    arr, 1 * MID(scalar, SEQUENCE(, LEN(scalar)), 1),
    filtered, FILTER(arr, arr <= 5, 0),
    sorted, SORT(UNIQUE(filtered, 1), , , 1),
    IF(TAKE(sorted, , 1) = 0, 0, sorted)
))

 

Then I created a function to be called within REDUCE:

 

GenerateCombin
=LAMBDA(a,v,VSTACK(a, EXPAND(Shuffle(v), , 5, 0)))

 

This is the most resource intensive part of the project because if SEQUENCE goes into 5 digits, REDUCE will struggle to stack.  I added a check to determine if picking all 5 items exceeds the limit so there's no need to go into 5 digits with SEQUENCE!

 

=LET(
    r, IF(SUM(weight) > cap, 2345, 12345),
    distinct, UNIQUE(DROP(REDUCE(0, SEQUENCE(r), GenerateCombin), 1)),
    DROP(SORT(distinct), 1)
)

 

'Solve' to be called within BYROW.  This function checks the CombinMatrix and totals up the Benefit and Weight for each combination.  It strings together the results in each cell.

 

 

=LAMBDA(row,LET(
    w, SUM(XLOOKUP(row, item, weight, 0)),
    b, SUM(XLOOKUP(row, item, benefit, 0)),
    IF(
        w <= cap,
        "Items: " & TEXTJOIN(", ", , TEXT(row, "0;;;")) & CHAR(10) & "Benefit: " & b & CHAR(10) &
            "Weight: " & w,
        NA()
    )
))

 

 

At the sheet level:

 

=LET(results,TOCOL(BYROW(CombinMatrix,Solve),2),WRAPROWS(results,4,""))

 

 The results:

Patrick2788_1-1680115957731.png

 

The question for the community:  Can you conceptualize an easier way to do this?  Formula-based solutions only, please.

 

13 Replies

@Patrick2788  So I went with the recursive call option:

 

 

 

MaxPac = LAMBDA(items, benefits, weights, limit, [hide_text], LET(
    i, IF(ROWS(items) = 1, TRANSPOSE(items), items),
    b, IF(ROWS(benefits) = 1, TRANSPOSE(benefits), benefits),
    w, IF(ROWS(weights) = 1, TRANSPOSE(weights), weights),
    list, REDUCE(0, SEQUENCE(ROWS(i)), LAMBDA(q, p,
             IF(INDEX(w, p) <= limit,
               VSTACK(q, IF(ROWS(i) > p,
                  LET(sublist, MaxPac(DROP(i, p),DROP(b, p),DROP(w, p),limit - INDEX(w, p)),
                      IF( IFERROR(INDEX(sublist, 1), 0),
                         HSTACK(INDEX(b, p), INDEX(i, p)),
                         HSTACK(INDEX(b, p) + INDEX(sublist, 1, 1),
                                INDEX(i, p),
                                DROP(sublist, , 1))
                         )),
                  HSTACK(INDEX(b, p), INDEX(i, p))
                  )),
            q))),
    out, INDEX(SORT(list, 1, -1), 1, ),
    hideText, IF(ISOMITTED(hide_text), FALSE, hide_text),
    IF( hideText, out,
        CONCAT("Benefit: ", INDEX(out, 1), "  Items: ",
            IFERROR(TEXTJOIN(",", 1, DROP(out, , 1)), 0)
        )
    )
))

 

 

 

It looks complicated but in concept it is easy.  I check loop using REDUCE and check if that element is less than the weight limit, if so then I 'add' it to the list and re-call itself with the remainder of the elements.  Each time I add a row and each row sums the Benefit in the first element.  The last step is to sort by the Benefit and only Take/Return the highest value and pass that up the chain.  There are a number of extra IFs and such because of making sure I don't strip off all the elements and such and in the beginning I check if the array is in a row or a column and swap it to make then columns if needed.

EDIT: I found some bug where the subcall must return an error or something in certain cases and I added an IFERROR() to the check.  I also added some text/formatting that will be added only on the top call if that parameter is missing or set to false.

So I found this to be significantly faster in cases when all 5 objects are selected (i.e. the weights are all set below the total limit).  I  can only imagine that additional item list would also result in time improvements also.

On final note, this solution intentionally did not attempt to return all potential best solutions and just returns a best solution.  In another words if there are more than 1 solution that returns the same maximum benefit, this solution will just return one of them.

@mtarler 

Your solution seems to be the equivalent to the VBA solutions I found in research where 5 levels of nested loops were being used.  To your credit, I think your solution is a lot more accurate.  I played around with stepping through vba solutions line-by-line because I was wondering about the order the checks were being done in.  I bagged the vba stuff when a few of the solutions returned 0 for an answer!

 

Writing recursive functions is tricky to me because one has to write with a sense of anticipation in mind. If I understand your solution, initiallly the checks on number of rows with Items, Weights, and Benefits is done because the array must be checked after the function has been run to see if the TRANSPOSE is needed.  It's obvious the number of rows is 1 on the first pass but not on subsequent runs.

Timings

4-digits
0.3086568
0.3444158
0.3086526
0.3095298
0.3091404
So you're interested solely in solutions for the 5-item case? If not, what's an upper bound on the number of items? These types of problems which must consider an exponentially increasing number of permutations can quickly become too much for Excel.

Regards
I'm considering 5 at present. A previous discussion re:COMBINA ran into some limitations with setting upper bound too high.
So I do that rows check because my brain and such was working that way when I wrote the first draft of the function and it failed because it only saw 1 row on the input so instead of trying to re-think it all I just converted horizontal data to vertical data.
The nice thing about the recursion is that it will work for as many levels as Excel can handle (i.e. it doesn't have 5 fixed loops). The nice thing about my solution only looking for ONE max result (and not all solutions possible for the max result) is that it will go 'down' and 'up' the recursion ladder and keep returning 1 value after each tier. Actually thinking now, technically for that matter, even if I returned 'all' options it would still be 1 2-d array because of the REDUCE, but then I would have to 'add' the next up tier to every returned row ...
best response confirmed by Patrick2788 (Silver Contributor)
Solution

@Patrick2788 

 

I think I'd prefer to avoid any recursion or LAMBDAs and resort to some good old-fashioned matrix multiplication!

 

=LET(
    α,benefit,
    β,weight,
    γ,COUNT(α),
    δ,MOD(INT(SEQUENCE(2^γ,,0)/2^SEQUENCE(,γ,0)),2),
    ε,MMULT(δ,TOCOL(α)),
    ζ,MMULT(δ,TOCOL(β)),
    η,ζ<=cap,
    TEXTJOIN(",",,
        FILTER(item,INDEX(δ,MATCH(MAX(FILTER(ε,η)),IF(η,ε),0)),0)
    )
)

 

No doubt this could be refined, though the basic outline is there.

VERY nice. that delta variable that creates the full combination matrix is what we've been missing! I modified the file and formulas to do some testing and went out to 20 items!!! This version out performed the recursion by more than 2.5:1

 

mtarler_1-1680217161698.png

I also have to look at the recursion because I'm getting some errors and incorrect values from it ... (or maybe just throw it away since this solution is better in each way)

btw here is the updated formula where I break the last line up to store the "mask" from delta and then can easily apply it to weights, and items

=IF(D10,
LET(
    to,NOW(),
    α,benefit,
    β,weight,
    γ,COUNT(α),
    δ,MOD(INT(SEQUENCE(2^γ,,0)/2^SEQUENCE(,γ,0)),2),
    ε,MMULT(δ,TOCOL(α)),
    ζ,MMULT(δ,TOCOL(β)),
    η,ζ<=cap,
    benefitOut, MAX(FILTER(ε,η)),
    ansMask,INDEX(δ,MATCH(benefitOut,IF(η,ε),0),),
    itemOut,FILTER(item,ansMask),
    weightOut, SUM(FILTER(β, ansMask )),
    out,HSTACK((NOW()-to)*24*3600, "Benefit: " & benefitOut, "Weight: " & weightOut, "Items: " & TEXTJOIN(",",,itemOut )),
    out
    ),
"not run")

I also have my poor-mans-timer and a 'toggle' to easily turn the calculation on/off

 

@Patrick2788 

@JosWoolley 's method of generating the combination matrix is fascinating for how efficient of a solution it is for a complex problem. From what I gather, it's based somewhat in binary (putting the 0 1 in 0 1 Knapsack!).

 

This produces the combination matrix before MMULT gets to it:

 

=MOD(INT(SEQUENCE(2^5,,0)/2^SEQUENCE(,5,0)),2)

 

 

I did some playing around with engineering functions.

 

Formula in K:

 

=BIN2DEC(I2)

 

Formula in N:

 

=DEC2BIN(K2,5)

 

 

Patrick2788_0-1680269748287.png

 

 

 

 

 

This is an interesting approach, but you have a limitation with binary numbers, the maximum number of digits is 10. Other than that it is great for generating the sequence and then splitting it:

 

=MID(DEC2BIN(SEQUENCE(2^5-1),5),SEQUENCE(,5),1)

 

 

@davidleal 

I've seen some workarounds using concatenation to get around that 10 digit limit.

 

 

Here's my second take on Knapsack.

 

CombinMatrix
=LET(
    n, COUNT(item),
    combinations, SUM(COMBIN(n, SEQUENCE(n))),
    MAKEARRAY(combinations, n, LAMBDA(r, c, 1 * MID(DEC2BIN(r, 5), c, 1)))
)

Sheet level formula:
=LET(
    filtered, FILTER(CombinMatrix, MMULT(CombinMatrix, TOCOL(weight)) <= cap),
    b, MMULT(filtered, TOCOL(benefit)),
    stack, HSTACK(filtered, b),
    TAKE(DROP(SORT(stack, COUNT(item) + 1, -1), , -1), 1)
)

 

I've added an Include line:

Patrick2788_0-1680281901932.png

 

@Patrick2788 

I nearly failed the test by forgetting to include a Lambda function!

image.png

Identifying the combinations was done through the application of binary numbers (not as Excel text though)

Combinationλ(n)
= LAMBDA(n,
    LET(
      k, SEQUENCE(1, 2 ^ n), 
      p, 2 ^ SEQUENCE(n, 1, 0), 
      SIGN(BITAND(k, p))
    )
  )

 

I've taken a look at BITAND a few times but didn't see a practical use for it so I didn't spend much time on it. The way it's used here makes sense in identifying where the bits line up and then using SIGN to get 1s and 0s. Both functions probably not originally designed with arrays in mind but capable of doing some lifting. It's good to have options!

A similar problem to the Knapsack is the Cutting Stock Problem:
https://en.wikipedia.org/wiki/Cutting_stock_problem

I see one of the methods for solving the problem is generating all possible combinations. The problem with using this method in Excel is the row limit/memory concerns. The Cutting Stock Problem seems to require the provided data be unstacked before combinations can be generated. However, generating all possible combinations for 200 items, pick 3 doesn't seem do-able. Other approaches involve first solving the problem as best as possible and then re-checking the values to make improvements where needed. Sounds like lots of recursion!

1 best response

Accepted Solutions
best response confirmed by Patrick2788 (Silver Contributor)
Solution

@Patrick2788 

 

I think I'd prefer to avoid any recursion or LAMBDAs and resort to some good old-fashioned matrix multiplication!

 

=LET(
    α,benefit,
    β,weight,
    γ,COUNT(α),
    δ,MOD(INT(SEQUENCE(2^γ,,0)/2^SEQUENCE(,γ,0)),2),
    ε,MMULT(δ,TOCOL(α)),
    ζ,MMULT(δ,TOCOL(β)),
    η,ζ<=cap,
    TEXTJOIN(",",,
        FILTER(item,INDEX(δ,MATCH(MAX(FILTER(ε,η)),IF(η,ε),0)),0)
    )
)

 

No doubt this could be refined, though the basic outline is there.

View solution in original post