Mar 29 2023 12:04 PM
A Knapsack Problem is defined as:
Given a set of items, each with a weight and a value, determine which items to include in the collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. -Wikipedia
After studying 0 1 Knapsack Problems (A given item may only be selected once) and the techniques used to solve them, I sought to build formulas to solve a standard problem consisting of 5 items with various weights and benefits.
I considered generating a value matrix (array of arrays limitation - MAKEARRAY could only go so far) and creating a recursive Lambda (I couldn't conceptualize one and I didn't think it would be efficient memory-wise).
The solution
I went back to a previous project's technique:
LAMBDA Halloween Challenge - Microsoft Community Hub
That challenge was all about generating the COMBINA results. The way I see an 0 1 Knapsack is essentially generating COMBIN results and then determining which combination is the greatest benefit without going over the capacity.
First, I created the 'Shuffle' function which takes a value produced from SEQUENCE and excludes any numbers >5 (total number of items) and removes values where a number repeats (e.g. 122 and 123 vs 321).
Shuffle
=LAMBDA(scalar,LET(
arr, 1 * MID(scalar, SEQUENCE(, LEN(scalar)), 1),
filtered, FILTER(arr, arr <= 5, 0),
sorted, SORT(UNIQUE(filtered, 1), , , 1),
IF(TAKE(sorted, , 1) = 0, 0, sorted)
))
Then I created a function to be called within REDUCE:
GenerateCombin
=LAMBDA(a,v,VSTACK(a, EXPAND(Shuffle(v), , 5, 0)))
This is the most resource intensive part of the project because if SEQUENCE goes into 5 digits, REDUCE will struggle to stack. I added a check to determine if picking all 5 items exceeds the limit so there's no need to go into 5 digits with SEQUENCE!
=LET(
r, IF(SUM(weight) > cap, 2345, 12345),
distinct, UNIQUE(DROP(REDUCE(0, SEQUENCE(r), GenerateCombin), 1)),
DROP(SORT(distinct), 1)
)
'Solve' to be called within BYROW. This function checks the CombinMatrix and totals up the Benefit and Weight for each combination. It strings together the results in each cell.
=LAMBDA(row,LET(
w, SUM(XLOOKUP(row, item, weight, 0)),
b, SUM(XLOOKUP(row, item, benefit, 0)),
IF(
w <= cap,
"Items: " & TEXTJOIN(", ", , TEXT(row, "0;;;")) & CHAR(10) & "Benefit: " & b & CHAR(10) &
"Weight: " & w,
NA()
)
))
At the sheet level:
=LET(results,TOCOL(BYROW(CombinMatrix,Solve),2),WRAPROWS(results,4,""))
The results:
The question for the community: Can you conceptualize an easier way to do this? Formula-based solutions only, please.
Mar 29 2023 07:31 PM - edited Mar 29 2023 09:16 PM
@Patrick2788 So I went with the recursive call option:
MaxPac = LAMBDA(items, benefits, weights, limit, [hide_text], LET(
i, IF(ROWS(items) = 1, TRANSPOSE(items), items),
b, IF(ROWS(benefits) = 1, TRANSPOSE(benefits), benefits),
w, IF(ROWS(weights) = 1, TRANSPOSE(weights), weights),
list, REDUCE(0, SEQUENCE(ROWS(i)), LAMBDA(q, p,
IF(INDEX(w, p) <= limit,
VSTACK(q, IF(ROWS(i) > p,
LET(sublist, MaxPac(DROP(i, p),DROP(b, p),DROP(w, p),limit - INDEX(w, p)),
IF( IFERROR(INDEX(sublist, 1), 0),
HSTACK(INDEX(b, p), INDEX(i, p)),
HSTACK(INDEX(b, p) + INDEX(sublist, 1, 1),
INDEX(i, p),
DROP(sublist, , 1))
)),
HSTACK(INDEX(b, p), INDEX(i, p))
)),
q))),
out, INDEX(SORT(list, 1, -1), 1, ),
hideText, IF(ISOMITTED(hide_text), FALSE, hide_text),
IF( hideText, out,
CONCAT("Benefit: ", INDEX(out, 1), " Items: ",
IFERROR(TEXTJOIN(",", 1, DROP(out, , 1)), 0)
)
)
))
It looks complicated but in concept it is easy. I check loop using REDUCE and check if that element is less than the weight limit, if so then I 'add' it to the list and re-call itself with the remainder of the elements. Each time I add a row and each row sums the Benefit in the first element. The last step is to sort by the Benefit and only Take/Return the highest value and pass that up the chain. There are a number of extra IFs and such because of making sure I don't strip off all the elements and such and in the beginning I check if the array is in a row or a column and swap it to make then columns if needed.
EDIT: I found some bug where the subcall must return an error or something in certain cases and I added an IFERROR() to the check. I also added some text/formatting that will be added only on the top call if that parameter is missing or set to false.
So I found this to be significantly faster in cases when all 5 objects are selected (i.e. the weights are all set below the total limit). I can only imagine that additional item list would also result in time improvements also.
On final note, this solution intentionally did not attempt to return all potential best solutions and just returns a best solution. In another words if there are more than 1 solution that returns the same maximum benefit, this solution will just return one of them.
Mar 30 2023 06:56 AM
Your solution seems to be the equivalent to the VBA solutions I found in research where 5 levels of nested loops were being used. To your credit, I think your solution is a lot more accurate. I played around with stepping through vba solutions line-by-line because I was wondering about the order the checks were being done in. I bagged the vba stuff when a few of the solutions returned 0 for an answer!
Writing recursive functions is tricky to me because one has to write with a sense of anticipation in mind. If I understand your solution, initiallly the checks on number of rows with Items, Weights, and Benefits is done because the array must be checked after the function has been run to see if the TRANSPOSE is needed. It's obvious the number of rows is 1 on the first pass but not on subsequent runs.
Timings
4-digits |
0.3086568 |
0.3444158 |
0.3086526 |
0.3095298 |
0.3091404 |
Mar 30 2023 08:13 AM
Mar 30 2023 08:16 AM
Mar 30 2023 08:45 AM
Mar 30 2023 08:55 AM - edited Mar 30 2023 08:58 AM
Solution
I think I'd prefer to avoid any recursion or LAMBDAs and resort to some good old-fashioned matrix multiplication!
=LET(
α,benefit,
β,weight,
γ,COUNT(α),
δ,MOD(INT(SEQUENCE(2^γ,,0)/2^SEQUENCE(,γ,0)),2),
ε,MMULT(δ,TOCOL(α)),
ζ,MMULT(δ,TOCOL(β)),
η,ζ<=cap,
TEXTJOIN(",",,
FILTER(item,INDEX(δ,MATCH(MAX(FILTER(ε,η)),IF(η,ε),0)),0)
)
)
No doubt this could be refined, though the basic outline is there.
Mar 30 2023 03:30 PM - edited Mar 30 2023 04:05 PM
VERY nice. that delta variable that creates the full combination matrix is what we've been missing! I modified the file and formulas to do some testing and went out to 20 items!!! This version out performed the recursion by more than 2.5:1
I also have to look at the recursion because I'm getting some errors and incorrect values from it ... (or maybe just throw it away since this solution is better in each way)
btw here is the updated formula where I break the last line up to store the "mask" from delta and then can easily apply it to weights, and items
=IF(D10,
LET(
to,NOW(),
α,benefit,
β,weight,
γ,COUNT(α),
δ,MOD(INT(SEQUENCE(2^γ,,0)/2^SEQUENCE(,γ,0)),2),
ε,MMULT(δ,TOCOL(α)),
ζ,MMULT(δ,TOCOL(β)),
η,ζ<=cap,
benefitOut, MAX(FILTER(ε,η)),
ansMask,INDEX(δ,MATCH(benefitOut,IF(η,ε),0),),
itemOut,FILTER(item,ansMask),
weightOut, SUM(FILTER(β, ansMask )),
out,HSTACK((NOW()-to)*24*3600, "Benefit: " & benefitOut, "Weight: " & weightOut, "Items: " & TEXTJOIN(",",,itemOut )),
out
),
"not run")
I also have my poor-mans-timer and a 'toggle' to easily turn the calculation on/off
Mar 31 2023 06:37 AM - edited Mar 31 2023 06:39 AM
@JosWoolley 's method of generating the combination matrix is fascinating for how efficient of a solution it is for a complex problem. From what I gather, it's based somewhat in binary (putting the 0 1 in 0 1 Knapsack!).
This produces the combination matrix before MMULT gets to it:
=MOD(INT(SEQUENCE(2^5,,0)/2^SEQUENCE(,5,0)),2)
I did some playing around with engineering functions.
Formula in K:
=BIN2DEC(I2)
Formula in N:
=DEC2BIN(K2,5)
Mar 31 2023 07:47 AM - edited Mar 31 2023 08:12 AM
This is an interesting approach, but you have a limitation with binary numbers, the maximum number of digits is 10. Other than that it is great for generating the sequence and then splitting it:
=MID(DEC2BIN(SEQUENCE(2^5-1),5),SEQUENCE(,5),1)
Mar 31 2023 09:59 AM
I've seen some workarounds using concatenation to get around that 10 digit limit.
Here's my second take on Knapsack.
CombinMatrix
=LET(
n, COUNT(item),
combinations, SUM(COMBIN(n, SEQUENCE(n))),
MAKEARRAY(combinations, n, LAMBDA(r, c, 1 * MID(DEC2BIN(r, 5), c, 1)))
)
Sheet level formula:
=LET(
filtered, FILTER(CombinMatrix, MMULT(CombinMatrix, TOCOL(weight)) <= cap),
b, MMULT(filtered, TOCOL(benefit)),
stack, HSTACK(filtered, b),
TAKE(DROP(SORT(stack, COUNT(item) + 1, -1), , -1), 1)
)
I've added an Include line:
Mar 31 2023 02:07 PM
I nearly failed the test by forgetting to include a Lambda function!
Identifying the combinations was done through the application of binary numbers (not as Excel text though)
Combinationλ(n)
= LAMBDA(n,
LET(
k, SEQUENCE(1, 2 ^ n),
p, 2 ^ SEQUENCE(n, 1, 0),
SIGN(BITAND(k, p))
)
)
Apr 01 2023 09:47 AM
Apr 08 2023 04:30 AM - edited Apr 08 2023 04:31 AM
A similar problem to the Knapsack is the Cutting Stock Problem:
https://en.wikipedia.org/wiki/Cutting_stock_problem
I see one of the methods for solving the problem is generating all possible combinations. The problem with using this method in Excel is the row limit/memory concerns. The Cutting Stock Problem seems to require the provided data be unstacked before combinations can be generated. However, generating all possible combinations for 200 items, pick 3 doesn't seem do-able. Other approaches involve first solving the problem as best as possible and then re-checking the values to make improvements where needed. Sounds like lots of recursion!
Mar 30 2023 08:55 AM - edited Mar 30 2023 08:58 AM
Solution
I think I'd prefer to avoid any recursion or LAMBDAs and resort to some good old-fashioned matrix multiplication!
=LET(
α,benefit,
β,weight,
γ,COUNT(α),
δ,MOD(INT(SEQUENCE(2^γ,,0)/2^SEQUENCE(,γ,0)),2),
ε,MMULT(δ,TOCOL(α)),
ζ,MMULT(δ,TOCOL(β)),
η,ζ<=cap,
TEXTJOIN(",",,
FILTER(item,INDEX(δ,MATCH(MAX(FILTER(ε,η)),IF(η,ε),0)),0)
)
)
No doubt this could be refined, though the basic outline is there.