Spread a value proportionally across several cells (with rounding)

Copper Contributor

Hi, I'm looking to do following but am stuck:

  • Calculate the sum of a set of numbers. (These are laid out in a column.)
  • Calculate a % of that sum.
  • Round that % up to the nearest (in this case) 0.5.
  • Spread that % out proportionally in a column next to the original one (but in increments of - in this case - 0.5).

I need advice from an Excel ninja.  It's probably clearer if I show what I'm doing currently and why it's not right:

MattF1000_3-1726660393813.png

I'm comparing the no.s highlighted in green.  The total for Value 2 is way higher than 25% of the Value 1 total due to multiple rounding-up errors.

 

I realise I could get the two numbers a lot closer if I rounded to the NEAREST no. on each row instead of rounding UP, but I don't think that guarantees me a result where the 2 green no.s match every time. 

 

Is there a way to take that 1st green no. and spread it proportionally across the Value 2 column, rounding each Value 2 row up or down as needed, as per the unscientifically fudged version below? 

MattF1000_4-1726661190016.png

 

Thanks in advance, appreciated :smile:

 

[EDIT] I have a potential workaround but it's not perfect, and I haven't tested it yet:  

  • Round each per-stage Value 2 to the nearest 0.5 (instead of rounding UP to the nearest 0.5) in a reference column. This should (?) get me closer to the overall correct figure.
  • Create a 2nd reference column which is identical to the 1st reference column.
  • Adjust the formula in row 1 of the 2nd reference column, to incorporate the delta between the '25% of Value 1, rounded up to nearest 0.5' value and the total of the 1st reference column.  Basically I use this cell to smooth out any difference in the totals.

I suspect I'll come unstuck here as the value in row 1 is usually 0.5, which doesn't give me much to play with.  I might get away with increasing it slightly but I can't get away with rounding it down to zero (or lower!).

11 Replies
this is not an Excel question as much as a math question. I don't see how to do your magic distribution because I don't see any rule(s) you used. For example if you replace 1-6 all with 4.5 then the total is 27 and the 25% rounded up is still 7. So the 25% of each value is 1.125 so how would you propose to distribute the 7 across the 6 values in that case? If you can explain a fixed rule we can tell you how to make it is Excel but without those rules I don't see a way. I could propose that you round up until you can't anymore so basically in my example all of them round to 1 but then you have 1.0 left over so the 1st 2 would be 1.5 and the rest 1.0. but in your example I couldn't see any rules to apply

@m_tarler Thanks for the response.  Re. this - 'I don't see how to do your magic distribution because I don't see any rule(s) you used. '...  I've shown the rule/formula in the first screenshot so I guess you're talking about the 2nd screenshot?  I haven't used any rules there, I've manually fudged some no.s as evenly as I could manage to show example output.  I'm looking for suggestions on maths rules/excel formulae to automate that output with the fairest/most proportional distribution possible. 

 

Re. this - 'For example if you replace 1-6 all with 4.5 then the total is 27 and the 25% rounded up is still 7. So the 25% of each value is 1.125 so how would you propose to distribute the 7 across the 6 values in that case?'...  That's a good example.  I'd want 2 of the 6 values to equal 1.5, and the remaining 4 values to equal 1.  As for how they're spread, ideally I'd put the higher values at the top of the column (but I'm open to suggestions):

MattF1000_0-1726667443623.png

@MattF1000  OK so here is my attempt for you and I will walk through what I did and logic:

=LET(in, P10:P15, fract, 0.25, minInc, 0.5,
   distribute, LAMBDA(in_rng,fraction,min_inc, LET(
                  n, ROWS(in_rng),
                  target, MROUND(SUM(in_rng)*fraction+min_inc/2,min_inc),
                  in_rnd_down, IFERROR(MROUND(in_rng*fraction-min_inc/2,min_inc),0),
                  count_up, (target-SUM(in_rnd_down))/min_inc,
                  add_array, EXPAND(EXPAND(SEQUENCE(count_up,1,0,0),count_up,2,min_inc),n,2,0),
                  rank_mods, SORTBY(HSTACK(SEQUENCE(n),in_rnd_down),MOD(in_rng*fraction,min_inc),-1),
                  out, TAKE(SORT(rank_mods+add_array),,-1),
                  out)),
   distribute(in,fract,minInc))

so first off I'm assuming you have excel365 and this formula uses new functions like LET and LAMBDA.

so on line 1 are just the inputs for in=range to look at, fract=the fraction to multiply by, minInc=the min round off multiple to round up to.

line 2 defines the LAMBDA which is the whole of the calculation so I would actually recommend you actually insert this LAMBDA into your NAME MANAGER (Formula -> Name Manager then click new name then for 'Name' type "distribute" and then for 'Refers to' keep the "=" and then paste everything from line 2 starting with LAMBDA through line 10 except the last comma 

If you define this LAMBDA in your Name Manager than you can just use

=distribute(P10:P15, 0.25, 0.5)

in the cell instead of all the above text.

As for what this will do:

line 3 just finds how many entries

line 4 finds the target value by sum of all values, then * fraction then round up to nearest increment

line 5 finds all individual values * fraction but round DOWN (some will have to so start there)

line 6 finds difference between sum of all round down and target to find how MANY can be round UP

line 7 creates a matrix to add that min increment to 'round up' those values.  basically lists the min increment the number of times found in line 6

line 8 then ranks and sorts all the round DOWN values based on how far away the actual divisions are from the min increment (i.e. rank by closest to round up as highest priority to round up and closest to round down as lowest priority and equal value stay in same order)

line 9 then adds the 'round up' list of values from 7 to this sorted martix to round up and then will resort the list based on the 1st column that was added so we can return to original order and then output only the actual values (line 10)

I did some minimal testing and it seems to work pretty well 🙂

 

 

@MattF1000 

 

I'm intrigued enough to jump in, hopefully extending the questions asked by my on-line friend @m_tarler : it might help those of us here in the forum if you explained not just the what that you're doing but also the why.

 

In your reply to Matt's question you state "As for how they're spread, ideally I'd put the higher values at the top of the column..."    The use of the word "ideally" caught my eye. What makes that arrangement more "ideal"? What is it about the big picture here that would lead you to say that?

 

Most important, what IS the big picture? What do these numbers represent in the real world? Or are you just playing with numbers in some utterly arbitrary way?

@mathetes 

Hi mathetes. My team uses a cost model spreadsheet to calculate the amount of project management days (Value 2) which should be allocated against a specified amount of engineering days (Value 1). The default policy (at least, once you exceed a certain quantity of engineering days) is to provide a number of project management days based on 25% of the engineering days. 

 

The existing spreadsheet consistently overcooks the no. of project management days, as it calculates them on a per-project-stage (Row#) basis, and rounds up each time. (There's other variables which also impact on the calculation, inc. multiple engineering tabs with their own dedicated project management calcs, but lets not go there.)  I've been updating a few things but this is one problem I've ducked until now.

 

I say 'ideally' as the policy is somewhat arbitrary, is subject to adjustment on a case-by-case basis, isn't being applied correctly at the moment, in 90% of cases the per-stage breakdown of days doesn't technically matter (although explaining this to other parts of the business could cause me problems), and in the case of large projects the rounding errors are less significant.  Currently we either run with the overcooked value, or manually adjust.   Obviously automated is better, and as long as the overall project management effort adds up and *appears* to be spread roughly proportionally per stage, I'm good.

 

The current spreadsheet involves a fair few tabs, tables, named ranges, data validation, XLOOKUPS, pivot tables, one recorded macro with a button to update pivots (no other VBA), that sort of stuff... I'd guess it's Intermediate+ level excel, but there's nothing super advanced that a teammate couldn't understand if they put their mind to it.  It needs to avoid going into super-advanced territory otherwise this stops being a spreadsheet and becomes unsupportable shadow IT.  @m_tarler's suggestion is a work of art but no-one in the team would understand it or be able to troubleshoot it. 

 

I suspect I'm looking for a simple answer where there isn't one, I'd be OK with a more arbitrary/less accurate solution if it was easier.  I've updated my original post with a possible workaround but I suspect the results will be problematic.

@m_tarler that is genius. Unfortunately it's too genius for my scenario (see context in reply to mathetes), but I'm bookmarking this is case I can use it in future. Very much appreciate your response, particularly the walk-through, thankyou for this!

@MattF1000 

I say 'ideally' as the policy is somewhat arbitrary, is subject to adjustment on a case-by-case basis, isn't being applied correctly at the moment, in 90% of cases the per-stage breakdown of days doesn't technically matter ... , and in the case of large projects the rounding errors are less significant.  Currently we either run with the overcooked value, or manually adjust.   Obviously automated is better, and as long as the overall project management effort adds up and *appears* to be spread roughly proportionally per stage, I'm good.

 

I'm not so sure, given what you wrote earlier in that same paragraph, that automation is obviously better. Especially if the number of projects is as finite as what you have shown in this series of messages. What could be automated (via some conditional formatting, perhaps) is the highlighting of discrepancies that need to be addressed, facilitating manual adjustment. Until such time as you've built AI into the process, being able to consider non-quantitative dimensions and make manual adjustments could be a good thing.

 

The current spreadsheet involves a fair few tabs, tables, named ranges, data validation, XLOOKUPS, pivot tables, one recorded macro with a button to update pivots (no other VBA), that sort of stuff... I'd guess it's Intermediate+ level excel, but there's nothing super advanced that a teammate couldn't understand if they put their mind to it.

 

The attached article is nearly two decades old, but contains some principles of spreadsheet design that you might find helpful. [I say this solely on the basis of your quick description, so you're surely free to disregard this and any other unsolicited advice.] Excel, as you know, is incredibly powerful. As such, it's not uncommon to see people designing things that mis-use (or even abuse) that power, making tasks more complicated than they need to be, designing tables and tabs that are based on the multiple ledger sheets or PERT/GANTT charts that were used in "the old days" before 1-2-3 and Excel. I don't know that this applies to you, but reading about good design is never an exercise in futility.

@mathetes Well in the spirit of "mis-use (or even abuse) that power, making tasks more complicated than they need to be" I have attached a sample spreadsheet using both the LAMBDA I proposed and using 2 helper columns that basically re-do the functionality of that Lambda.  So the 3 columns (2 helper and final) are:

25% down - this will take the 25% and round DOWN to nearest increment

formula: =IFERROR(MROUND(B2:B7/4-0.25, 0.5),0)

remainder - this will take the remainder after 25% to the nearest increment and applies a small offset based on the row number just so each should have a unique RANK

formula: =MOD(B2:B7/4,0.5)-ROW(B2:B7)/10000000

25% adj - this is the final value where if its remainder is in the top X rank it will round UP

formula: =G2#+0.5*(RANK(H2#,H2#,0)<=(B10-SUM(G2#))/0.5)

 

note in this workbook I added the 'distribute' function into the name functions so in column E you can see how that cell equation is simply =distribute(B2:B7,0.25,0.5)

@mathetes 

Thanks mathetes. I'm not in a position to roll it out (yet?) but I've built my own replacement spreadsheet from scratch (based loosely on someone else's well designed xls that we used at a previous company). It's a lot more simplified/streamlined vs the current one (single ledger sheet which removes the need for a load of workarounds/complexity etc)... looks like I'm on the right lines but that article will be a massive help/peace of mind.

Conditional formatting with an 'insert your own manual no.' column next to an unrounded calculation column is definitely an option. It would require an additional manual check each time the figures are updated but that might be workable. Really appreciate your responses, thanks.

@m_tarler  this is great, thanks!  I can definitely use this (at least, I can use a version of the helper column version).  I've spent my entire Sunday trying to take this apart and rebuild it so I can better understand it... with partial success...

 

I've dumbed it down a little (see Calc 2 in attached) by amending the formulae and breaking them out into as many separate columns as I could.  This makes it easier for me to follow.  If I can beg a little more help:

 

Does what I've done make sense?


Have I missed anything by swapping the MROUND function for FLOOR?  (It seems a lot simpler.  And I don't think I want the IFERROR wrap; this would cause an accidental minus input to default to a zero result, and I want an error message to be displayed instead.)

 

This was my introduction to using # cell references with ranges and spills (thanks!!)...  I've dumbed things down a little more in Calc 3 in attached) with dynamic arrays replaced with direct cell/range references for comparison.  The #/range/spill versions are a lot more elegant than Calc 3, but are there any other strong reasons to avoid using Calc3?  (The priorities are stability, simplicity, and performance, in that order.  I suspect the dynamic arrays increase stability?  NB I might be varying the no. of project stages in a future version but for now they'll be fixed at 6.) 

 

Should I be converting this into a Table?

 

I'm planning to save the calculation in a protected worksheet to prevent accidents.   I've had a test run at protecting the sheet and it seems fine, are you aware of any potential issues? 


And now the annoying bit I didn't mention before! I'm trying to work around the following requirements, which mess up a very nice solution:

  • For historical and process reasons I should always have a value >0 (ideally a minimum of 0.5) in the final column for the first row/project stage. 
  • IDEALLY (!) I should always have a value >0 (ideally a minimum of 0.5) in the final column for ANY row/project stage where engineering days > 0. 

The additional purple-highlighted workaround columns fulfil these requirements, but the result gets progressively less accurate with small no.s of engineering days in each row, as the rounding errors creep back in.  NB:  

  • If total engineering days <=5, a different policy kicks in and I can apply an =IFS to apply that policy instead of the ~25% policy, making the problem go away.
  • But for example with total engineering days =5.5, with engineering days in every row, I'm back to square 1 with the rounding issues (see example saved in Calc 2). 

So... I'm considering a couple of workaround options:

  1. Manually force the Row 1 result to be 0.5 days (it'll never be higher than this anyway), and allow some other row results to round down to zero.
  2. Change my per-row rounding to 0.25 increments.  As long as Row 1 >0, and as long as the overall total rounds up to a multiple of 0.5, I should get away with this.

I suspect other options/variants are available and may be easier.  What would you recommend, and how would you adjust the xls to implement it?


Again, I really appreciate your help (and time), thanks.

I will try to answer best I can.
I think FLOOR is an improvement. I forgot that it will do multiples (I always think of it for integers)
I think using the # reference is better because it will auto expand/contract if the spill range changes. also note that excel will automatically replace a selected range that matches a spill range with the # notation. But in the end I see it has no other improvement and a direct reference is fine (until the size changes)
as for making it a table I don't think you will be able to with spill equations. you will have to change them to not spill
Protecting a sheet is good idea when other will be using it and you don't want them to accidentally break it. But not that the protection, no matter how clever you tweak it, should never be used to hide or protect any truely confidential or private information (I don't think that is an issue here).
an alternative way to make a min value is using MAX( [min val] , [value] )
yes as soon as you start requiring round up you will have problems (that is where this post started). and I've never been a fan of "never" but you could make the first one a MIN of 0.5 and then let the rest fall as they may. You might also be able to encourage the 0s to inc up to 0.5 over others by adding a conditional onto the remainder column that if it is a 0 then add 0.5 to the remainder and then those will always be prioritized over the others but if all are 0 then it will prioritize the 0s that are larger and then finally in order from top to bottom. And finally going to 0.25 increments might help but the rules will still apply and probably still be needed - basically yes the round up will be less but the undershoot is less also and you will still need on average 1/2 over and 1/2 under.