SOLVED

Distribution of one number over different categories with specific probabilities per category

Copper Contributor

Hi all,

 

I've been trying to create a formula to distribute a given amount of points over multiple different categories with different probabilities for each individual category.

So for example 10 points over 6 different categories with 50% probability for each point to be added to category 1, 20% for category 2, 10% for categories 3 and 4, and 5% for categories 5 and 6.

 

I appreciate any ideas!

 

8 Replies

@Lowbaer Taking a wild guess now, but perhaps something like this?

Screenshot 2024-05-12 at 16.15.09.png

 

Formula was B3 dragged across to G3. Can be dragged down if needed.

 

Thanks for the idea! It would work perfectly to distribute the points over the categories at different proportions, but it lacks the element of randomness. I'd love to have some sort of random generator, which distributes the points over the categories with different probabilities per category. I've tried to play with if, rand and rank formulas but couldn't get anything to work that way. 

best response confirmed by Lowbaer (Copper Contributor)
Solution

@Lowbaer OK, so you want to generate 6 random percentages that add up to 100%.

In older versions of Excel, use =RAND() in 6 cells and then express each of these as a percentage of their sum. Or if you have a modern Excel version that supports dynamic arrays you could use something like below to spill 6 percentages in one go.

=LET(
    r, RANDARRAY(1, 6),
    s, SUM(r),
    TRANSPOSE(
        INDEX(r, 1, SEQUENCE(6)) / s
    )
)

 

The attached file contains both solutions.

Thats pretty much it! I've taken your formula of the first solution, added a range to the random formula around my desired percentage per category and multiplied the resulting "controlled" random percentages with the total amount of points to be distributed. This does the job, thank you!
Or you can do it like this: generate a random number between 0 and 100 ten times. For each time, if that number lies in the range of [0, 50], add one point to cat1; if that number lies between 50 and 70, add one point to cat2; if that number is between 70 and 80; add one point to cat3; if the random number lies between 80 and 90, add one point to cat4, if the number lies between 90 and 95, add one point to cat5. if the number lies between 95 to 100, add one point cat6. the excel formula would be =FREQUENCY(RANDARRAY(10,1,0,100,FALSE),{50;70;80;90;95})
This is an even more elegant solution, thanks!

@rachel 

I missed the possibility of using FREQUENCY!

@Lowbaer 

It turns out that this is somewhat roundabout as a way of generating frequencies!  What I did was to generate a possible result that assigns each of 10 objects to a category with the relevant probabilities

 

= LET(
    cumulative, SCAN(0, Probability, SUM),
    result, INDEX(Category, XMATCH(RANDARRAY(N),cumulative,1)),
    result
  )

 

Only then did it sink in that you wanted to know the frequency of each category, not the result itself.

My solution was to count the occurrences of each category

 

= LET(
    cumulative, SCAN(0, Probability, SUM),
    result,     INDEX(Category, XMATCH(RANDARRAY(N),cumulative,1)),
    counts,     MAP(Category, LAMBDA(cat, SUM(SIGN(result=cat)))),
    HSTACK(Category, counts)
  )

 

but to go back to the formula that generates the result and calculate frequencies directly is more efficient

 

= LET(
    cumulative, SCAN(0, Probability, SUM),
    HSTACK(Category, DROP(FREQUENCY(RANDARRAY(N),cumulative),-1))
  )

 

Note1:  The 'cumulative' probability gives the array constant in @rachel's formula.

Note2:  DROP is used to remove the count of values exceeding 100%

 

I was just trying to create an event that happens with 50% probability, 20% probability... etc. Continuous uniform distribution was the first thing that comes to my mind. I even try to google whether RANDARRAY indeed generates N numbers of i.i.d X~U[0, 100].....
1 best response

Accepted Solutions
best response confirmed by Lowbaer (Copper Contributor)
Solution

@Lowbaer OK, so you want to generate 6 random percentages that add up to 100%.

In older versions of Excel, use =RAND() in 6 cells and then express each of these as a percentage of their sum. Or if you have a modern Excel version that supports dynamic arrays you could use something like below to spill 6 percentages in one go.

=LET(
    r, RANDARRAY(1, 6),
    s, SUM(r),
    TRANSPOSE(
        INDEX(r, 1, SEQUENCE(6)) / s
    )
)

 

The attached file contains both solutions.

View solution in original post