Forum Discussion
bsrujan022
Jul 05, 2023, Copper Contributor
Quintile division help
Hi team, Hope you're doing great. So, I have a question. I'm trying to automate a similar dataset, where I try to categorize the skillset based on their scores while making sure that I only calc...
- Jul 06, 2023: I'm sorry, I'm a bit confused. Are you saying the formulas are biased toward the middle? I'm also not sure what you mean by the pivot table making colors. You can use Conditional Formatting to color cells based on many things, including the top X%. In pivot tables you can make a calculated field and potentially use that as a way to create columns. As for the formulas I created, they don't care how many cells or how many categories there are. As for being 'biased' toward the middle or the high end, I don't understand what you mean or want. Rank-based grouping will only be 'biased' by one item, or by several items sharing the same value, which shouldn't be noticeable or significant on a large set. Value-based grouping will depend entirely on the value distribution.
mtarler
Jul 05, 2023, Silver Contributor
bsrujan022, the issue I have here is understanding how you want to break them into A, B, C, D. For example, I did 2 different versions in the attached workbook. In one case I used the built-in PERCENTRANK.EXC function, and in the other I used my own calculation based on the value and the max-min range. The methods yield different results, so the choice depends on how you want to define the groups.
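To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not the attached worksheet: the rank-based function only approximates the idea behind PERCENTRANK.EXC as (count of smaller values + 1) / (n + 1) and ignores Excel's rounding, the value-based function uses (value - min) / (max - min), and the 0.25 / 0.50 / 0.75 cutoffs and the sample scores are assumptions made for illustration.

```python
def rank_percent(scores):
    """Rank-based position: (number of smaller scores + 1) / (n + 1)."""
    n = len(scores)
    return [(sum(s < x for s in scores) + 1) / (n + 1) for x in scores]


def value_percent(scores):
    """Value-based position: (score - min) / (max - min)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so there is no range to scale by.
        return [1.0 for _ in scores]
    return [(x - lo) / (hi - lo) for x in scores]


def to_grade(p):
    """Map a 0..1 position to A (top quarter) through D (bottom quarter).

    The cutoffs are assumed, not taken from the worksheet.
    """
    if p >= 0.75:
        return "A"
    if p >= 0.50:
        return "B"
    if p >= 0.25:
        return "C"
    return "D"


# Hypothetical scores for one skill, with one outlier at the top.
scores = [10, 20, 30, 40, 100]
rank_grades = [to_grade(p) for p in rank_percent(scores)]
value_grades = [to_grade(p) for p in value_percent(scores)]
print(rank_grades)   # ['D', 'C', 'B', 'B', 'A']
print(value_grades)  # ['D', 'D', 'D', 'C', 'A']
```

With an outlier in the data, the rank-based version spreads people fairly evenly across the groups, while the value-based version pushes most of them into the lower groups. That is why the two methods can disagree on the same data.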
- bsrujan022, Jul 05, 2023, Copper Contributor: Hi mtarler, thank you. I was looking to categorize the names for each skill into quintiles based on their scores. In a pivot table, conditional formatting has an option to highlight the Top 10%, 20%, and so on. Can I do something there so that I get A, B, C, D instead of colors?
- mtarler, Jul 06, 2023, Silver Contributor: Hi. Did you look at the attached worksheet? Do either or both of those options work for you? If not, what did you want to be different? My point is that both of those solutions do what you ask, but as you noted, the skill 'Draw' has 5 entries, so how do they get spread into 4 categories? How does the top 25% get defined? Is it based on rank or value? Is it an inclusive or exclusive set? If you don't care about those nuances, then either solution will work for you, but the results will be slightly different.
- bsrujan022, Jul 06, 2023, Copper Contributor: Hi mtarler, I understand. I checked both of your solutions and proposed them to the team. It seems they want to weight the grouping toward the highest scorers. Since this was just a sample, you're not seeing multiple entries; the actual dataset has about 45 different skills and over 90,000 rows of data. When I classified that data with both of your formulas, the categories came out somewhat biased toward the middle groups rather than the top scorers. I then remembered that in a pivot table we have the option to format the top 25% or bottom 25%, and likewise for 50%. If we put the rules in order, we'd get the highest scorers in the top 25%, then the next group at 25 to 50%, and so on. It also works when new skills are added, and it stays dynamic as new rows come in.
So, is it possible, let's say, if the top 25% are formatted green, to create a new column in the pivot that says green is A, yellow (26-50%) is B, orange (51-75%) is C, and the rest is D? Just checking, since that would make it easy and format-friendly even if we add new columns to the raw data in the future, because a lot of people use it.
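One way to read this request, sketched in Python rather than as a pivot table feature: within each skill, rank people from the highest score down and assign A to the top quarter, B to the next, and so on. This is only an illustration under stated assumptions. The row layout `(name, skill, score)`, the function name `grade_by_skill`, and the sample names are hypothetical, and the tie handling (people with equal scores share a grade) is a choice made here, not something confirmed to match Excel's conditional formatting rules.

```python
from bisect import bisect_right
from collections import defaultdict


def grade_by_skill(rows):
    """Grade each (name, skill, score) row A-D within its own skill.

    A person's grade depends on how many scores in the same skill are
    strictly higher than theirs, so tied scores always share a grade.
    """
    by_skill = defaultdict(list)
    for _name, skill, score in rows:
        by_skill[skill].append(score)
    for scores in by_skill.values():
        scores.sort()  # ascending, so bisect can count higher scores quickly

    graded = []
    for name, skill, score in rows:
        scores = by_skill[skill]
        above = len(scores) - bisect_right(scores, score)
        # Integer arithmetic avoids float rounding at the 25/50/75% edges.
        quarter = (above * 4) // len(scores)
        graded.append((name, skill, score, "ABCD"[quarter]))
    return graded


# Hypothetical sample: 'Draw' has 5 entries, 'Sing' has 4.
rows = [
    ("Ann", "Draw", 90), ("Bob", "Draw", 80), ("Cy", "Draw", 70),
    ("Dee", "Draw", 60), ("Eve", "Draw", 50),
    ("Ann", "Sing", 9), ("Bob", "Sing", 7), ("Cy", "Sing", 5), ("Dee", "Sing", 3),
]
grades = [g for _n, _s, _v, g in grade_by_skill(rows)]
print(grades)  # ['A', 'A', 'B', 'C', 'D', 'A', 'B', 'C', 'D']
```

With this rule, the extra person in a skill of 5 lands in A, so the grouping leans toward the top scorers. Because grades are computed per skill from whatever rows are present, new skills and new rows are picked up without changing the logic.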