Filtering Away Certain Duplicate Values in Pivot Data Source

kheldar · ‎Sep 01 2021

Hello,

I'm using Pivot Table and Slicer to filter data from a raw data source. Our employees work rotating shifts. Between those shifts there are overlap hours. I created a file to track their performance during those 9 hours of shifts.

Morning shift: 08:00 - 17:00

Mid Shift : 16:00 - 01:00

Night Shift : 23:30 - 08:30

I'm using Slicer to select between shifts.

Our tool provides the raw data I need without separating shifts. So for overlap hours (16:00 - 17:00) we get duplicate data resulting in employees appearing in both morning and mid shift in the Pivot Table. What I want is to Excel to ignore the second data set for duplicate entries and not include them in the pivot. Is this possible? Or is there any other way?

Am I clear enough? If not I can provide more info.

I'm attaching the excel file I'm working on for you to see what I'm doing.

Yea_So · ‎Sep 01 2021

@kheldar

Not with your dataset since it cannot actually tell which employee belongs to which actual shift, i.e Employee1 might belong to the morning shift most hours and Employee2 might belong to the mid shift most hours but because of the overlap your dataset only indicates which employees are in overlapping shifts but not actually indicate which actual shift the belong to.

kheldar · ‎Sep 02 2021

@Yea_So

Thank you for answering. I edited the file including shifts for each employee and made it more concise and readable. Including only 1 set of data for 1 day. Also someone else suggested me using a helper to identify overlap hours and exclude them using slicer but It's not working as intended. It's not my formula.

Yea_So · ‎Sep 05 2021

@kheldar

on the screen shot below how can anyone tell which ones the duplicates are for that employee:

kheldar · ‎Sep 05 2021

This is very it gets complicated I believe. Employee 1 is on mid shift. so morning and night are duplicates. So If Moderation task of queue 1 is less than lets say %30 of other queue 1s they are duplicate.

Yea_So · ‎Sep 05 2021

If I were to make a wild guess on the screenshot image
I would pick row 2 and row 160 to be the actual shifts, the rest are overlaps.
What are your thoughts on my guess?

kheldar · ‎Sep 05 2021

Employee 1 is on mid shift. So he appears on both morning and night shift overlaps which I want to exclude him from. I don't know how to write the formula but the logic comes to my mind is that take the max of these values, including only max ignore rest.

allyreckerman · ‎Sep 05 2021

@kheldar

Changed the Name 1 to Name01

Sorted the dataset by Name and No. of Initial Moderation Tasks

created filter columns rank, count, subnum, ref

did a power query on dataset including filter columns

created 2 reference queries Top 2 and Bottoms queries

filtered out bottoms on Top 2 reference query

filtered out Top2 on Bottoms reference query

cheers

kheldar · ‎Sep 05 2021

Man that looks awesome. Could you share the file as well?

Yea_So · ‎Sep 05 2021

see the bottom of my previous post

Your next issue would be how to distribute the bottoms to the top 2 and which the bottoms belong to

what formula did you use for column Avg. Handling Time(s)

kheldar · ‎Sep 05 2021

Many thanks man, I'm trying to figure out your logic. I shouldn't ask more of you. There is no formula for AHT. Raw data source provides that.

Yea_So · ‎Sep 05 2021

you better check your dataset since I found straggler formulas in that column by selecting the column and do a find all "="

Yea_So · ‎Sep 05 2021

@kheldar

the logic is:
rank column = summed up employee01's No. of Initial Moderation Tasks then got the ratio of each Initial Moderation Tasks to get the % so it can be ranked

=Table1[@[No. of Initial Moderation Tasks]]/SUMIF(Table1[Name],AB2,Table1[No. of Initial Moderation Tasks])

count = get the count of instances of each employee so each row can be assigned a subnumber so then the top 2 can be separated from the bottom Ns

some employees have 1, some have 2, some have 3, some have 4, some have 5 and some have 6 instances

So in order to get the top 2 I filtered out the subnumber 3,4,5, and 6

and vice versa for the bottoms

The Top 2 are the actual shifts

The Bottoms are the overlaps

To figure out the distribution of the overlaps,

add all the No. of Initial Moderation Tasks in the Bottoms Sheet which will be the 100%

~~Add the Avg. Handling Time(s) in the Bottoms sheet then divide it by 60 seconds~~

Amendment:

Add the Avg. Handling Time(s) in the Bottoms sheet then divide it by 60 minutes

then divide the sum of No. of Initial Moderation Tasks in the Bottoms Sheet

by the quotient Avg. Handling Time(s) in the Bottoms sheet

So now you have the No. of Initial Moderation Tasks in the Bottoms Sheet per minute

Multiply it by the overlap Hours/Minutes to get the distribution for each shift

then calculate ratio the shift hours:

Morning shift: 08:00 - 17:00

Mid Shift : 16:00 - 01:00

Night Shift : 23:30 - 08:30

Morning to Mid Shift is 1 hour

Mid Shift to Night Shift is 30 minutes

Night Shift to Morning Shift is 30 minutes

kheldar · ‎Sep 05 2021

Thank you man! I'm not at the level of fully comprehending your explanation yet, however, I'll consult a friend. I'll hit you up again, if we can't figure it, of course if that's okay for you.

Yea_So · ‎Sep 05 2021

Yes do consult your friend to get a second opinion

cheers

Yea_So · ‎Sep 05 2021

@kheldar

The bottoms calculation for the distribution is just an average ballpark since that is what your current dataset allows (an educated guess), unless you have a more granular dataset which specifically gives you the actual No. of Initial Moderation Tasks in the Bottoms Sheet stats for each shift.

kheldar · ‎Sep 05 2021

Don't get me wrong. It's not about getting a second opinion. It's about understanding the way you constructed this :)

Yea_So · ‎Sep 16 2021

@kheldar

I'm glad that helped you

allyreckerman · ‎Sep 05 2021

@kheldar

Changed the Name 1 to Name01

Sorted the dataset by Name and No. of Initial Moderation Tasks

created filter columns rank, count, subnum, ref

did a power query on dataset including filter columns

created 2 reference queries Top 2 and Bottoms queries

filtered out bottoms on Top 2 reference query

filtered out Top2 on Bottoms reference query

cheers

View solution in original post

Filtering Away Certain Duplicate Values in Pivot Data Source

Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Re: Filtering Away Certain Duplicate Values in Pivot Data Source

Products (50)

Special Topics (27)

Video Hub (462)

Most Active Hubs

Most Active Hubs

Video Hub

Filtering Away Certain Duplicate Values in Pivot Data Source