SOLVED

Filtering Away Certain Duplicate Values in Pivot Data Source

Iron Contributor

Hello,

I'm using Pivot Table and Slicer to filter data from a raw data source. Our employees work rotating shifts. Between those shifts there are overlap hours. I created a file to track their performance during those 9 hours of shifts.

 

Morning shift: 08:00 - 17:00

Mid Shift : 16:00 - 01:00

Night Shift : 23:30 - 08:30

 

I'm using Slicer to select between shifts.

 

Our tool provides the raw data I need without separating shifts. So for overlap hours (16:00 - 17:00) we get duplicate data resulting in employees appearing in both morning and mid shift in the Pivot Table. What I want is to Excel to ignore the second data set for duplicate entries and not include them in the pivot. Is this possible? Or is there any other way?

 

Am I clear enough? If not I can provide more info.

 

I'm attaching the excel file I'm working on for you to see what I'm doing.


17 Replies

@kheldar 

 

Not with your dataset since it cannot actually tell which employee belongs to which actual shift, i.e Employee1 might belong to the morning shift most hours and Employee2 might belong to the mid shift most hours but because of the overlap your dataset only indicates which employees are in overlapping shifts but not actually indicate which actual shift the belong to.

@Yea_So 

 

Thank you for answering. I edited the file including shifts for each employee and made it more concise and readable. Including only 1 set of data for 1 day. Also someone else suggested me using a helper to identify overlap hours and exclude them using slicer but It's not working as intended. It's not my formula.

@kheldar 

 

on the screen shot below how can anyone tell which ones the duplicates are for that employee:

Yea_So_0-1630825738640.png

 

This is very it gets complicated I believe. Employee 1 is on mid shift. so morning and night are duplicates. So If Moderation task of queue 1 is less than lets say %30 of other queue 1s they are duplicate.
If I were to make a wild guess on the screenshot image
I would pick row 2 and row 160 to be the actual shifts, the rest are overlaps.
What are your thoughts on my guess?
Employee 1 is on mid shift. So he appears on both morning and night shift overlaps which I want to exclude him from. I don't know how to write the formula but the logic comes to my mind is that take the max of these values, including only max ignore rest.
best response confirmed by allyreckerman (Microsoft)
Solution

@kheldar 

 

Changed the Name 1 to Name01

Sorted the dataset by Name and No. of Initial Moderation Tasks

created filter columns rank, count, subnum, ref

did a power query on dataset including filter columns

created 2 reference queries Top 2 and Bottoms queries

filtered out bottoms on Top 2 reference query

filtered out Top2 on Bottoms reference query

Yea_So_0-1630868613045.png

Yea_So_1-1630868632786.png

 

cheers

Man that looks awesome. Could you share the file as well?

see the bottom of my previous post

Your next issue would be how to distribute the bottoms to the top 2 and which the bottoms belong to

 

what formula did you use for column Avg. Handling Time(s)

Many thanks man, I'm trying to figure out your logic. I shouldn't ask more of you. There is no formula for AHT. Raw data source provides that.
you better check your dataset since I found straggler formulas in that column by selecting the column and do a find all "="

@kheldar 

the logic is:
rank column = summed up employee01's No. of Initial Moderation Tasks then got the ratio of each Initial Moderation Tasks to get the % so it can be ranked

=Table1[@[No. of Initial Moderation Tasks]]/SUMIF(Table1[Name],AB2,Table1[No. of Initial Moderation Tasks])

count = get the count of instances of each employee so each row can be assigned a subnumber so then the top 2 can be separated from the bottom Ns

some employees have 1, some have 2, some have 3, some have 4, some have 5 and some have 6 instances

So in order to get the top 2 I filtered out the subnumber 3,4,5, and 6

and vice versa for the bottoms 

 

The Top 2 are the actual shifts

The Bottoms are the overlaps

 

To figure out the distribution of the overlaps,

add all the No. of Initial Moderation Tasks in the Bottoms Sheet which will be the 100%

 

Add the Avg. Handling Time(s) in the Bottoms sheet then divide it by 60 seconds 

Amendment: 

Add the Avg. Handling Time(s) in the Bottoms sheet then divide it by 60 minutes

 

then divide the sum of No. of Initial Moderation Tasks in the Bottoms Sheet
by the quotient Avg. Handling Time(s) in the Bottoms sheet 

So now you have the No. of Initial Moderation Tasks in the Bottoms Sheet per minute

Multiply it by the overlap Hours/Minutes to get the distribution for each shift

 

then calculate ratio the shift hours:

Morning shift: 08:00 - 17:00

Mid Shift : 16:00 - 01:00

Night Shift : 23:30 - 08:30

 

Morning to Mid Shift is 1 hour

Mid Shift to Night Shift is 30 minutes

Night Shift to Morning Shift is 30 minutes

Thank you man! I'm not at the level of fully comprehending your explanation yet, however, I'll consult a friend. I'll hit you up again, if we can't figure it, of course if that's okay for you.
Yes do consult your friend to get a second opinion

cheers

@kheldar 

 

The bottoms calculation for the distribution is just an average ballpark since that is what your current dataset allows (an educated guess), unless you have a more granular dataset which specifically gives you the actual No. of Initial Moderation Tasks in the Bottoms Sheet stats for each shift.

Don't get me wrong. It's not about getting a second opinion. It's about understanding the way you constructed this :)

@kheldar 

 

I'm glad that helped you

1 best response

Accepted Solutions
best response confirmed by allyreckerman (Microsoft)
Solution

@kheldar 

 

Changed the Name 1 to Name01

Sorted the dataset by Name and No. of Initial Moderation Tasks

created filter columns rank, count, subnum, ref

did a power query on dataset including filter columns

created 2 reference queries Top 2 and Bottoms queries

filtered out bottoms on Top 2 reference query

filtered out Top2 on Bottoms reference query

Yea_So_0-1630868613045.png

Yea_So_1-1630868632786.png

 

cheers

View solution in original post