Forum Discussion
Combine two files remove duplicate row when two columns have same value
I have a report that runs daily from our ERP that contains shipping history for current month. I need to combine the report that was run today with the report that was run yesterday. My issue is this report is always run from the 1st day of the month to the current day of the month every day. This means there is always duplicates on the report. I would be simple if I could just pick one column and tell excel to remove duplicates but this is a shipping report by line item (material) so there is not just one column that has unique duplicates. There is two columns that I can use to call out the duplicates for each row but just don't know how to do this. The two columns are Delivery and Material. For example the four items at bottom of the report that shipped to Bill's Farm need to be removed since they are duplicates from the top of the report. I want to do this automatically in the background as this is going to be used in a power bi dashboard. I am now just working on building this out and considering using Power Automate to merge the files together in the background.
Thanks for your help
Delivery | Act. GdsDate | Name of the ship-to party | Material |
88146226 | 8/16/2024 | Bill's Farm | 23504PTM |
88146226 | 8/16/2024 | Bill's Farm | 23605PTM |
88146237 | 8/16/2024 | Freddy's Garage | 23502PTM |
88146226 | 8/16/2024 | Bill's Farm | 23504PTM |
88146226 | 8/16/2024 | Bill's Farm | 23605PTM |
3 Replies
As variant you may keep all daily reports in some folder / SharePoint folder; use Power Query with From (SharePoint) Folder connector to combine them, apply Remove Duplicates to the result.
If result is only for Power BI dashboard, perhaps you don't need to build resulting Excel file. Dataflow is more practical variant. You may refresh it on schedule or/and by Power Automate.
- mrgamaddenCopper Contributor
SergeiBaklan i am not aware of how to use the that you are suggesting in SharePoint or in PBI that would allow me to remove duplicates from two columns. I’m only aware of it being able to remove duplicates from one column. Please on how to remove duplicates as i Originally asked in my post
Power Query for it
let Source = Excel.CurrentWorkbook(){[Name="Source"]}[Content], #"Removed Duplicates" = Table.Distinct(Source) in #"Removed Duplicates"
which gives