Forum Discussion
3zafandi
May 31, 2023Copper Contributor
Merging duplicate entries without loosing data
So I have an excel file with over 4000 entries. Each entry has an ID on the fourth column "D", and base on this column there are many entries with the same ID (aka duplicates). Each copy of duplicates may have columns with the same data and columns with different data (ex: some are empty cells in one copy, but filled in another copy).
So my objective is to merge each duplicated row (based on ID) without losing any data. I don't want to just delete the duplicates and for example keep the first copy of each set, because the second or third copy may have data or columns not filled in the first copy.
Also if it is the same exact data (ex: lets say in column labeled "degree" both copy 1 and 2 have data "Bachelor") only write it once, otherwise merge them with a comma or semicolon in between.
Is this at all possible to do without having to manually go through each entry?
Thanks
- Faisal1775Brass Contributor
When you have duplicate entries in a dataset and you want to merge them without losing any data, there are a few steps you can follow:
Identify the duplicate entries: Start by identifying the duplicate entries in your dataset. Look for fields or criteria that determine the duplication.
Determine the desired merged result: Decide how you want to merge the duplicate entries. Consider which fields should be prioritized or combined, and how the merged entry should look.
Create a new entry: Create a new entry in your dataset to hold the merged data. This entry will combine the information from the duplicate entries.
Merge the data: Copy the relevant data from each duplicate entry into the new merged entry. You may need to make decisions on how to handle conflicts or inconsistencies between the duplicate entries.
Preserve unique data: If there are fields or data unique to each duplicate entry, consider appending that data to the merged entry. You may choose to include it as separate fields or consolidate it in a meaningful way.
Update references: If other records or systems reference the duplicate entries, make sure to update those references to point to the new merged entry.
Delete duplicate entries: Once you have merged the data and verified the accuracy of the merged entry, you can safely delete the duplicate entries from your dataset.
It's important to exercise caution and create backups of your data before performing any merging or deletion operations. This ensures that you have a safety net in case any unexpected issues arise during the process.
- LorenzoSilver Contributor
You probably have an option with Power Query - see attached sample that shouldn't be far from what you expect assuming I understood it all
- 3zafandiCopper ContributorYes this exactly what I need but on a larger scale. How do I use it?
- LorenzoSilver Contributor
How do I use it?
#1 Make sure you run Excel > 2013 on Windows
#2 Download the attached file
#3 Rename the columns in the blue Table (in sheet Input) as you want but column ID and add any additional columns you need
#4 Copy/paste your data in the blue Table
#5 Switch to sheet Output
#6 Right-click in the green Table > Refresh