Forum Discussion
Combine Similar Matching Rows Into One and Add Column Using Power Query
I have two data sources (XLSX and CSV) that have been loaded into Power Query, had transformations applied to them, and then merged based on the Asset and Path columns. Asset and Hits are a part of the CSV file and Title and Path are from the XLSX. The CSV file cuts off data after 35 characters so the cell might only contain partial information. There is also dirty data that has been cleaned up, resulting in duplicate row items, even though they are similar. See the example below:
The two Assets are the same but because of the cut off and dirty data, they are listed as two different items. What I am trying to do is combine these duplicates into one row, with the Hits added together, so the merge finds the corresponding path.
I tried to keep only the first 66 characters but there are other assets in this list that rely on the characters past 66 to be unique.
1 Reply
Take this:
Step 1: Load and Clean Both Sources
- Load both CSV and XLSX into Power Query.
- Apply your existing transformations (trimming, cleaning, etc.).
Step 2: Use Fuzzy Merge
If the Asset values are similar but not identical:
- Use Fuzzy Merge between CSV and XLSX on Asset and Path.
- Enable Fuzzy Matching and adjust the similarity threshold.
- This helps match truncated or dirty Asset names.
Step 3: Group and Aggregate
- In the CSV query:
- Go to Transform > Group By
- Group by Path and a cleaned version of Asset (e.g., first 35 characters or a normalized key)
- Use Sum on Hits
- You can also create a custom grouping key:
- Text.Lower(Text.Start([Asset], 35))
This helps group similar truncated assets.
Step 4: Merge with XLSX
- Merge the grouped CSV with the XLSX query on Path.
- Expand Title and other columns as needed.
Step 5: Final Cleanup
- Remove intermediate columns (like grouping keys).
- Rename columns for clarity.
- Ensure no duplicates remain.