Feb 15 2021 11:12 PM
Feb 15 2021 11:12 PM
new to this so bear with me.
I got a big Excel sheet with about 5000 row entries and 10 columns of user data.
Things like, Name, Company Name, ID, Zip-Code, you name it, casual stuff.
Since this is user entry based there might be:
1. Spelling mistakes of the Company name
2. Different people use different shortcuts for company names
3. Some company names are used differently ( Dodger Highball Stadium for 1 person and the other enters Highball Stad Dod) for example.
Since i use this data in PowerBI i need the example from 3. to match each others name.
So both are called the actual name, Dodger Highball Stadium to use the data further.
Is there a way with pattern matching or something like this to try to get things to work?
Much appreciated to any information which may lead me to a solution
Thanks in advance!
Feb 16 2021 01:08 AM - edited Feb 16 2021 03:45 AMSolution
@DanielWWW Did some testing with Fuzzy Matching in Power Query. In order to match "Dodger Highball Stadium" with "Highball Stad Dod" you need to set the similarity threshold to 0.18. This means that you are going to end up with a lot of other matches as well. Entries like "Stadium Rd." or "Dodge Motor Company" or "Highba Inc.) will also be matched. Anything with large parts of either of the three words seems to be matched. Thus not very useful.
So, you could perhaps use fuzzy matching with a higher threshold that will allow for small spelling errors. But to capture the more extreme variations, I fear you are going to need human brain power and some manual work to set up a transformation table with all possible references to one and the same location.
Attaching my test file. Perhaps useful.
Feb 16 2021 01:56 AM
Thank you Riny for the fast answer already.
I will try it out and test a little bit and will let you know.
Feb 16 2021 02:13 AM
You can use the fuzzy match in Power Query for this issue, what I would suggest you use the FUZZY match with transformation table where you can input the value FROM and TO handle this scenario.
Regards, Faraz Shaikh | MCT, MIE, MOS Master, Excel Expert
If you find the above solution resolved your query don't forget mark as Official/Best Answer & like it to help the other members find it more.