Find and replace comma separated values

Copper Contributor

Hello Everyone,

 

I was provided an Excel sheet that has users in one column and the companies they belong to in another column, some with more than one group.

 

usernamecompany
john.smithThe Tire Store, Garage  Hut, Nova Barns
jane.doeFree Store

The issue is the company names are not the actual names but partial name. I have another Excel sheet with the right names and their company codes.

CompanyCodeCompanyName
TTSThe True Tire Store
RGHThe Real Garage Hut
TFSThe Free Store

I would like to be able to do a fuzzy lookup and replace of the values from the first table in the values from the second table. I was thinking of a fuzzy lookup and replace or something along those lines.

Any assistance is much appreciated.

3 Replies

@Jeffrey T 

 

You can try using the Fuzzy matching option in the Merge Queries function in Power Query.

 

I created two queries for your tables - data and lookup (respectively).

 

Then this:

 

let
    Source = data,

    split 
        = Table.ExpandListColumn(
            Table.TransformColumns(
                Source,
                {{"company", Splitter.SplitTextByDelimiter(", ")}}),
            "company"
        ),

    fuzzy 
        = Table.FuzzyNestedJoin(
            split,{"company"}, 
            lookup, {"CompanyName"}, 
            "lookup",
            JoinKind.LeftOuter, 
            [IgnoreCase=true, IgnoreSpace=true, NumberOfMatches=1, Threshold=0.5]
        ),

    expand 
        = Table.ExpandTableColumn(
            fuzzy,
            "lookup",
            {"CompanyCode", "CompanyName"},
            {"CompanyCode", "CompanyName"}
        ),

    mergedcompany 
        = Table.AddColumn(
            expand,
            "mergedCompanyName",
            each if [CompanyName] = null then [company] else [CompanyName]
        ),

    group 
        = Table.Group(
            mergedcompany,
            {"username"},
            {
                {"Original", each Text.Combine([company],", ")},
                {"Replaced", each Text.Combine([mergedCompanyName],", ")}
            }
        )
in
    group

 

This is what the result looks like:

 

flexyourdata_1-1671127066656.png

 

 

 

Hi @Jeffrey T 

 

A possible approach with Power Query function Table.FuzzyNestedJoin and options:
    IgnoreCase=true, IgnoreSpace=true, Threshold=0.5

If more than 1 match the query keeps the one with highest similarity

 

Sans titre.png

Just realized there's a little mistake (unecessary step) in the query I earlier shared. This doesn't seem to matter though...