Forum Discussion

Jeffrey T's avatar
Jeffrey T
Copper Contributor
Dec 15, 2022

Find and replace comma separated values

Hello Everyone,

 

I was provided an Excel sheet that has users in one column and the companies they belong to in another column, some with more than one group.

 

usernamecompany
john.smithThe Tire Store, Garage  Hut, Nova Barns
jane.doeFree Store

The issue is the company names are not the actual names but partial name. I have another Excel sheet with the right names and their company codes.

CompanyCodeCompanyName
TTSThe True Tire Store
RGHThe Real Garage Hut
TFSThe Free Store

I would like to be able to do a fuzzy lookup and replace of the values from the first table in the values from the second table. I was thinking of a fuzzy lookup and replace or something along those lines.

Any assistance is much appreciated.

  • Lorenzo's avatar
    Lorenzo
    Silver Contributor
    Just realized there's a little mistake (unecessary step) in the query I earlier shared. This doesn't seem to matter though...
  • flexyourdata's avatar
    flexyourdata
    Iron Contributor

    Jeffrey T 

     

    You can try using the Fuzzy matching option in the Merge Queries function in Power Query.

     

    I created two queries for your tables - data and lookup (respectively).

     

    Then this:

     

    let
        Source = data,
    
        split 
            = Table.ExpandListColumn(
                Table.TransformColumns(
                    Source,
                    {{"company", Splitter.SplitTextByDelimiter(", ")}}),
                "company"
            ),
    
        fuzzy 
            = Table.FuzzyNestedJoin(
                split,{"company"}, 
                lookup, {"CompanyName"}, 
                "lookup",
                JoinKind.LeftOuter, 
                [IgnoreCase=true, IgnoreSpace=true, NumberOfMatches=1, Threshold=0.5]
            ),
    
        expand 
            = Table.ExpandTableColumn(
                fuzzy,
                "lookup",
                {"CompanyCode", "CompanyName"},
                {"CompanyCode", "CompanyName"}
            ),
    
        mergedcompany 
            = Table.AddColumn(
                expand,
                "mergedCompanyName",
                each if [CompanyName] = null then [company] else [CompanyName]
            ),
    
        group 
            = Table.Group(
                mergedcompany,
                {"username"},
                {
                    {"Original", each Text.Combine([company],", ")},
                    {"Replaced", each Text.Combine([mergedCompanyName],", ")}
                }
            )
    in
        group

     

    This is what the result looks like:

     

     

     

     

Resources