Forum Discussion
Extracting Records
- Jul 30, 2023
There are some addresses that should be the same but differ slightly:
10101 NE 46th Ave Vancouver, WA 98686 10101 NE 46th Ave, Vancouver, WA 98686 One option is to add helper columns:
AD2 contains the formula =TRIM(SUBSTITUTE(G2, ",", ""))
This eliminates differences such as comma vs no comma and double space vs single space.
It does not eliminate spelling differences.
AE2 contains the formula =COUNTIF(AD:AD, AD2)=1
This returns TRUE if the helper address is unique.
You can then filter on TRUE and copy the selected rows to another sheet, either manually using AutoFilter or using Advanced Filter or using the FILTER function:
=FILTER(CJun23!A2:AD1323,CJun23!AE2:AE1323)
Suppose you have two tables with address information and differing attributes attached to each.
If you load each as a query in Power Query, such that you have:
Then you can use a query similar to this to remove the duplicates while retaining those that are unique (including the additional attributes):
let
DeduplicateOn = {"Address","City"},
Source = Table.Combine({Query_Address1, Query_Address2}),
Group = Table.Group(Source, DeduplicateOn, {{"Rows",each Table.RowCount(_)}}),
Filter = Table.RemoveColumns(Table.SelectRows(Group,each [Rows]=1),{"Rows"}),
Result = Table.Join(Source,DeduplicateOn,Filter,DeduplicateOn)
in
Result
The result being:
Here is a file that I am working with.
Thanks