Jul 30 2023 12:55 PM
How can I extract the records that are not included in two databases?
In combining the two databases Excel allows me to delete duplicates but can I use this to eliminate all that are two of a kind leaving me only the records not in both?
Thanks in advance.
restats
Jul 30 2023 01:03 PM
Jul 30 2023 01:06 PM
Could you attach a small sample workbook demonstrating the problem (without sensitive data), or if that is not possible, make it available through OneDrive, Google Drive, Dropbox or similar?
Jul 30 2023 01:08 PM
Given there's not much detail to go on, you can try something similar to this:
=UNIQUE(VSTACK(Table1,Table2),,TRUE)
As shown here:
The important part is the TRUE for the third argument to UNIQUE. This will ensure that you only get results that appear exactly once in the stacked tables.
Jul 30 2023 01:20 PM
Suppose you have two tables with address information and differing attributes attached to each.
If you load each as a query in Power Query, such that you have:
Then you can use a query similar to this to remove the duplicates while retaining those that are unique (including the additional attributes):
let
DeduplicateOn = {"Address","City"},
Source = Table.Combine({Query_Address1, Query_Address2}),
Group = Table.Group(Source, DeduplicateOn, {{"Rows",each Table.RowCount(_)}}),
Filter = Table.RemoveColumns(Table.SelectRows(Group,each [Rows]=1),{"Rows"}),
Result = Table.Join(Source,DeduplicateOn,Filter,DeduplicateOn)
in
Result
The result being:
Jul 30 2023 01:31 PM
Jul 30 2023 01:32 PM
Jul 30 2023 03:09 PM
SolutionThere are some addresses that should be the same but differ slightly:
10101 NE 46th Ave Vancouver, WA 98686 |
10101 NE 46th Ave, Vancouver, WA 98686 |
One option is to add helper columns:
AD2 contains the formula =TRIM(SUBSTITUTE(G2, ",", ""))
This eliminates differences such as comma vs no comma and double space vs single space.
It does not eliminate spelling differences.
AE2 contains the formula =COUNTIF(AD:AD, AD2)=1
This returns TRUE if the helper address is unique.
You can then filter on TRUE and copy the selected rows to another sheet, either manually using AutoFilter or using Advanced Filter or using the FILTER function:
=FILTER(CJun23!A2:AD1323,CJun23!AE2:AE1323)