Forum Discussion
restats
Jul 30, 2023Copper Contributor
Extracting Records
How can I extract the records that are not included in two databases?
In combining the two databases Excel allows me to delete duplicates but can I use this to eliminate all that are two of a kind leaving me only the records not in both?
Thanks in advance.
restats
There are some addresses that should be the same but differ slightly:
10101 NE 46th Ave Vancouver, WA 98686 10101 NE 46th Ave, Vancouver, WA 98686 One option is to add helper columns:
AD2 contains the formula =TRIM(SUBSTITUTE(G2, ",", ""))
This eliminates differences such as comma vs no comma and double space vs single space.
It does not eliminate spelling differences.
AE2 contains the formula =COUNTIF(AD:AD, AD2)=1
This returns TRUE if the helper address is unique.
You can then filter on TRUE and copy the selected rows to another sheet, either manually using AutoFilter or using Advanced Filter or using the FILTER function:
=FILTER(CJun23!A2:AD1323,CJun23!AE2:AE1323)
- flexyourdataIron Contributor
Suppose you have two tables with address information and differing attributes attached to each.
If you load each as a query in Power Query, such that you have:
Then you can use a query similar to this to remove the duplicates while retaining those that are unique (including the additional attributes):
let DeduplicateOn = {"Address","City"}, Source = Table.Combine({Query_Address1, Query_Address2}), Group = Table.Group(Source, DeduplicateOn, {{"Rows",each Table.RowCount(_)}}), Filter = Table.RemoveColumns(Table.SelectRows(Group,each [Rows]=1),{"Rows"}), Result = Table.Join(Source,DeduplicateOn,Filter,DeduplicateOn) in Result
The result being:
- restatsCopper Contributorhttps://1drv.ms/x/s!Arvr7szuU5p4hNM4HW3Gh0B_M_C1Fw?e=QRfiaX
Here is a file that I am working with.
Thanks
- flexyourdataIron Contributor
Given there's not much detail to go on, you can try something similar to this:
=UNIQUE(VSTACK(Table1,Table2),,TRUE)
As shown here:
The important part is the TRUE for the third argument to UNIQUE. This will ensure that you only get results that appear exactly once in the stacked tables.
- restatsCopper ContributorLet me correct this query because my data base is made up of properties. What I want to eliminate are the records that have the same addresses even though the other fields may have some different values. I attempted to use the Unique function but learned that the record fields must all be the same for elimination.
Could you attach a small sample workbook demonstrating the problem (without sensitive data), or if that is not possible, make it available through OneDrive, Google Drive, Dropbox or similar?
- restatsCopper Contributorhttps://1drv.ms/x/s!Arvr7szuU5p4hNM4HW3Gh0B_M_C1Fw?e=QRfiaX
Here is the database