SOLVED

Extracting Records

Copper Contributor

How can I extract the records that are not included in two databases?

In combining the two databases Excel allows me to delete duplicates but can I use this to eliminate all that are two of a kind leaving me only the records not in both?

Thanks in advance.

restats

8 Replies
Let me correct this query because my data base is made up of properties. What I want to eliminate are the records that have the same addresses even though the other fields may have some different values. I attempted to use the Unique function but learned that the record fields must all be the same for elimination.

@restats 

Could you attach a small sample workbook demonstrating the problem (without sensitive data), or if that is not possible, make it available through OneDrive, Google Drive, Dropbox or similar?

@restats 

 

Given there's not much detail to go on, you can try something similar to this:

 

=UNIQUE(VSTACK(Table1,Table2),,TRUE)

 

As shown here:

 

flexyourdata_0-1690747667765.png

 

The important part is the TRUE for the third argument to UNIQUE. This will ensure that you only get results that appear exactly once in the stacked tables. 

@restats 

 

Suppose you have two tables with address information and differing attributes attached to each. 

 

flexyourdata_0-1690748319518.png

 

If you load each as a query in Power Query, such that you have:

 

flexyourdata_1-1690748349550.png

 

Then you can use a query similar to this to remove the duplicates while retaining those that are unique (including the additional attributes):

 

let
    DeduplicateOn = {"Address","City"},
    Source = Table.Combine({Query_Address1, Query_Address2}),
    Group = Table.Group(Source, DeduplicateOn, {{"Rows",each Table.RowCount(_)}}),
    Filter = Table.RemoveColumns(Table.SelectRows(Group,each [Rows]=1),{"Rows"}),
    Result = Table.Join(Source,DeduplicateOn,Filter,DeduplicateOn)
in
    Result

 

The result being:

 

flexyourdata_2-1690748425667.png

 

https://1drv.ms/x/s!Arvr7szuU5p4hNM4HW3Gh0B_M_C1Fw?e=QRfiaX
Here is a file that I am working with.
Thanks
best response confirmed by restats (Copper Contributor)
Solution

@restats 

There are some addresses that should be the same but differ slightly:

10101 NE 46th Ave  Vancouver, WA 98686 
10101 NE 46th Ave, Vancouver, WA 98686

One option is to add helper columns:

HansVogelaar_0-1690754644936.png

AD2 contains the formula =TRIM(SUBSTITUTE(G2, ",", ""))

This eliminates differences such as comma vs no comma and double space vs single space.

It does not eliminate spelling differences.

AE2 contains the formula =COUNTIF(AD:AD, AD2)=1

This returns TRUE if the helper address is unique.

You can then filter on TRUE and copy the selected rows to another sheet, either manually using AutoFilter or using Advanced Filter or using the FILTER function:

 

=FILTER(CJun23!A2:AD1323,CJun23!AE2:AE1323)

 

@Hans Vogelaar 

Thank you

1 best response

Accepted Solutions
best response confirmed by restats (Copper Contributor)
Solution

@restats 

There are some addresses that should be the same but differ slightly:

10101 NE 46th Ave  Vancouver, WA 98686 
10101 NE 46th Ave, Vancouver, WA 98686

One option is to add helper columns:

HansVogelaar_0-1690754644936.png

AD2 contains the formula =TRIM(SUBSTITUTE(G2, ",", ""))

This eliminates differences such as comma vs no comma and double space vs single space.

It does not eliminate spelling differences.

AE2 contains the formula =COUNTIF(AD:AD, AD2)=1

This returns TRUE if the helper address is unique.

You can then filter on TRUE and copy the selected rows to another sheet, either manually using AutoFilter or using Advanced Filter or using the FILTER function:

 

=FILTER(CJun23!A2:AD1323,CJun23!AE2:AE1323)

 

View solution in original post