Comparing Similar tables in excel - Power Query?

Copper Contributor

I want to take two similar but not the same tables, compare them, and generate 3 to four spreadsheets that can quickly and concisely identify the differences between the two tables - finer details regarding this are below. Using Office 365 latest version I am told Power Query would be good for this type of thing. I am happy to just be pointed at relevant information to learn how to do this, will only send an example spreadsheet to anyone who can help with this question as I do not want to post it publicly here, thanks.

 

I get a cable data listing for an electrical project. This is re-issued as design or specs change over time.

Column A relates to a given Cable identification number. The rest of the columns relate to particulars for each cable - what type it is, what equipment it runs from and to, and what room or space that equipment is in.

When a new sheet (table) is issued, it may contain new cables that were not on the older version of the sheet. It may also be missing cables from the older sheet if they were no longer required and deleted from the project.

Additionally - a cable type may be changed to better suit the job it is to do, for example an engineer may request a revised larger rating cable type be used. (columns B-C)

Then the equipment it runs to or from may change location to a different room.(Columns E-F and G-H)

A cable may be run for example from one switchboard in one location then design decides to run it from a different switchboard found in a different location, all of these variables will be in the one row for the given cable which would be different from old to new sheet.

 

I need to very quickly be able to establish a comparison between the two tables:

a) What cables are the same in both tables

b) What are new cables in the new table

c) What are deleted cables from the old spreadsheet that no longer appear in the new table

 

Once that query is performed I need to establish

1) Are there any differences in the row contents for any cables that are common to both sheets

 

Currently I tend to insert columns, copy from one sheet to the other and use conditional formatting to highlight duplicates. You can imagine with 4200+ cables in column A, and 8 odd columns to compare how long this would take. Especially if there are differences between the tables as you have to match the column A contents first. Hoping someone can understand what I am saying and maybe point me in the right direction to learn this. Thank you.

4 Replies

OK. Below I detail a bit how I went about solving this. I would still like to know if anyone can tell me -

Can I choose which results will create new worksheets out of the queries pane table results in the PQ editor, and can I name the new worksheets it will create before close and load so they do not appear as Sheet 1, Sheet 2, Sheet 3 etc within the existing workbook?

 

Quick learner but maybe not the best or most efficient way to compare two tables, I'm not sure but I achieved what I wanted WAY quicker than I had been doing it.

I loaded the two tables to one workbook on separate sheets, and named them as per below.

I'm a bit of an excel dummy, so written for people like me:

  • Office 365 Excel - data tab. Choose one  cell in one of the two tables to compare - for me I called them NewData & OldData and named their respective sheets to match.
  • Far left for me, choose get data from table/range - power query editor opens with a sheet that is whichever of the two sheets you chose, I used a cell within NewData.
  • At left is a hidden queries pane, I clicked that to expand, right clicked NewData and chose Duplicate.
  • Choose the duplicated result and at the right in query settings for the duplicated result, click source
  • In formula pane(Menu View- check box if there is no formula pane) where it showed "NewData" I changed to "OldData" - names of the tables not the worksheets. The contents changed to reflect the table OldData
  • At the top menu - I chose drop down Merge Queries as new, merge dialog opens
  • Top drop down menu - NewData, select first column. Second drop down menu select OldData and also select it's first column
  • Below at Join Kind drop down I chose left anti - then OK
  • BOOM! New table created called Merge1 in the queries pane listing all the New Cables I wanted to identify and only them - Requirement 1 check.
  • Repeat above steps but at top drop down menu of merge dialog, chose OldData, NewData at second one, choose both first columns again and Left anti join kind again, Boom! Now I have the shortlist of all first column entries that had been removed from the OldData in the new table

I went on to convert original Old/NewData sheets, all to data, create a custom column and combine for both NewData and OldData sheets, appended the tables and close and loaded. This produced a table that merged all contents of both tables into one table on one sheet.

By then going to the last combined column, selecting the contents and removing duplicates, then sorting the first column and highlighting duplicates in first column I got a look instantly at the results where anything at all had changed for the row values that were common to both sheets.

If you ever need to do this and this last bit is unclear I can try to detail a bit better - Though I'm wondering if I am the only one who ever needed to do this and got sick of inserting columns and copy/pastes to see differences between two data sets. With over 4200 rows and 13 odd columns to compare, this is a great solution that is working for me.

Hope it can help someone else!

@Jan Karel Pieterse  Thank you for the link, I could not open it due to some policy settings regarding downloading macros without determining it was safe or something - work computer.

I did manage to learn a way to do this, I posted a reply detailing how I did it with Power Query. It is good to know how to do it yourself so you can customize which columns are relevant for your comparisons too I think, thanks again for the link though. Much appreciated.

Your description above is more or less exactly what my little tool automates. Shame on your IT dept for blocking content that would have saved you a lot of time I'm sure. Perhaps you can ask them to except my domain from their blocking policy?