Compare and Validate Data between 2 Files

Copper Contributor

HI All, 

 

I'm looking for some help with one of the projects monthly excel tasks. 

 

So We have a People Roster and this people roster contains names of all the people who have worked and still are working on our project. 

I have attached an example file: 

 

1. People Roster

 

Once a month we request a file from HR this file contains only ACTIVE people who are working on our project. 

I have attached an example file:

 

2. HR Roster

 

What I'm looking for is a couple things:

 

1. I need to compare the data to ensure that the names in file 1. match with names in file 2.

2. If there are differences I need to see where they are. is file 1. missing someone from file 2. or vice versa. 

3. ultimately I would like to automate this cross-check.

 

Thanks for you assistance, 

 

Kris

2 Replies

@TapZxK 

 

Kris -- could we back up a bit first? And I'll give you my own background as the reason for asking if we can back up a bit before we tackle your question.

I retired in 2002 after having served as the Director of the HR and Payroll database project for a major Fortune 50 corporation. One of the things we worked hard to eliminate was having what was essentially the same information stored in two places. For just one example, before our project began both HR (for benefits and other purposes) AND Payroll (for tax withholding and other purposes) had their own sets of address information on each employee. Which meant that when an employee moved--which happened all the time (when you have tens of thousands of employees)--that information had to be entered at least twice, once in HR's database, once in Payroll's--which meant inevitably there were discrepancies, which led to major errors on one or both application ends.

 

A BIG value/goal in good database design goes by the term data integrity. And data integrity is best achieved by NOT having duplicate (redundant) records that are essentially reflecting the same thing.

 

What you have described strikes me as your current effort to achieve data integrity, which is a highly manual process currently (I assume), based on a visual check and verification and correction one way or the other. (You fix your data, or ask HR to fix theirs; again, I'm assuming). You want to automate that. But you're basically talking about fixing a problem--redundancy of data and discrepancies in that data--after it exists.

Better--far better--would be to prevent the discrepancies in the first. In this case, I would hope that HR's data is the more reliable. (If it's not, your company has a major problem)....

 

You may have a local need to track those who aren't part of the project any more, but surely HR also has data on who was once there, has left, whatever. Could you not get from them a list of current and past members of your department (defining the latter in some reasonable way--offboarded within 6/12 months, for example) and import that into your own roster for whatever local management needs you have?

 

Anyway, you see (I hope) some merit in my questioning your basic premise here. I appreciate your need to have accurate data. It just seems (and maybe I've totally mis-read your situation) that you're going about it by trying to put a bandage on the situation, rather than seeking a cure for the underlying condition.

@TapZxK 

 

Kris -- In the attached spreadsheets I've given you a quick and dirty solution, using VLOOKUP to see if there's a match in the other roster, and Conditional Formatting to highlight when there's no match found.

It's quick and dirty totally. Depending on the size of the issue, it would work. As I indicated in my earlier posting, though, I'd strongly recommend trying to get at the deeper issue that causes there to be these discrepancies in the first place.