Consolidating data from different sources

Copper Contributor

Hi all, 

I've just joined the community and had a look but can't see a previous answer. I have 2 sources of data that I need to consolidate into a single master workbook, but the data updates each month.

For example

Dataset1 - Customer contact details (includes customer id)

Dataset2 - monthly sales (using customer id)

So I want to consolidate these to MasterData.

But each month I want to add the new monthly data - so Master data has for example 12 months of sales but Dataset2 is only of the most recent month.

 

Any tips on how I can automate this? I 

8 Replies

@Paul_icapture_app 

 

From the point of view of database design, you're actually better off with two, at least as you describe it. The principle is called "data normalization," and the least complicated way to describe it is that you eliminate as much redundancy as possible. It's OK (more than that, it's necessary) to have something like Customer ID as the basis for joining the two, when needed, but that should be the limit.

 

You don't explain WHY you are seeking to consolidate the two into a single Master file, so now that I've given a reason why you might be better served by keeping two data sets, could I ask you to explain the goal you have in mind for consolidation. It might well be that there's a different way to accomplish that goal.

 

(I am assuming that the data set of monthly sales results does contain history of prior months; if not, that certainly would be worth doing. I'd just hesitate putting the customer contact info in with that.)

Thanks for the response. So the current monthly sales report comes from a different system and although I can pull multiple months I run additional analysis within the master document to show customer trends, month by month analysis etc. Importing just the most recent months data seemed the easiest way to increment the data without having to double check a whole load of the analytics formulae we use.

 

The customer contact data is really to be used so we can run campaigns based on the results. So for example, if I can see a customer purchases every 3 months, I can run a campaign to target them in 2 months time.

 

At the moment I would have to export the customerID list, then do a consolidate with dataset1 and then run a mailmerge (for example) of the results. If I can merge the datasets I just reduce about 5 steps in my current process.

 

Hope that makes sense?

@Paul_icapture_app 

If your monthly files have the same structure you may combine them with Power Query keeping in one folder, merge with Dataset1, add other transformation of needed and return result into the Master file. Here make analysis on consolidated data, preferably through data model. After that the process of updating will be adding in another sales file to the folder and clicking on Refresh All in Master file.

@Paul_icapture_appit makes sense to a certain extent. I hope you can run with Sergei's suggestion.

 

My point on keeping the customer contact info in its own separate file is that you can always combine that single record with multiple records or with the analysis of those multiple months worth of sales, and do so in a mail merge. It gets complicated, but mail merge (in Word, I presume) can do quite a bit like that (although it's been literally decades since I did so myself)...   You can do your analysis of a given customer's ordering patterns in the sales database, and just merge in the contact info at time of contact. Because customer ID is part of the sales record, you don't need to redundantly have it in every row of the customer sales activity table.

@Sergei Baklan Thanks, might have to do a crash course in Power Query. I have looked at it for other projects but not had the time/determination to really get stuck in.

@Paul_icapture_app Yes, learning of Power Query is never ending process (as any other technology), but even after some initial investment of time you could start easy to do some tasks which are really complex if use formulas.

@Sergei Baklan   Sadly, Power Query is not available (so far as I can tell) on Excel for the Mac.......which is why I still rely on the D_____ database functions.

@Paul_icapture_app 

 

Some good advice given by@mathetes  and @Sergei Baklan.

 

I think as long as you keep the data clean (Normalized with no dupes or redundancy) then Excel offers a lot of flexibility (Especially if you go with Power Query).

My recommendation for getting started in Power Query.  Pick a few simple tasks to get started like 1. un-pivoting data and 2. combining multiple sheets into 1.