04-03-2019 01:55 AM - edited 04-03-2019 02:01 AM
04-03-2019 01:55 AM - edited 04-03-2019 02:01 AM
I have an Excel document where I input monthly sales to analyze via pivot tables and graphs. The software I export sales data from exports it in the following way:
MONTH | PRODUCT ID | PRODUCT NAME | SALES REP | QUANTITY | VALUE |
However, I need two more categories which the software does not export and are therefore empty when I paste new data:
MONTH | PROD. ID | PROD. NAME | SALES REP | QUANTITY | VALUE | CATEGORY | BRAND
Currently, I input them manually by dragging down. As there are thousands of rows to input monthly, this is getting more and more time consuming. I want to create a sort of "reference sheet" and then use that so that whenever I paste rows with products' names and IDs, the Category and Brand columns will be autofilled with the information I have given in the reference sheet. This sheet would have the following heading for all the products:
PRODUCT ID | PRODUCT NAME | CATEGORY | BRAND
How would the code for this look? I have never worked with Power Query, so if this is the way to go, I am open to suggestions.
Any help would be great.
04-03-2019 02:31 AM
04-03-2019 02:52 AM - edited 04-03-2019 02:56 AM
You are correct in identifying Power Query as the way to go; PQ is purpose-written for this type of task.
1. First turn your data ranges into Tables Sales and Brand, say.
2. Go to the data tab and import the Sales table (Get Data / from Table)
3. From the PQ editor, Close and Load as a connection only
4. Repeat for the Brand Table
5. From the PQ editor select Merge Query
6. Select the Sales and Brand tables and select the Product ID to link the tables
7. Close and load to a new table
This should represent a huge step forward from manual processes but further steps will be worth considering once you are confident of what you are doing.
1. It may be possible for PQ to access the sales data from an external source
2. The data from each table could be loaded individually to the data model
3. Power Pivot could be used to present information drawn from the linked tables.
Power Pivot is capable of handling many more records than could be loaded into an Excel worksheet so would provide some 'future proofing'.
04-03-2019 04:35 AM
04-03-2019 04:40 AM
04-03-2019 07:08 AM - edited 04-03-2019 07:24 AMSolution
The workbook you would be building is the one with the pivot tables. The third table (or the same information held within the data model for Power Query) would be there and would be refreshed monthly (all the previously defined steps would be repeated automatically). New data might replace the old or you could choose to append it.
One objective would be to get hold of the sales data from as near the source as you are allowed to go (in the current workbook, in another workbook, within an exported text file, using an SQL query to interrogate a corporate database). All the data manipulation would be within PQ so that you never need to touch the data again. The category and brand data could be read from a separate, largely static file or could be a table in the destination workbook.
p.s. By 'refresh' I simply mean click the 'Refresh All' button on the Data ribbon tab.
p.p.s. I have tried to append a file as the 'starter for one'. It contains the three tables on a single sheet but that would not be the longer-term aim. Open the queries from the panel on the right and step through the edits to see the data at each stage.
04-04-2019 12:15 AM - edited 04-04-2019 12:48 AM
Thank you for the example and great explanation!
Now that I have a better understanding of this process, a few additional questions arise. In the actual merging process the join kind I used was "Left outer (all from first, matching from second)." Oddly, my Sales table has 11'966 rows. Once I close and load the Merge table to a new sheet (without selecting "Add to data model", the resulting table has 12'565 rows loaded. I added another month's data to the Sales table, refreshed the Merge table, and again, more rows appeared in the latter (12'798 vs 13'443) The total sums of Quantity and Value are, naturally, different as well. Any idea as to what the reason might be?
Secondly, I was wondering if it is an issue for me to change the date format in my Merge table (outside PQ, as the format I prefer is "mmm-yy," not available among the PQ date formats) and add an additional column where I copy the Date column but change its format to "yyyy-mm" using a formula (this is exclusively for a slicer's sake later on and is not used in any pivot table).
EDIT: There are not duplicate rows in the Sales table.
04-04-2019 02:49 AM - edited 04-04-2019 02:50 AM
1. The left outer issue is a mystery. I agree that you should not finish with more records than there were in the left table, though I am not sure what happens if there is more than one record in the right hand table that matches one of the foreign keys.
2. With dates you need to take care that the representation is what you need
2.2018 [decimal number]
Once a proper date is output to Excel, you can apply number formatting and I believe it will persist through refresh.
3. You can add columns within PQ. If you use 'by example' and enter a couple of example outputs PQ will often generate the process instruction for you and fill the column.