SOLVED

Excel 2016 how to autofill data entries based on a reference sheet

Copper Contributor

I have an Excel document where I input monthly sales to analyze via pivot tables and graphs. The software I export sales data from exports it in the following way:

MONTH | PRODUCT ID | PRODUCT NAME | SALES REP | QUANTITY | VALUE |

However, I need two more categories which the software does not export and are therefore empty when I paste new data:

MONTH | PROD. ID | PROD. NAME | SALES REP | QUANTITY | VALUE | CATEGORY | BRAND

Currently, I input them manually by dragging down. As there are thousands of rows to input monthly, this is getting more and more time consuming. I want to create a sort of "reference sheet" and then use that so that whenever I paste rows with products' names and IDs, the Category and Brand columns will be autofilled with the information I have given in the reference sheet. This sheet would have the following heading for all the products:

PRODUCT ID | PRODUCT NAME | CATEGORY | BRAND

How would the code for this look? I have never worked with Power Query, so if this is the way to go, I am open to suggestions.

Any help would be great.

 

7 Replies
If your export sales data are in Sheet1 and the lookup data are in Sheet2, this formula in Sheet1!G2, copied down rows and across to H2, will return your expected results:
=VLOOKUP(B2,
Sheet2!$A:$D,
COLUMN(C$1),0)
Instead of Sheet2!$A:$D, I suggest that you define a name for that dynamic range. The defined name might be "Products", with this formula:
=Sheet2!$A$2:INDEX(Sheet2!$A:$D,COUNTA(Sheet2!$A:$A),COUNTA(Sheet2!$1:$1))
Thereafter, you may instead use this formula in Sheet1!G2, copied down rows and across to H2:
=VLOOKUP(B2,
Products,
COLUMN(C$1),0)

@dzhogov 

You are correct in identifying Power Query as the way to go; PQ is purpose-written for this type of task.

1. First turn your data ranges into Tables Sales and Brand, say.  

2. Go to the data tab and import the Sales table (Get Data / from Table)

3. From the PQ editor, Close and Load as a connection only

4. Repeat for the Brand Table

5. From the PQ editor select Merge Query

6. Select the Sales and Brand tables and select the Product ID to link the tables

7. Close and load to a new table

This should represent a huge step forward from manual processes but further steps will be worth considering once you are confident of what you are doing.

1. It may be possible for PQ to access the sales data from an external source

2. The data from each table could be loaded individually to the data model

3. Power Pivot could be used to present information drawn from the linked tables.

Power Pivot is capable of handling many more records than could be loaded into an Excel worksheet so would provide some 'future proofing'.

Thank you for your detailed response @Peter Bartholomew!

Just to make sure I understand your suggestion: Say I complete these steps for all the data I have up until now and end up with a third table (from step 7) to use for analysis and charts. When another month's data is ready to import, I would basically have a new "Sales" table (once again missing Brand and Category fields). I would then need to merge that with the same old Brand table (containing the category and brand fields for each product ID) and then paste this data onto table 3. Correct?

Also, do you suggest I start off with these tables (especially the third table resulting from the merge and the Brand table used as a 'reference') in the same workbook?
@Twifoo

Thank you for your response. This is easily applicable to my case and I will use it until I am more confident with Power Query and look into a more advanced solution.
best response confirmed by dzhogov (Copper Contributor)
Solution

@dzhogov 

The workbook you would be building is the one with the pivot tables.  The third table (or the same information held within the data model for Power Query) would be there and would be refreshed monthly (all the previously defined steps  would be repeated automatically).  New data might replace the old or you could choose to append it.

 

One objective would be to get hold of the sales data from as near the source as you are allowed to go (in the current workbook, in another workbook, within an exported text file, using an SQL query to interrogate a corporate database).  All the data manipulation would be within PQ so that you never need to touch the data again.  The category and brand data could be read from a separate, largely static file or could be a table in the destination workbook. 

 

p.s. By 'refresh' I simply mean click the 'Refresh All' button on the Data ribbon tab.

p.p.s. I have tried to append a file as the 'starter for one'.  It contains the three tables on a single sheet but that would not be the longer-term aim.  Open the queries from the panel on the right and step through the edits to see the data at each stage.

@Peter Bartholomew 

 

Thank you for the example and great explanation!

 

Now that I have a better understanding of this process, a few additional questions arise. In the actual merging process the join kind I used was "Left outer (all from first, matching from second)." Oddly, my Sales table has 11'966 rows. Once I close and load the Merge table to a new sheet (without selecting "Add to data model", the resulting table has 12'565 rows loaded. I added another month's data to the Sales table, refreshed the Merge table, and again, more rows appeared in the latter (12'798 vs 13'443) The total sums of Quantity and Value are, naturally, different as well. Any idea as to what the reason might be?

 

Secondly, I was wondering if it is an issue for me to change the date format in my Merge table (outside PQ, as the format I prefer is "mmm-yy," not available among the PQ date formats) and add an additional column where I copy the Date column but change its format to "yyyy-mm" using a formula (this is exclusively for a slicer's sake later on and is not used in any pivot table).

 

EDIT: There are not duplicate rows in the Sales table.

@dzhogov 

1. The left outer issue is a mystery.  I agree that you should not finish with more records than there were in the left table, though I am not sure what happens if there is more than one record in the right hand table that matches one of the foreign keys.

 

2. With dates you need to take care that the representation is what you need

2.2018 [decimal number]

1/2/2018 [date/non-US]

Feb-2018 [text]

Once a proper date is output to Excel, you can apply number formatting and I believe it will persist through refresh.

 

3. You can add columns within PQ.  If you use 'by example' and enter a couple of example outputs PQ will often generate the process instruction for you and fill the column.

1 best response

Accepted Solutions
best response confirmed by dzhogov (Copper Contributor)
Solution

@dzhogov 

The workbook you would be building is the one with the pivot tables.  The third table (or the same information held within the data model for Power Query) would be there and would be refreshed monthly (all the previously defined steps  would be repeated automatically).  New data might replace the old or you could choose to append it.

 

One objective would be to get hold of the sales data from as near the source as you are allowed to go (in the current workbook, in another workbook, within an exported text file, using an SQL query to interrogate a corporate database).  All the data manipulation would be within PQ so that you never need to touch the data again.  The category and brand data could be read from a separate, largely static file or could be a table in the destination workbook. 

 

p.s. By 'refresh' I simply mean click the 'Refresh All' button on the Data ribbon tab.

p.p.s. I have tried to append a file as the 'starter for one'.  It contains the three tables on a single sheet but that would not be the longer-term aim.  Open the queries from the panel on the right and step through the edits to see the data at each stage.

View solution in original post