Keep duplicate values in Power Query to run Pivot Table

Copper Contributor

Hi all,

 

In the last month I have run an automated report to keep track on some performance analysis.

I'm using Power Query in order to to get data automatically from two different sources. The two queries are:

- A: it includes the name of the product, the SKU of the product, the category code and the sale date

- B: it includes the name of the product, the SKU of the product, the category code, the sale date, the quantity sold and the revenue

 

Through Power Pivot I then organize my data into a pivot table that - by filtering by date or category - gives me the all the info I need.

 

The problem I need help for occurs when a line appears more than one time in my A query (all the values are the same despite the date). In fact, in these cases, when I try to update my query and run my pivot table, the error message "the column contains a duplicate value and this is not allowed for columns on the one side of many to one relationship or for columns that are used as the primary key of a table" pops up.

 

Just to sum up, at the end, I would like to have multiple lines with the same value in pivot, in case a specific products has been sold in different days.

 

Thank you all in advance!

 

Best,


Lorenzo

 

 

 

2 Replies

@LorenzoMolteni 

Lorenzo,

 

It's not clear what is the relationship between tables A and B. In table A you have products attributes (SKU, name, category), it's not clear why do you keep the date here.

In table B there are product SKU, date of sale, qty and cost. Why do you repeat category, name and code here?

 

If remove extra data you may build one to many relationship between A and B ( so called dim table and fact table).

 

In addition you need one more table, Calendar, which could be built automatically within Power Pivot, but better to generate by another query. It will be one to many with table B.

 

With that model it shall be no conflicts with reporting PivotTables.

@Sergei Baklan 

Hi Sergei,

 

Thanks for your reply.

 

I agree on removing unnecessary data from table B. Regarding to the date included in table A, I was not clear on that. So basically, on table A I have the go live date of a specific product and on table B the selling date of that same product (at the end of my analysis it would be ideal to only show sales coming from the same day the product was put online, but that can be done filtering in excel as well). The problem is another one: in fact, since sometimes a product may be put online on day 1, for example, and then again on day 15, my query is only able to process if removing duplicate SKUs. Because of this I'm not able to run performances analysis both for day 1 and 15 but only for day 1 because "the table contains a duplicate value and this is not allowed for columns on the one side of a many-to-one relationship or for columns that are used as the primary key of a table" and so I have to exclude the duplicate SKUs from the second go live date. 

 

Can you please advise on how can I manage this relationship on power query in order to solve this problem?

 

Thank you in advance!

 

Best,


Lorenzo