Forum Discussion
Building relationships in data model to leverage power pivot - circumventing unique key ID issue
When building relationships between two data tables in the data model of excel, you require a unique ID / key to relate the two datasets. This makes sense but I find it has an issue:
It requires you to reduce the entirety of your dataset down to unique ID's in one of your tables so that it can do the 'one to many' relationship build. If you don't have unique ID's / key, the relationship between two tables often doesn't work. Yes you can use an intermediary table with unique ID's to link the two complete tables, but when using that intermediary table I find you are limited in what additional data rows you can bring into the combined power pivot table to provide context to the relationship. Once you start bringing in contextual rows from the separate tables, there ends up being a lot of repeated values for any numerical figure from the other table, which you are trying to aggregate. An example below of the power pivot table result (data from table 1 in blue, from table 2 in green)
Product ID (Unique Key) Service Line Sales Amount COGS
169AKY T Shirts $1,000 $7,500
169AKY Sweaters $2,000 $7,500
169AKY Socks $3,000 $7,500
169AKY Turtle necks $4,000 $7,500
169AKY Pants $5,000 $7,500
COGS from table 2 also has the data broken out by Service Line (T-shirts, Sweaters, etc.), but will only show an aggregate COGS value for all service lines for each unique product ID, instead of breaking it out by Service Line. This is because the Service Line column is being inserted into the power pivot table from table 1, it breaks out sales by Service Line. If you insert the Service Line data column into the power pivot from table 2, COGS would be broken out by Service Line, and only an aggregate for sales would be produced. Ideally the power pivot would display sales AND COGS broken out by Service Line (and Product ID).
Is there a best practice for building relationships (unique ID's) between data tables:
- Without reducing the richness / variety of columns and data in your original tables (when creating the unique ID)? Maybe 'group by' function in power query to obtain unique ID's before building the relationship?
- With retaining the ability to slice and dice the data by different characteristics (assuming they are present in both source tables) and have that detail presented in the power pivot.
Thanks for the help!
The humble Apprentice
4 Replies
- ApprenticeCopper Contributor
Thank you for this I'll start learning some DAX and this virtual relationship connection.
I am finding though that using that formula adds that total COGS amount ($7,500) to each Service Line row of the sales table with the same productID, and then sums it up, so I'm actually getting an aggregate of the aggregate now ($7,500 * 5 = $37,500). Is there something wrong with my formula? (couldn't find 'TREATAS' so used 'INTERSECT')
CALCULATE(SUM(COGS[COGS]),INTERSECT(
ALL(Sales[Service Line]),
VALUES(COGS[Service Line])
)
)
I think it may have to do with the fact the sum has to be on 2 categories / criteria (Product ID and Service Line), whereas right now it's aggregating the total COGS per ProductID, and adding that to each service line (and then summing that up).
- SergeiBaklanDiamond Contributor
In general it's not necessary to build bridge table to handle many-to-many relationships, as variant you may use virtual relationships. based on your sample let assume we have two tables
For them we to calculate COGS we may use measure
COGSv:=CALCULATE( SUM(COGS[COGS] ), TREATAS( VALUES( Sales[Service Line] ), COGS[Service Line] ) )which gives
Compare with
Total COGS:=SUM( COGS[COGS] )More details at Physical and Virtual Relationships in DAX - SQLBI and you may google for other sources.
- ApprenticeCopper Contributor
Thank you for the response, I'll look into learning some DAX and virtual relationships.
I'm find with that formula, it's aggregating the aggregate ($7,500 * 5 = $35,000) instead of providing the COGS per service Line per Product ID (as you have shown above under COGSv). I couldn't find TREATAS so used INTERSECT.
I tried creating another formula, which also produces the same result. Is there a formula (other than TREATAS since I can't seem to find/ use it) which can summarize COGS per Product ID and Service Line (rather than summing COGS per Product ID, and applying that to each Service Line row, and then aggregating that - which is what is currently being done).
This is my current formula:
=CALCULATE(
SUM(COGS[COGS]),
INTERSECT(
SELECTCOLUMNS(
SalesTable,"Criteria1",SalesTable[Service Line],"Criteria2",SalesTable[ProductID]
),
SELECTCOLUMNS(
COGSTable,"Criteria1",COGSTable[Service Line],"Criteria2",COGSTable[ProductID]
)
)
)
- SergeiBaklanDiamond Contributor
I don't remember when TREATAS was introduced in Excel, relatively long ago. However, it's still highlighted by red (as not recognized) in formulae. But it works if to enter it. Usually TREATAS has better performance and easier to use, at least for me.