Sep 23 2024 06:19 AM
Hi Team,
I have two tables as below:
The fact table has date range in a imbalanced way as below:
For example, row 1 and 2 contains full days in a month(April & May).
But for row 3 contains only 1st 20 days and row 4 contains next 10 days.
This is how we get the data and we can do nothing about it.
We want to create a relationship from this fact table to calendar table in order to find the average cost per month.
If we find the average as it is from above fact table, it gives average as 200 which is wrong :
Correct one is if we add the 3rd and 4th row values because it counts to full april month(01/04/2023 - 30/04/2023) then it counts to 500. average is 266.67. But data doesn't come as we wish.
As the fact table contains dates in two columns(date range), we first convert this 2 column of dates into a single list of date column in power query(dataflow). Then connect that date column to calendar date and find the average cost per month.
Now what happens is that, as our original data(Fact) contains millions of rows, this operation of converting two column of dates into a single column(each day) causes lot of performance issues. This causes data flow refresh issues, report to slow down etc.
Is there any other ways we can create relationship between calendar table and fact table, without converting range of dates into day granularity in fact table but still pick up the cost for those dates?
Is it possible to create a dax measure that create a relationship between these two tables ?
Please suggest.
FYR, PFA sample file here Test.pbix
Thanks in advance!
Sep 24 2024 12:05 AM
Sep 24 2024 02:16 AM
Hi @Lorenzo
Thanks for your quick response!
May i know how that average was created? Can i see the dax measure please?
Just FYI, Its not only about the average but find out the number of days between 'date from' and 'date to'(in fact table) using a dax measure and divide that by cost to find average.
For example, in screenshot below for rows 3 & 4:
We need a dax to make sure that it calculates number of days between 'date from' and 'date to' of the particular month in the visual.
We need to account for 2 scenarios where:
1)The start and end date are within the particular month or
2)More importantly where the end date is outside of the month in question
So the key thing is finding the number of days. Use that(no of days) to divide cost.
My expected output visual would be something like this:
Please let me know if you need further info!
Thanks in advance!
Sep 24 2024 08:10 AM
SolutionFor such table
measure for number of days could be
Number of Dates =
VAR minDate = MIN( 'Date'[Date] )
VAR maxDate = MAX( 'Date'[Date] )
RETURN
SUMX(
Source,
IF(
Source[Date To] < minDate || Source[Date From] > maxDate,
BLANK( ),
INT( MIN( Source[Date To], maxDate ) - MAX( Source[Date From], minDate ) ) + 1
)
)
which gives
or
No relationships between Date and Source
Didn't catch how you calculate average cost, so ignored that.
Sep 25 2024 05:20 AM
Many thanks for your quick response sir!
Currently we are having a look into it & trying to implement in our model.
We will get back to you on this.
Meanwhile I am curious to learn how did you learn these skills for power bi. Did you read books or watch some channels to enhance your skills?
Is there any books or websites or channels that you can suggest me which i can read/learn to improve my skills in power bi, dax & modeling etc.?
Thanks in advance!
Sep 25 2024 08:44 AM
I'm not a pro in Power BI, mainly practice on tasks I do from time to time. Main source of information for me is sqlbi.com, I started from their books Books - SQLBI and follow articles the are publishing.
Oct 04 2024 05:45 AM
Thanks for your prompt response sir!
This number of dates measure works like a gem after testing!
Elegance with which you wrote works well in terms of performance as well!
We have replaced the source table with our TargetTimeSeries table to calculate number of days between date range(StartDate & EndDate):
Number of Dates =
VAR minDate = MIN( 'Calendar'[Date] )
VAR maxDate = MAX( 'Calendar'[Date] )
RETURN
SUMX(
TargetTimeSeries,
IF(
TargetTimeSeries[EndDate] < minDate || TargetTimeSeries[StartDate] > maxDate,
BLANK( ),
INT( MIN( TargetTimeSeries[EndDate], maxDate ) - MAX( TargetTimeSeries[StartDate], minDate ) ) + 1
)
)
We have usage column in our TargetTimeseries table. Now we have to create dax for Target consumption per day.
We need to divide this usage column with above Number of dates measure to get the daily consumption. so we created a dax measure as below:
Target consumption per day =
VAR minDate = MIN( 'Calendar'[Date] )
VAR maxDate = MAX( 'Calendar'[Date] )
RETURN
SUMX(
TargetTimeSeries,
DIVIDE(TargetTimeSeries[Usage],IF(
TargetTimeSeries[EndDate] < minDate || TargetTimeSeries[StartDate] > maxDate,
BLANK( ),
INT( MIN( TargetTimeSeries[EndDate], maxDate ) - MAX( TargetTimeSeries[StartDate], minDate ) ) + 1
)
)
)
Below is the output in visual:
we did a data analysis with data in model to find whether values returned by measure in visual is accurate.
So we took NetworkRail-3429. For below filter(1st April 2023) on StartDate,EndDate will show 31s March 2024 for this start date, usage column should add up to give 12519260.
Then according to above 'Target consumption per day' dax, dividing this 12519260 with Number of days between startdate & EndDate which is 365. The result should be 12519260/365 = 34299.
This 34299 should be populated for all months of table visual as below:
But we are not where & how below values are calculated in table visual:
We suspect because there is no relationship between Enddate(TargetTimeSeries) and Date(Calendar) & only startdate-Date relation is active this issue occurs. But not sure how to resolve this issue.
The current relationship between these two tables is as below:
Could please advise me on this?
Please let me know if you need further info!
PFA file here TimeSeries Test.pbix
Also, FYR PFA file of filtered Network Rail data NetworkRail3429 data.xlsx
Many thanks in advance!
Oct 07 2024 07:32 AM
Hi @SergeiBaklan ,
I have tried to find the below solution for my question sir!
Target consumption per day = VAR minDate = MIN ( 'Calendar'[Date] ) VAR maxDate = MAX ( 'Calendar'[Date] ) RETURN CALCULATE ( SUMX ( TargetTimeSeries, VAR _Days = TargetTimeSeries[EndDate] - TargetTimeSeries[StartDate] VAR Result = TargetTimeSeries[Usage] / _Days RETURN Result ), REMOVEFILTERS ( 'Calendar' ), KEEPFILTERS ( TargetTimeSeries[EndDate] >= minDate ), KEEPFILTERS ( TargetTimeSeries[StartDate] <= maxDate ) )
I was able to calculate the target consumption on a daily basis. But I am struggling to find the Target consumption per month.
What I mean is, from above measure we are able to calculate 'target consumption per day' which is great!.
with this daily value, we now need to use this daily value to multiply by the number of days in the month of the visual to apportion this consumption correctly.
for example,
ie. if the target table started on 17/04/2023 and ended on 31/03/2024 then the daily value would be multiplied by 14. because there are 14 days in April that this relates to. green color indicates what we already achieved.
Yellow color(Target consumption per month) is what we need to find. FYR, PFA below excel file that contains few formulas for better understanding NetworkRail3429 data.xlsx
Target consumption per month should sum up this 14 days values and display for its respective month
PFA file TimeSeries Test.pbix
Please advise!
Thanks in advance!
@SergeiBaklan
Oct 10 2024 02:23 AM
Hi @SergeiBaklan ,
Many thanks sir!
we have resolved this issue!
I have marked your answer as best response!
Many thanks
Sep 24 2024 08:10 AM
SolutionFor such table
measure for number of days could be
Number of Dates =
VAR minDate = MIN( 'Date'[Date] )
VAR maxDate = MAX( 'Date'[Date] )
RETURN
SUMX(
Source,
IF(
Source[Date To] < minDate || Source[Date From] > maxDate,
BLANK( ),
INT( MIN( Source[Date To], maxDate ) - MAX( Source[Date From], minDate ) ) + 1
)
)
which gives
or
No relationships between Date and Source
Didn't catch how you calculate average cost, so ignored that.