Jun 12 2023 04:37 AM
Hey everyone, first-time poster here!
I am having some issues with a Query refresh that we have to do on a weekly basis. It basically goes into a Sharepoint folder, opens and merges all of the files in there, and then does a bunch of transformations so we can use the data to report our performance metrics.
The refresh takes ~15mins and only gets longer after each week. Any help that could make this faster would be appreciated!
let
Source = Sharepoint_Weekly,
#"Invoke Custom Function1" = Table.AddColumn(Source, "Transform File (2)", each #"Transform File (2)"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Transform File (2)"}),
#"Expanded Transform File (2)1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (2)", {"ORDEN", "CODSUC", "CODMAQ", "NUMSER", "CODUSU"}, {"Datetime", "Product", "Press", "Serial #", "User ID"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Transform File (2)1", each ([Product] <> null)),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows1", {{"Datetime", type datetime}, {"Product", type text}, {"Press", type text}, {"Serial #", Int64.Type}, {"User ID", Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(Table.Buffer(#"Changed Type"), {"Serial #"}, Table.Buffer(CAPScrapWeekly), {"Serial #"}, "CAPScrapWeekly", JoinKind.LeftOuter),
#"Expanded CAPScrapWeekly" = Table.ExpandTableColumn(#"Merged Queries", "CAPScrapWeekly", {"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"}),
#"Added Custom6" = Table.AddColumn(#"Expanded CAPScrapWeekly", "Prod Date", each if DateTime.Time([Datetime]) < Time.From("07:00") then DateTime.Date(Date.AddDays([Datetime],-1)) else DateTime.Date([Datetime]), type date),
#"Added Custom" = Table.AddColumn(#"Added Custom6", "WeekNum", each Text.PadStart(Text.From(Date.WeekOfYear([Prod Date])),2,"0"), type text),
#"Replaced Value" = Table.ReplaceValue(#"Added Custom",null,"-",Replacer.ReplaceValue,{"CQ"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",each [Prod Date],each if [CQ]<>"-" then [CAPScrapWeekly.Prod Date] else [Prod Date],Replacer.ReplaceValue,{"Prod Date"}),
#"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value1",{{"Prod Date", type date}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type4",{"CAPScrapWeekly.Prod Date"}),
#"Added Custom3" = Table.AddColumn(#"Removed Columns1", "Month", each Date.MonthName([Prod Date])),
#"Added Custom7" = Table.AddColumn(#"Added Custom3", "Date-Shift", each if DateTime.Time([Datetime])>=Time.From("7:00") and DateTime.Time([Datetime])<Time.From("19:00") then Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "D" else Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "N")
in
#"Added Custom7"
Jun 12 2023 10:16 PM - edited Jun 19 2023 06:31 AM
Hi @JCouture17
Logically, if time to refresh increases every week this means you have at least one table that grows every week... Generic recommendation: get rid of the rows & columns you don't need as early as possible (cf. Curt H. keep things in "table space")
If you Buffer your tables presumably you already tested perf. with and w/o buffering. Otherwise be aware that buffering large tables can be counterproductive (cf. Curt H. in this post)
As a side note, very tiny possible optimization. You can replace:
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",
each [Prod Date],
each if [CQ] <> "-" then [CAPScrapWeekly.Prod Date] else [Prod Date],
Replacer.ReplaceValue,{"Prod Date"}
),
#"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value1",{{"Prod Date", type date}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type4",{"CAPScrapWeekly.Prod Date"}),
with
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",
each [CAPScrapWeekly.Prod Date],
each [CQ],
(a,b,c) as date => if c <> "-" then b else a,
{"Prod Date"}
),
#"Removed Columns1" = Table.RemoveColumns(#"Replaced Value1",{"CAPScrapWeekly.Prod Date"}),
Jun 12 2023 10:30 PM
Jun 13 2023 01:54 AM
Context differs so don't draw conclusions from what's below. It's just to give you a sense of some refresh times
PC Specs: Core i7 / SSD / 8Gb / Win11 / MSFT 365-64Bits
Created following 2 Tables in an Excel workbook, each 1 Million rows. Required columns only. All [Serial #] match:
Query1 - Merge tables + Expand
let
Source = Table.NestedJoin(
SharePoint, {"Serial #"},
CAPScrapWeekly, {"Serial #"},
"CAPScrapWeekly", JoinKind.LeftOuter
),
Expanded = Table.ExpandTableColumn(Source, "CAPScrapWeekly",
{"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"}
)
in
Expanded
Query2 - Same as above + your extra steps
let
Source = Table.NestedJoin(
SharePoint, {"Serial #"},
CAPScrapWeekly, {"Serial #"},
"CAPScrapWeekly", JoinKind.LeftOuter
),
Expanded = Table.ExpandTableColumn(Source, "CAPScrapWeekly",
{"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"}
),
AddedProdDate = Table.AddColumn(Expanded, "Prod Date", each
if DateTime.Time([Datetime]) < Time.From("07:00")
then DateTime.Date(Date.AddDays([Datetime],-1))
else DateTime.Date([Datetime]),
type date
),
AddedWeekNum = Table.AddColumn(AddedProdDate, "WeekNum", each
Text.PadStart(Text.From(Date.WeekOfYear([Prod Date])),2,"0"),
type text
),
ReplacedNullCQ = Table.ReplaceValue(AddedWeekNum,null,"-",Replacer.ReplaceValue,{"CQ"}),
ReplacedProdDate = Table.ReplaceValue(ReplacedNullCQ,
each [CAPScrapWeekly.Prod Date],
each [CQ],
(a,b,c) as date => if c <> "-" then b else a,
{"Prod Date"}
),
RemovedCapsProdDate = Table.RemoveColumns(ReplacedProdDate,{"CAPScrapWeekly.Prod Date"}),
AddedMonth = Table.AddColumn(RemovedCapsProdDate, "Month", each Date.MonthName([Prod Date]), type text),
AddedDateShift = Table.AddColumn(AddedMonth, "Date-Shift", each
if DateTime.Time([Datetime])>=Time.From("7:00") and DateTime.Time([Datetime])<Time.From("19:00")
then Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "D"
else Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "N",
type text
)
in
AddedDateShift
Hope this helps a bit