Forum Discussion
JCouture17
Jun 12, 2023Copper Contributor
Very slow Power Query refresh
Hey everyone, first-time poster here!
I am having some issues with a Query refresh that we have to do on a weekly basis. It basically goes into a Sharepoint folder, opens and merges all of the files in there, and then does a bunch of transformations so we can use the data to report our performance metrics.
The refresh takes ~15mins and only gets longer after each week. Any help that could make this faster would be appreciated!
let
Source = Sharepoint_Weekly,
#"Invoke Custom Function1" = Table.AddColumn(Source, "Transform File (2)", each #"Transform File (2)"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Transform File (2)"}),
#"Expanded Transform File (2)1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (2)", {"ORDEN", "CODSUC", "CODMAQ", "NUMSER", "CODUSU"}, {"Datetime", "Product", "Press", "Serial #", "User ID"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Transform File (2)1", each ([Product] <> null)),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows1", {{"Datetime", type datetime}, {"Product", type text}, {"Press", type text}, {"Serial #", Int64.Type}, {"User ID", Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(Table.Buffer(#"Changed Type"), {"Serial #"}, Table.Buffer(CAPScrapWeekly), {"Serial #"}, "CAPScrapWeekly", JoinKind.LeftOuter),
#"Expanded CAPScrapWeekly" = Table.ExpandTableColumn(#"Merged Queries", "CAPScrapWeekly", {"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"}),
#"Added Custom6" = Table.AddColumn(#"Expanded CAPScrapWeekly", "Prod Date", each if DateTime.Time([Datetime]) < Time.From("07:00") then DateTime.Date(Date.AddDays([Datetime],-1)) else DateTime.Date([Datetime]), type date),
#"Added Custom" = Table.AddColumn(#"Added Custom6", "WeekNum", each Text.PadStart(Text.From(Date.WeekOfYear([Prod Date])),2,"0"), type text),
#"Replaced Value" = Table.ReplaceValue(#"Added Custom",null,"-",Replacer.ReplaceValue,{"CQ"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",each [Prod Date],each if [CQ]<>"-" then [CAPScrapWeekly.Prod Date] else [Prod Date],Replacer.ReplaceValue,{"Prod Date"}),
#"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value1",{{"Prod Date", type date}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type4",{"CAPScrapWeekly.Prod Date"}),
#"Added Custom3" = Table.AddColumn(#"Removed Columns1", "Month", each Date.MonthName([Prod Date])),
#"Added Custom7" = Table.AddColumn(#"Added Custom3", "Date-Shift", each if DateTime.Time([Datetime])>=Time.From("7:00") and DateTime.Time([Datetime])<Time.From("19:00") then Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "D" else Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "N")
in
#"Added Custom7"
- LorenzoSilver Contributor
Context differs so don't draw conclusions from what's below. It's just to give you a sense of some refresh times
PC Specs: Core i7 / SSD / 8Gb / Win11 / MSFT 365-64Bits
Created following 2 Tables in an Excel workbook, each 1 Million rows. Required columns only. All [Serial #] match:
Query1 - Merge tables + Expand
let Source = Table.NestedJoin( SharePoint, {"Serial #"}, CAPScrapWeekly, {"Serial #"}, "CAPScrapWeekly", JoinKind.LeftOuter ), Expanded = Table.ExpandTableColumn(Source, "CAPScrapWeekly", {"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"} ) in Expanded
Query2 - Same as above + your extra steps
let Source = Table.NestedJoin( SharePoint, {"Serial #"}, CAPScrapWeekly, {"Serial #"}, "CAPScrapWeekly", JoinKind.LeftOuter ), Expanded = Table.ExpandTableColumn(Source, "CAPScrapWeekly", {"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"} ), AddedProdDate = Table.AddColumn(Expanded, "Prod Date", each if DateTime.Time([Datetime]) < Time.From("07:00") then DateTime.Date(Date.AddDays([Datetime],-1)) else DateTime.Date([Datetime]), type date ), AddedWeekNum = Table.AddColumn(AddedProdDate, "WeekNum", each Text.PadStart(Text.From(Date.WeekOfYear([Prod Date])),2,"0"), type text ), ReplacedNullCQ = Table.ReplaceValue(AddedWeekNum,null,"-",Replacer.ReplaceValue,{"CQ"}), ReplacedProdDate = Table.ReplaceValue(ReplacedNullCQ, each [CAPScrapWeekly.Prod Date], each [CQ], (a,b,c) as date => if c <> "-" then b else a, {"Prod Date"} ), RemovedCapsProdDate = Table.RemoveColumns(ReplacedProdDate,{"CAPScrapWeekly.Prod Date"}), AddedMonth = Table.AddColumn(RemovedCapsProdDate, "Month", each Date.MonthName([Prod Date]), type text), AddedDateShift = Table.AddColumn(AddedMonth, "Date-Shift", each if DateTime.Time([Datetime])>=Time.From("7:00") and DateTime.Time([Datetime])<Time.From("19:00") then Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "D" else Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "N", type text ) in AddedDateShift
Hope this helps a bit
- peiyezhuBronze Contributorbasically goes into a Sharepoint folder,
How about download sheets first and then run consolidate all sheets?
I guess the network may.delay the process. - LorenzoSilver Contributor
Hi JCouture17
Logically, if time to refresh increases every week this means you have at least one table that grows every week... Generic recommendation: get rid of the rows & columns you don't need as early as possible (cf. Curt H. keep things in "table space")
If you Buffer your tables presumably you already tested perf. with and w/o buffering. Otherwise be aware that buffering large tables can be counterproductive (cf. Curt H. in this post)- Optimising The Performance Of Power Query Merges In Power BI, Part 1: Removing Columns
- Optimising The Performance Of Power Query Merges In Power BI, Part 2: Does When You Remove Columns Matter?
- Optimising The Performance Of Power Query Merges In Power BI, Part 3: Table.Join And SortMerge
- Optimising The Performance Of Power Query Merges In Power BI, Part 4: Table.Join And Other Join Algorithms
- Speed/Performance aspects
As a side note, very tiny possible optimization. You can replace:
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value", each [Prod Date], each if [CQ] <> "-" then [CAPScrapWeekly.Prod Date] else [Prod Date], Replacer.ReplaceValue,{"Prod Date"} ), #"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value1",{{"Prod Date", type date}}), #"Removed Columns1" = Table.RemoveColumns(#"Changed Type4",{"CAPScrapWeekly.Prod Date"}),
with
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value", each [CAPScrapWeekly.Prod Date], each [CQ], (a,b,c) as date => if c <> "-" then b else a, {"Prod Date"} ), #"Removed Columns1" = Table.RemoveColumns(#"Replaced Value1",{"CAPScrapWeekly.Prod Date"}),