Forum Discussion

JCouture17's avatar
JCouture17
Copper Contributor
Jun 12, 2023

Very slow Power Query refresh

Hey everyone, first-time poster here!

I am having some issues with a Query refresh that we have to do on a weekly basis. It basically goes into a Sharepoint folder, opens and merges all of the files in there, and then does a bunch of transformations so we can use the data to report our performance metrics.

The refresh takes ~15mins and only gets longer after each week. Any help that could make this faster would be appreciated!

 

let
Source = Sharepoint_Weekly,
#"Invoke Custom Function1" = Table.AddColumn(Source, "Transform File (2)", each #"Transform File (2)"([Content])),

#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Transform File (2)"}),
#"Expanded Transform File (2)1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (2)", {"ORDEN", "CODSUC", "CODMAQ", "NUMSER", "CODUSU"}, {"Datetime", "Product", "Press", "Serial #", "User ID"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Transform File (2)1", each ([Product] <> null)),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows1", {{"Datetime", type datetime}, {"Product", type text}, {"Press", type text}, {"Serial #", Int64.Type}, {"User ID", Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(Table.Buffer(#"Changed Type"), {"Serial #"}, Table.Buffer(CAPScrapWeekly), {"Serial #"}, "CAPScrapWeekly", JoinKind.LeftOuter),
#"Expanded CAPScrapWeekly" = Table.ExpandTableColumn(#"Merged Queries", "CAPScrapWeekly", {"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"}),
#"Added Custom6" = Table.AddColumn(#"Expanded CAPScrapWeekly", "Prod Date", each if DateTime.Time([Datetime]) < Time.From("07:00") then DateTime.Date(Date.AddDays([Datetime],-1)) else DateTime.Date([Datetime]), type date),
#"Added Custom" = Table.AddColumn(#"Added Custom6", "WeekNum", each Text.PadStart(Text.From(Date.WeekOfYear([Prod Date])),2,"0"), type text),
#"Replaced Value" = Table.ReplaceValue(#"Added Custom",null,"-",Replacer.ReplaceValue,{"CQ"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",each [Prod Date],each if [CQ]<>"-" then [CAPScrapWeekly.Prod Date] else [Prod Date],Replacer.ReplaceValue,{"Prod Date"}),

#"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value1",{{"Prod Date", type date}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type4",{"CAPScrapWeekly.Prod Date"}),
#"Added Custom3" = Table.AddColumn(#"Removed Columns1", "Month", each Date.MonthName([Prod Date])),

#"Added Custom7" = Table.AddColumn(#"Added Custom3", "Date-Shift", each if DateTime.Time([Datetime])>=Time.From("7:00") and DateTime.Time([Datetime])<Time.From("19:00") then Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "D" else Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "N")
in
#"Added Custom7"

 

  • Lorenzo's avatar
    Lorenzo
    Silver Contributor

    JCouture17 

     

    Context differs so don't draw conclusions from what's below. It's just to give you a sense of some refresh times

    PC Specs: Core i7 / SSD / 8Gb / Win11 / MSFT 365-64Bits

     

    Created following 2 Tables in an Excel workbook, each 1 Million rows. Required columns only. All [Serial #] match:

     

    Query1 - Merge tables + Expand

    let
        Source = Table.NestedJoin(
            SharePoint, {"Serial #"},
            CAPScrapWeekly, {"Serial #"},
            "CAPScrapWeekly", JoinKind.LeftOuter
        ),
        Expanded = Table.ExpandTableColumn(Source, "CAPScrapWeekly",
            {"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"}
        )
    in
        Expanded

     

    Query2 - Same as above + your extra steps

    let
        Source = Table.NestedJoin(
            SharePoint, {"Serial #"},
            CAPScrapWeekly, {"Serial #"},
            "CAPScrapWeekly", JoinKind.LeftOuter
        ),
        Expanded = Table.ExpandTableColumn(Source, "CAPScrapWeekly",
            {"CQ", "Prod Date"}, {"CQ", "CAPScrapWeekly.Prod Date"}
        ),
        AddedProdDate = Table.AddColumn(Expanded, "Prod Date", each
            if DateTime.Time([Datetime]) < Time.From("07:00")
            then DateTime.Date(Date.AddDays([Datetime],-1))
            else DateTime.Date([Datetime]),
            type date
        ),
        AddedWeekNum = Table.AddColumn(AddedProdDate, "WeekNum", each
            Text.PadStart(Text.From(Date.WeekOfYear([Prod Date])),2,"0"),
            type text
        ),
        ReplacedNullCQ = Table.ReplaceValue(AddedWeekNum,null,"-",Replacer.ReplaceValue,{"CQ"}),
        ReplacedProdDate = Table.ReplaceValue(ReplacedNullCQ,
            each [CAPScrapWeekly.Prod Date],
            each [CQ],
            (a,b,c) as date => if c <> "-" then b else a,
            {"Prod Date"}
        ),
        RemovedCapsProdDate = Table.RemoveColumns(ReplacedProdDate,{"CAPScrapWeekly.Prod Date"}),
        AddedMonth = Table.AddColumn(RemovedCapsProdDate, "Month", each Date.MonthName([Prod Date]), type text),
        AddedDateShift = Table.AddColumn(AddedMonth, "Date-Shift", each
            if DateTime.Time([Datetime])>=Time.From("7:00") and DateTime.Time([Datetime])<Time.From("19:00")
            then Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "D"
            else Date.ToText([Prod Date],"yyyy-MM-dd") & "-" & "N",
            type text
        )
    in
        AddedDateShift

     

    Hope this helps a bit

  • peiyezhu's avatar
    peiyezhu
    Bronze Contributor
    basically goes into a Sharepoint folder,

    How about download sheets first and then run consolidate all sheets?

    I guess the network may.delay the process.
  • Lorenzo's avatar
    Lorenzo
    Silver Contributor

    Hi JCouture17 

     

    Logically, if time to refresh increases every week this means you have at least one table that grows every week... Generic recommendation: get rid of the rows & columns you don't need as early as possible (cf. Curt H. keep things in "table space")
    If you Buffer your tables presumably you already tested perf. with and w/o buffering. Otherwise be aware that buffering large tables can be counterproductive (cf. Curt H. in this post)

     

     

    As a side note, very tiny possible optimization. You can replace:

     

        #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",
            each [Prod Date],
            each if [CQ] <> "-" then [CAPScrapWeekly.Prod Date] else [Prod Date],
            Replacer.ReplaceValue,{"Prod Date"}
        ),
    
        #"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value1",{{"Prod Date", type date}}),
        #"Removed Columns1" = Table.RemoveColumns(#"Changed Type4",{"CAPScrapWeekly.Prod Date"}),

     

    with

     

        #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",
            each [CAPScrapWeekly.Prod Date],
            each [CQ],
            (a,b,c) as date => if c <> "-" then b else a,
            {"Prod Date"}
        ),
        #"Removed Columns1" = Table.RemoveColumns(#"Replaced Value1",{"CAPScrapWeekly.Prod Date"}),

     

Resources