Forum Discussion

Jpalaci1's avatar
Jpalaci1
Brass Contributor
Sep 29, 2021
Solved

Power Query Append Repeatability Issue

I have a raw data source that has 350K rows. Each month around 25K-30K new lines are added. I have two tables of an old mapping and new (tables in G:H and J:K). I must show the result in a pivot (M:Q) and have my results but not sure how to make repeatable. 

 

Here's my process I hope to make simpler and repeatable:

1) Two different workbooks I dump the raw data. Each workbook has either old or new mapping.

2) New workbook: I use Power Query to import the data from both workbooks then append the two together and output into a table.

3) Second new workbook: Use Power Query and create connection only to the workbook in step 2 then output my desired results in the Pivot.

 

My questions are: 

1) Am I doing this right? Can I need to be able to do the analysis I did in my example but not sure if I can do the same in one workbook rather than two for the old vs new lookups.

2) Can my process be improved? Can this be improved using Access rather than Excel? I need to improve processing as to setup (steps 1-3) took 1 hour alone for Power Query to process and then to refresh twice a month. 

 

2 Replies

  • SergeiBaklan's avatar
    SergeiBaklan
    Diamond Contributor

    Jpalaci1 

    I'd avoid using of extra intermediate workbook. You may do everything in another workbook - combine two sources by Power Query, load result in data model (not to the table in the grid) and build PivotTable  based on data model. Alternative land Power Query directly in PivotTable.

     

    Sure, Access and even MS SQL Server with related tools are more powerful, but before shifting on them check again if you did all performance optimizations. 

    Chris Webb has series of posts around that Chris Webb's BI Blog: Optimising The Performance Of Power Query Merges In Power BI, Part 3: Table.Join And SortMerge Chris Webb's BI Blog (crossjoin.co.uk), you may find other advices. 

    • Jpalaci1's avatar
      Jpalaci1
      Brass Contributor
      Thank you so much. As a temporary solution I dumped all completed data into a Access database and noticed my refresh time dropped dramatically. Refresh is only 49 seconds compared to the 5 1/2 to 6 minutes refresh via Excel.

      I'll read into this link you provide. Thank you!

Resources