SOLVED

Power Query List

Copper Contributor

Hi all, could anyone please help with M language to show how to calculate project allocation dynamically when number of projects can be changed in the example provided, e.g. Project 1 allocation = Cost * Project 1 and so on.

DON_DCS_0-1688610271159.png

Thank you!

16 Replies
best response confirmed by HansVogelaar (MVP)
Solution

@DON_DCS Perhaps the attached example helps you on your way. It's a rather simplistic, step by step approach that probably can be made more effective, but then it would most likely be more difficult to follow.

 

The query unpivots the data first. Then it adds the calculation of the allocation for each row and adds a column to hold allocation column names.

Then it pivots only the columns relevant to the allocations and merges it with the original source, based on the project name. Expand the merged column based on the dynamic list of allocation column names created earlier.

 

See if you can follow the step and apply them to your own file. If not come back here.

@Riny_van_Eekelen excellent solution. Thank you very much!

@DON_DCS 

To play with code a bit

let
  Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],

  newFieldNames = List.Transform(
    List.Skip( Table.ColumnNames( Source ), 2),
    (n) => n & " Allocation" ),
  addAllocation = Table.AddColumn(
    Source,
    "Custom",
    (r) => Record.FromList(
        List.Transform(
          List.Skip( Record.FieldValues(r), 2),
          (q) => q*Record.FieldValues(r){1}
        ),
        newFieldNames
    ) ),
  expanded = Table.ExpandRecordColumn(
    addAllocation,
    "Custom",
    newFieldNames)
in
  expanded

@SergeiBaklan this is amazing, you've used more functions than steps 🙂

I actually prefer your solution. However, can we make it more dynamic, e.g. with the List.Skip(... ,2) there. The condition here is Cost will be the key column before the table can be extended both sides left or right.

@DON_DCS 

That could be like

let
  Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],

  names = Table.ColumnNames( Source ),
  costPosition = List.PositionOf( names, "Cost" ),
  newFieldNames = List.Transform(
    List.Skip( names, costPosition + 1),
    (n) => n & " Allocation" ),
  addAllocation = Table.AddColumn(
    Source,
    "Custom",
    (r) => Record.FromList(
        List.Transform(
          List.Skip( Record.FieldValues(r), costPosition + 1),
          (q) => q*Record.FieldValues(r){costPosition}
        ),
        newFieldNames
    ) ),
  expanded = Table.ExpandRecordColumn(
    addAllocation,
    "Custom",
    newFieldNames)
in
  expanded

Please see in the second sheet attached.

@SergeiBaklan it's perfect! Thank you so much!

@DON_DCS , you are welcome, glad to help.

@SergeiBaklan I'm thinking if it's possible to use Table.ReplaceValue to override the project columns with the allocation calculation, i.e. eliminate the additional columns creation?

@DON_DCS 

See Table4 in Sheet3:

 

// Table4
let
    Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
    PCols = List.Select(Table.ColumnNames(Source),
        (c)=> Text.StartsWith(c,"P")
    ),
    ChangedTypes = Table.TransformColumnTypes(Source,
        {{"Name", type text}, {"Division", type text}, {"Cost", type number}} &
        List.Transform(PCols, each {_, type number})
    ),
    ReplacedValues = Table.ReplaceValue(ChangedTypes, each [Cost], null,
        (p,c,n) as number => p*c,
        PCols
    )
in
    ReplacedValues
Great solution @Lorenzo - Thank you so much!
The key step here is to change the type of the list to Number
Table.TransformColumnTypes(Source,List.Transform(PCols, each {_, type number})

Glad this helps @DON_DCS 

 

The key step here is to change the type of the list to Number
Table.TransformColumnTypes(Source,List.Transform(PCols, each {_, type number})

This doesn't change the list items to type Number, this changes the list of column names to type Number. And - IMHO - the other key point is we maintain the column type during the replacement with:

(p,c,n) as number => p*c

Note that this works because the columns were previously typed as number otherwise this would have no effect

 

FYI A couple of months ago the exact same case was raised on MS Answers. After testing various options on a table of 100k rows with 50 columns to update (the OP had > 90), the following approach was +/- 2 x faster than the Table.ReplaceValue approach:

 

// Table5
let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    ChangedTypes = Table.TransformColumnTypes(Source,
        {{"Name", type text}, {"Division", type text}, {"Cost", type number}}
    ),
    ColumnsToUpdate = List.Select(Table.ColumnNames(Source),
        (c)=> Text.StartsWith(c,"P")
    ),
    TypeForColumnsToUpdate = Type.ForRecord(
        List.Accumulate(ColumnsToUpdate, [],
            (x,y) => Record.AddField(x,y, [Type=type nullable number, Optional=false])
        ),
        false
    ),
    AddedTable = Table.AddColumn(ChangedTypes, "TEMP_TABLE",
        (r)=> Table.FromRows( 
            {List.Transform(Record.ToList(Record.SelectFields(r,ColumnsToUpdate)), each _ * r[Cost])},
            ColumnsToUpdate
        ),
        type table TypeForColumnsToUpdate
    ),
    RemovedColumns = Table.RemoveColumns(AddedTable, ColumnsToUpdate), 
    ExpandedTable = Table.ExpandTableColumn(RemovedColumns, "TEMP_TABLE", ColumnsToUpdate)
in
    ExpandedTable

 

This is genius @Lorenzo I've tested adding 12K rows and 200 columns, the Table.AddColumn approach is at least 15 seconds faster.
Still try to understand the TypeForColumnsToUpdate and why did you put a false after that and type table TypeForColumnsToUpdate after Table.AddColumn 🙂

@DON_DCS 

 

Still try to understand the TypeForColumnsToUpdate and why did you put a false
TBH I have no idea why this must be a closed record (open=false)

 

Below is another option:

// Table6
let
    Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
    ColumnsToUpdate = List.Select(Table.ColumnNames(Source),
        (c)=> Text.StartsWith(c,"P")
    ),
    ChangedTypes = Table.TransformColumnTypes(Source,
        {{"Name", type text}, {"Division", type text}, {"Cost", type number}} &
        List.Transform(ColumnsToUpdate, each {_, type number})
    ),
    AddedTable = Table.AddColumn(ChangedTypes, "TEMP_TABLE",
        (r)=> Table.FromRows( 
            {List.Transform(Record.ToList(Record.SelectFields(r,ColumnsToUpdate)), each _ * r[Cost])},
            ColumnsToUpdate
        ),
        type table
    ),
    RemovedColumns = Table.RemoveColumns(AddedTable, ColumnsToUpdate),
    ExpandedTable = Table.ExpandTableColumn(RemovedColumns, "TEMP_TABLE", ColumnsToUpdate),
    RestoredTypes = Value.ReplaceType(ExpandedTable, Value.Type(ChangedTypes))
in
    RestoredTypes

Now, if the output of your query loads to an Excel sheet only you can do it with untyped columns, i.e.:

// Table7
let
    Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
    ColumnsToUpdate = List.Select(Table.ColumnNames(Source),
        (c)=> Text.StartsWith(c,"P")
    ),
    ChangedTypes = Table.TransformColumnTypes(Source,
        {{"Name", type text}, {"Division", type text}, {"Cost", type number}}
    ),
    AddedTable = Table.AddColumn(ChangedTypes, "TEMP_TABLE",
        (r)=> Table.FromRows( 
            {List.Transform(Record.ToList(Record.SelectFields(r,ColumnsToUpdate)), each _ * r[Cost])},
            ColumnsToUpdate
        ),
        type table
    ),
    RemovedColumns = Table.RemoveColumns(AddedTable, ColumnsToUpdate),
    ExpandedTable = Table.ExpandTableColumn(RemovedColumns, "TEMP_TABLE", ColumnsToUpdate)
in
    ExpandedTable

@DON_DCS 

On a different case Table.FromRows was slower than Table.Tranpose(Table.FromColumns()). Tried the following approaches on a table of 10K rows x 200 columns to update:

 

AddedTable = Table.AddColumn(ChangedTypes, "TEMP_TABLE",
    (r)=> Table.Transpose(
            Table.FromList(
                List.Transform(Record.ToList(Record.SelectFields(r,ColumnsToUpdate)), each _ * r[Cost]),
                Splitter.SplitByNothing()
            ),
            ColumnsToUpdate
        ),
    type table
),
AddedTable = Table.AddColumn(ChangedTypes, "TEMP_TABLE",
    (r)=> Table.Transpose(
            Table.FromColumns(
                {List.Transform(Record.ToList(Record.SelectFields(r,ColumnsToUpdate)), each _ * r[Cost])}
            ),
            ColumnsToUpdate
        ),
    type table
),

Always difficult to measure precisely but the above options appeared to be slightly faster. After buffering the list of columns to update the difference was obvious:

 

With Table.FromRows, avg time to refresh: 00:00:10
With above options, avg time to refresh: 00:00:06

This is getting better and better @Lorenzo 

I prefer Transpose from Column, it has the same parameter and not required Splitter.SplitByNothing like Transpose from List :smile:

@DON_DCS 

 

Tried a few other approaches. The following did not appear to be faster with 10K rows. With 50K rows it's 3 to 4 secs (avg.) faster than the other options

 

let
    Source = Excel.CurrentWorkbook(){[Name="SourceTable"]}[Content],
    ColumnsToUpdate = List.Buffer(
        List.Select(Table.ColumnNames(Source), (c)=> Text.StartsWith(c, "P"))
    ),
    ChangedTypes = Table.TransformColumnTypes(Source,
        {{"Name", type text}, {"Division", type text}, {"Cost", type number}} &
        List.Transform(ColumnsToUpdate, each {_,type number})
    ),
    AddedRecord = Table.AddColumn(ChangedTypes, "TEMP_RECORD",
        (r)=> Record.FromList(
            List.Transform(ColumnsToUpdate, (c)=> r[Cost]*Record.Field(r,c)),
            ColumnsToUpdate
        ),
        type record
    ),
    RemovedColumns = Table.RemoveColumns(AddedRecord, ColumnsToUpdate),
    Expanded = Table.ExpandRecordColumn(RemovedColumns, "TEMP_RECORD", ColumnsToUpdate),
    RestoredTypes = Value.ReplaceType(Expanded, Value.Type(ChangedTypes))
in
    RestoredTypes

FYI:
- I disabled Query Option "Allow data previews to download in the background"
- Disabling the PQ Firewall (Query Option "Ignore the Pivacy Levels...") slightly improved the perf.

 

To help people who Search may I ask you to revise the title of this thread so it better reflects what's expected? Thanks

 

 

1 best response

Accepted Solutions
best response confirmed by HansVogelaar (MVP)
Solution

@DON_DCS Perhaps the attached example helps you on your way. It's a rather simplistic, step by step approach that probably can be made more effective, but then it would most likely be more difficult to follow.

 

The query unpivots the data first. Then it adds the calculation of the allocation for each row and adds a column to hold allocation column names.

Then it pivots only the columns relevant to the allocations and merges it with the original source, based on the project name. Expand the merged column based on the dynamic list of allocation column names created earlier.

 

See if you can follow the step and apply them to your own file. If not come back here.

View solution in original post