Forum Discussion

Andrew__K's avatar
Andrew__K
Brass Contributor
Dec 04, 2023
Solved

PowerQuery Remove Rows dependent on value in column.

Labelling software generates a transaction report that represents each label generated, I pull data from this report using PowerQuery. Each row is an individual label or transaction.

 

When an error is made during the labelling process and the operator deletes/reverses that label, the transaction report shows it as a normal transaction with a 1 in a Transaction Type (Trx Type) column, followed by another row with the same information, but has a -1 as the transaction type to represent the deleted label.

 

Is there any way I can remove both these rows in my query? I know I can filter -1 Trx Types, but this leaves the original transaction in place. There is also a serial column for each transaction, these match on the -1 and 1 rows, so could potentially be used? Below is an example of the report.

 

Label Date Product PPN Description Prod Date Serial Number Trx Type
4/12/202399999PRODUCT LABEL2/11/2023700011
4/12/202399999PRODUCT LABEL2/11/2023700021
4/12/202399999PRODUCT LABEL2/11/2023700031
4/12/202399999PRODUCT LABEL2/11/2023700041
4/12/202399999PRODUCT LABEL2/11/2023700051
4/12/202399999PRODUCT LABEL2/11/2023700061
4/12/202399999PRODUCT LABEL2/11/2023700071
4/12/202399999PRODUCT LABEL2/11/2023700081
4/12/202399999PRODUCT LABEL2/11/2023700091
4/12/202399999PRODUCT LABEL2/11/2023700101
4/12/202399999PRODUCT LABEL2/11/2023700111
4/12/202399999PRODUCT LABEL(*Deleted*)2/11/202370011-1
4/12/202399999PRODUCT LABEL2/11/2023700121
4/12/202399999PRODUCT LABEL2/11/2023700131
  • Andrew__K 

    As variant

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        Flagged = Table.SelectRows(Source, each ([Trx Type] = -1)),
        RemoveThem = Table.SelectRows(
            Source,
            each not List.Contains( Flagged[#" Serial Number "], [#" Serial Number "] )
        ),
        DeclareType = Table.TransformColumnTypes(
            RemoveThem,
            {
                  {"Label Date", type date}
                , {" Product PPN ", Int64.Type}
                , {"Description ", type text}
                , {"Prod Date", type date}
                , {" Serial Number ", Int64.Type}
                , {"Trx Type", Int64.Type}
        })
    
    in
        DeclareType
  • Lorenzo's avatar
    Lorenzo
    Silver Contributor

    Andrew__K 

     

    If your records are ordered as you shared a possible optimization when grouping [Serial Number]:

    GroupedSerial = Table.Group(Source, {" Serial Number "},
        {
            {"COUNT", each Table.RowCount(_), Int64.Type},
            {"DATA", each _, type table}
        },
        GroupKind.Local	// Possible optimization
    ),

     

    Interesting when you have thousandsss of records otherwise you'll likely not see the difference
    (GroupKind.Type)

    • Andrew__K's avatar
      Andrew__K
      Brass Contributor
      Thank you! with your suggestion, I was able to find a solution.

      I ended up creating a helper merged column with PPN-DATE-SERIAL, then grouped by this column and filtered out any rows >1. Then I just expanded the All-rows column and removed the helper column. Thank you!
      • Lorenzo's avatar
        Lorenzo
        Silver Contributor

        Andrew__K 

        Expanding was another option I did not mention (doable with the UI only). Anyway, glad you have it working and Thanks for providing feedback

  • Riny_van_Eekelen's avatar
    Riny_van_Eekelen
    Platinum Contributor

    Andrew__K I would Group (Count Rows) by Serial Number. Keep only serial numbers that have a count of 1. Then merge that table with the table from the Source step (RightOuter) and clean up a bit.

     

    See attached file.

     

Resources