Forum Discussion

WalterM91's avatar
WalterM91
Copper Contributor
Jul 12, 2023
Solved

Duplicate values in a column when another column has duplicated values

Hi, I would need your kind support on a matter regarding the Power Query Editor for an Excel file I am developing.

 

I have a table as follows:

 

 ID    |    Letter   |    Value   

001  |       A        |        5

002  |       B        |        4

003  |       C        |        3

        |       A        |        1

        |       B        |        2

        |       D        |        4

       

The desired output of the Power Query should be:

 

 ID    |    Letter   |    Value   

001  |       A        |        1

002  |       B        |        2

003  |       C        |        3

        |       D        |        4

 

The Power Query should check if there are duplicated values in the field "Letter" (in the above example this is true for the values A and B). In those entries where the field "Letter" has duplicated values and the field ID is empty, the query should assign the same ID as the the one of the corresponding duplicated value (in the above example: Letter "A" corresponds to the ID "001", Letter "B" corresponds to the ID "002"). Finally, the Query should delete duplicates, keeping the entries where the ID has been copied.

I thank you for the time you will spend on this request.

Walter

  • WalterM91 

    As variant

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        names = Table.ColumnNames( Source ),
        #"Changed Type" = Table.TransformColumnTypes(
            Source,
            {
                {"ID", type text},
                {"Letter", type text},
                {"Value", Int64.Type}
            }),
        #"Grouped Rows" = Table.Group(
            #"Changed Type",
            {"Letter"},
            {
                {"Data",
                    each [
                        tblOrder = Table.Sort(_,{{"ID", Order.Ascending}}),
                        fillUp = Table.FillUp(tblOrder,{"ID"}),
                        getRow = Table.FirstN( fillUp, 1 )
                    ][getRow]
                     
                }
            }),
        keepData = Table.SelectColumns(#"Grouped Rows",{"Data"}),
        expandData = Table.ExpandTableColumn(
            keepData,
            "Data",
            names, names)
    in
        expandData

    Please check attached.

  • peiyezhu's avatar
    peiyezhu
    Bronze Contributor

    select * from basic_group_by limit 20;
    select * from basic_group_by group by Letter;

     

    • WalterM91's avatar
      WalterM91
      Copper Contributor

      peiyezhu  thanks for your response. However, I am trying to get a solution in Power Query, not in sql 

  • WalterM91 

    As variant

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        names = Table.ColumnNames( Source ),
        #"Changed Type" = Table.TransformColumnTypes(
            Source,
            {
                {"ID", type text},
                {"Letter", type text},
                {"Value", Int64.Type}
            }),
        #"Grouped Rows" = Table.Group(
            #"Changed Type",
            {"Letter"},
            {
                {"Data",
                    each [
                        tblOrder = Table.Sort(_,{{"ID", Order.Ascending}}),
                        fillUp = Table.FillUp(tblOrder,{"ID"}),
                        getRow = Table.FirstN( fillUp, 1 )
                    ][getRow]
                     
                }
            }),
        keepData = Table.SelectColumns(#"Grouped Rows",{"Data"}),
        expandData = Table.ExpandTableColumn(
            keepData,
            "Data",
            names, names)
    in
        expandData

    Please check attached.

Resources