Forum Discussion

LC_2020's avatar
LC_2020
Copper Contributor
Jan 16, 2020

Combining data from multiple rows into a single row taking earliest date

I have a data set that looks like this: 

 

IDNameDriver's LiscenceSubmission DateApproval Date
341Kenner, Jen322149/16/199/15/19
318Kenner, Jen322149/18/199/19/19
302Kenner, Jen322149/19/19 

 

I want to combine the rows with the same name despite different IDs and take the earliest submission date (i.e. 9/16/19) and the oldest approval date (i.e. 9/19/19) and have them appear as one row (condensing the info). So maybe it would look something like this: 

 

ID_newNameDriver's LiscenceSubmission Date_newApproval Date_new
341318302(?, or generating a whole new ID #)Kenner, Jen322149/16/199/19/19

 

Any help is greatly appreciated!! 

 

2 Replies

  • SergeiBaklan's avatar
    SergeiBaklan
    Diamond Contributor

    LC_2020 

    For such model

    If with formulas:

    Select unique names

    =IFERROR(INDEX($C$4:$C$7,AGGREGATE(15,6,1/(COUNTIF($I$3:$I3,$C$4:$C$7)=0)*(ROW($C$4:$C$7)-ROW($C$3)),1)),"")

    Driver license

    =IFNA(INDEX($D$4:$D$7,MATCH($I4,$C$4:$C$7,0)),"")

    Submission date (earliest in list)

    =IFERROR(AGGREGATE(15,6,1/($C$4:$C$7=$I4)*$E$4:$E$7,1),"")

    Approval date (latest one)

    =IFERROR(AGGREGATE(14,6,1/($C$4:$C$7=$I4)*$F$4:$F$7,1),"")

    Combined ID:

    =TEXTJOIN("",TRUE,IF($C$4:$C$7=$I4,$B$4:$B$7,""))

    and drag all down till empty cells appear.

     

    If with Power Query:

    Query the list, Group by Names with proper aggregations on other columns, return back adding some cosmetic:

    let
        Source = Excel.CurrentWorkbook(),
        SourceList = Source{[Name="SourceList"]}[Content],
        PromotHeaders = Table.PromoteHeaders(
            SourceList,
            [PromoteAllScalars=true]
        ),
        FilterNulls = Table.SelectRows(
            PromotHeaders,
            each ([Name] <> null)
        ),
        ChangeType = Table.TransformColumnTypes(
            FilterNulls,
            {
                {"ID", type text},
                {"Name", type text},
                {"Driver's Liscence", type text},
                {"Submission Date", type date},
                {"Approval Date", type date}
            }
        ),
        GroupNames = Table.Group(
            ChangeType,
            {"Name"},
            {
                {"ID", each Text.Combine([ID]), type text},
                {"Driver's Liscence", each List.First([#"Driver's Liscence"]), type text},
                {"Submission Date", each List.Min([Submission Date]), type date},
                {"Approval Date", each List.Max([Approval Date]), type date}
            }
        ),
        ReorderColumns = Table.ReorderColumns(
            GroupNames,
            {"ID", "Name", "Submission Date", "Approval Date"}
        )
    in
        ReorderColumns
  • mathetes's avatar
    mathetes
    Gold Contributor

    LC_2020 

     

    Could you upload an abbreviated version of your actual spreadsheet, so we don't need to recreate it?

     

    Give us also an idea of the scope of your data: how many rows, how many duplicate names, typically?

     

    An observation: one of the general rules of good database design is to never make "name" a field that includes first and last; rather, make each name a field of its own. It's too late for that now, but if you can at the same time break that field into two (or more, if you've got suffixes or prefixes) then do it. It's far easier to assemble two names into one--for printing output--than it is to break them apart.

Resources