Forum Discussion

Zdenek_Moravec's avatar
Zdenek_Moravec
Brass Contributor
Mar 10, 2021
Solved

Get record with latest date by criteria from excel table and handle blanks

Dear colleagues,

I maintain a Covid test record list - name, result, date

NameTest ResultTest date
Alannegative01.01.2021
Brednegative01.01.2021
Alanpositive01.02.2021
Cindypositive01.01.2021
Brednegative02.03.2021
Alannegative01.03.2021
Cindynot tested 
Bredpositive05.03.2021

The task is to extract an unique list of employees with their latest covid test.

The problem is, that the database contains records with "not tested" entries (the person already had Covid) and these records should be considered in the statistic.

I tried MAXIFS Date or Pivot Table, but the result is not 100%.

Power Query or helper column is eventually possible.

See attachment with details.

Thank You for a hint.

Zdenek Moravec

Cesky Krumlov, Czech Republic

  • SergeiBaklan's avatar
    SergeiBaklan
    Mar 11, 2021

    Zdenek_Moravec 

    The purpose of Index column is to fix the table in memory. Alternatively you may wrap it by Table.Buffer(). The only small difference you may manipulate with Index from UI only, Table.Buffer() shall be added manually in editor. But result is the same.

     

    Without fixing in memory Table.Distinct could give wrong result (could not) due to lazy evaluation performed by Power Query.

18 Replies

  • Zdenek_Moravec 

    If you wanted a formula it is a question of having the right version of Excel.

    = UNIQUE(T_Data[Name])
    = XLOOKUP(distinct#,T_Data[Name],T_Data[Test Result],,,-1)
    = XLOOKUP(distinct#,T_Data[Name],T_Data[Test date],,,-1)

    gives a solution in 3 parts.  You could also return the entire table with one formula

    = LET(
      distinct, UNIQUE(T_Data[Name]),
      r, XMATCH(distinct, T_Data[Name],,-1),
      INDEX(T_Data, r, {1,2,3}) )

    Sorting can be introduced but it is not needed in this case.

    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      PeterBartholomew1 

      Please accept my apologies for causing confusion.  I had implemented a version of the formula that would accommodate entries out of date sequence but, wrongly, came to the conclusion that the situation was not likely to occur.  Had I realised you were in a position to try the formula, including the workbook with its additional formula would have been helpful.

      = LET(
        distinct, UNIQUE(T_Data[Name]),
        sorted, SORT(T_Data, 3),
        r, XMATCH(distinct, INDEX(sorted, ,1),,-1),
        INDEX(sorted, r, {1,2,3}) )

       

      The n/a is achieved by number formatting and the blank test date will always appear as the last date.

      • JackTradeOne's avatar
        JackTradeOne
        Copper Contributor

        Sorry to butt in (I don't mean to hijack the thread), but I have a clarification request:
        I noticed that the return value of the LET function is

         

         

        INDEX(sorted, r, {1,2,3}) 

         

         


        I imagine that the

        {1,2,3}

        is intended as an array covering all 3 columns in "sorted" as the [column_num] parameter of INDEX. Is there a way to achieve the same thing when the number of columns in the array is not known in advance or may dynamically change?

    • Zdenek_Moravec's avatar
      Zdenek_Moravec
      Brass Contributor

      HelloPeterBartholomew1 

      I am learning the LET formular on this example. I roughly understand the logic.

      The target of this post is to find not the LAST row of each name, but the LATEST row of each name (means the row with max test date of each person). It doesn't need to be on the bottom of the table.

      Would You implement the MAXIFS formula into the LET?

      (a helper column can contain 31.12.2099 for the "not tested" records)

      Thank You

      Zdenek

      • SergeiBaklan's avatar
        SergeiBaklan
        Diamond Contributor

        Zdenek_Moravec 

        Split it on parts to make easier in maintenance

        = LET(
           noDate,              99999,
           namesWithNoDate,     COUNTIFS(T_Data[Name],T_Data[Name], T_Data[Test date],""),
           namesActualMaxDates, MAXIFS(T_Data[Test date],T_Data[Name],T_Data[Name]),
           datesWithNoDate,     IF(T_Data[Test date]="",noDate,T_Data[Test date]),
           addFakeDate,         IF( namesWithNoDate, noDate, namesActualMaxDates ),
           conditionMaxDate,    addFakeDate=datesWithNoDate,
           filterOnMaxDate,     FILTER(T_Data,conditionMaxDate ),
           IF(filterOnMaxDate=0,"",filterOnMaxDate)
        )
    • Zdenek_Moravec's avatar
      Zdenek_Moravec
      Brass Contributor

      Hello PeterBartholomew1

      I love Power Query, but I see, it is time to learn again. We have M365 business and I see all the formulas. I got a new task to create a summary for my manager, so this time I can try to use LET instead of Power Query.

      Can You please send me an attachment? When I copy the formula, I get errors

      But the formula returns a result, only different:

      Alan
      positive
      0

      Thank you

      Zdenek

  • SergeiBaklan's avatar
    SergeiBaklan
    Diamond Contributor

    Zdenek_Moravec 

    Zdenec, if Power Query is an option that could be

    let
        Source = Excel.CurrentWorkbook(){[Name="T_Data"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(
            Source,
            {{"Test date", type date}}
        ),
        #"Fake date" = Table.ReplaceValue(
            #"Changed Type",
            null,#date(9000, 12, 31),
            Replacer.ReplaceValue,{"Test date"}
        ),
        #"Sorted Rows" = Table.Sort(
            #"Fake date",
            {{"Name", Order.Ascending}, {"Test date", Order.Descending}}
        ),
        #"Fix in memory" = Table.AddIndexColumn(
            #"Sorted Rows",
            "Index", 0, 1, Int64.Type
        ),
        #"Keep distinct names" = Table.Distinct(#"Fix in memory", {"Name"}),
        #"Remove fake date" = Table.ReplaceValue(
            #"Keep distinct names",
            #date(9000, 12, 31),
            null,Replacer.ReplaceValue,{"Test date"}
        ),
        #"Remove Index" = Table.RemoveColumns(#"Remove fake date",{"Index"})
    in
        #"Remove Index"

    The idea is to temporary replace no date for not tested by far away date.

     

    • Zdenek_Moravec's avatar
      Zdenek_Moravec
      Brass Contributor
      Hello Sergei
      Sort by name ascending, by date descending and remove duplicates - so easy is it!
      The purpose of the Index column is to help Table.Distinct function to keep the first row? (I guess, because if I delete the index column from the process, the Table.Distinct function keeps the last row of each name ...). So how does the Table.Distinct work with the Index column?
      Thank You
      Zdenek
      • SergeiBaklan's avatar
        SergeiBaklan
        Diamond Contributor

        Zdenek_Moravec 

        The purpose of Index column is to fix the table in memory. Alternatively you may wrap it by Table.Buffer(). The only small difference you may manipulate with Index from UI only, Table.Buffer() shall be added manually in editor. But result is the same.

         

        Without fixing in memory Table.Distinct could give wrong result (could not) due to lazy evaluation performed by Power Query.

Resources