Forum Discussion
Get record with latest date by criteria from excel table and handle blanks
Dear colleagues,
I maintain a Covid test record list - name, result, date
Name | Test Result | Test date |
Alan | negative | 01.01.2021 |
Bred | negative | 01.01.2021 |
Alan | positive | 01.02.2021 |
Cindy | positive | 01.01.2021 |
Bred | negative | 02.03.2021 |
Alan | negative | 01.03.2021 |
Cindy | not tested | |
Bred | positive | 05.03.2021 |
The task is to extract an unique list of employees with their latest covid test.
The problem is, that the database contains records with "not tested" entries (the person already had Covid) and these records should be considered in the statistic.
I tried MAXIFS Date or Pivot Table, but the result is not 100%.
Power Query or helper column is eventually possible.
See attachment with details.
Thank You for a hint.
Zdenek Moravec
Cesky Krumlov, Czech Republic
The purpose of Index column is to fix the table in memory. Alternatively you may wrap it by Table.Buffer(). The only small difference you may manipulate with Index from UI only, Table.Buffer() shall be added manually in editor. But result is the same.
Without fixing in memory Table.Distinct could give wrong result (could not) due to lazy evaluation performed by Power Query.
18 Replies
- PeterBartholomew1Silver Contributor
If you wanted a formula it is a question of having the right version of Excel.
= UNIQUE(T_Data[Name]) = XLOOKUP(distinct#,T_Data[Name],T_Data[Test Result],,,-1) = XLOOKUP(distinct#,T_Data[Name],T_Data[Test date],,,-1)
gives a solution in 3 parts. You could also return the entire table with one formula
= LET( distinct, UNIQUE(T_Data[Name]), r, XMATCH(distinct, T_Data[Name],,-1), INDEX(T_Data, r, {1,2,3}) )
Sorting can be introduced but it is not needed in this case.
- PeterBartholomew1Silver Contributor
Please accept my apologies for causing confusion. I had implemented a version of the formula that would accommodate entries out of date sequence but, wrongly, came to the conclusion that the situation was not likely to occur. Had I realised you were in a position to try the formula, including the workbook with its additional formula would have been helpful.
= LET( distinct, UNIQUE(T_Data[Name]), sorted, SORT(T_Data, 3), r, XMATCH(distinct, INDEX(sorted, ,1),,-1), INDEX(sorted, r, {1,2,3}) )
The n/a is achieved by number formatting and the blank test date will always appear as the last date.
- JackTradeOneCopper Contributor
Sorry to butt in (I don't mean to hijack the thread), but I have a clarification request:
I noticed that the return value of the LET function isINDEX(sorted, r, {1,2,3})
I imagine that the{1,2,3}
is intended as an array covering all 3 columns in "sorted" as the [column_num] parameter of INDEX. Is there a way to achieve the same thing when the number of columns in the array is not known in advance or may dynamically change?
- Zdenek_MoravecBrass Contributor
HelloPeterBartholomew1
I am learning the LET formular on this example. I roughly understand the logic.
The target of this post is to find not the LAST row of each name, but the LATEST row of each name (means the row with max test date of each person). It doesn't need to be on the bottom of the table.
Would You implement the MAXIFS formula into the LET?
(a helper column can contain 31.12.2099 for the "not tested" records)
Thank You
Zdenek
- SergeiBaklanDiamond Contributor
Split it on parts to make easier in maintenance
= LET( noDate, 99999, namesWithNoDate, COUNTIFS(T_Data[Name],T_Data[Name], T_Data[Test date],""), namesActualMaxDates, MAXIFS(T_Data[Test date],T_Data[Name],T_Data[Name]), datesWithNoDate, IF(T_Data[Test date]="",noDate,T_Data[Test date]), addFakeDate, IF( namesWithNoDate, noDate, namesActualMaxDates ), conditionMaxDate, addFakeDate=datesWithNoDate, filterOnMaxDate, FILTER(T_Data,conditionMaxDate ), IF(filterOnMaxDate=0,"",filterOnMaxDate) )
- Zdenek_MoravecBrass Contributor
Hello PeterBartholomew1
I love Power Query, but I see, it is time to learn again. We have M365 business and I see all the formulas. I got a new task to create a summary for my manager, so this time I can try to use LET instead of Power Query.
Can You please send me an attachment? When I copy the formula, I get errors
But the formula returns a result, only different:
Alan positive 0 Thank you
Zdenek
- SergeiBaklanDiamond Contributor
Zdenek, here is the file with PeterBartholomew1 formula, I only added substituting of blank date on empty string.
Formula shall be converted into your locale automatically.
- SergeiBaklanDiamond Contributor
Zdenec, if Power Query is an option that could be
let Source = Excel.CurrentWorkbook(){[Name="T_Data"]}[Content], #"Changed Type" = Table.TransformColumnTypes( Source, {{"Test date", type date}} ), #"Fake date" = Table.ReplaceValue( #"Changed Type", null,#date(9000, 12, 31), Replacer.ReplaceValue,{"Test date"} ), #"Sorted Rows" = Table.Sort( #"Fake date", {{"Name", Order.Ascending}, {"Test date", Order.Descending}} ), #"Fix in memory" = Table.AddIndexColumn( #"Sorted Rows", "Index", 0, 1, Int64.Type ), #"Keep distinct names" = Table.Distinct(#"Fix in memory", {"Name"}), #"Remove fake date" = Table.ReplaceValue( #"Keep distinct names", #date(9000, 12, 31), null,Replacer.ReplaceValue,{"Test date"} ), #"Remove Index" = Table.RemoveColumns(#"Remove fake date",{"Index"}) in #"Remove Index"
The idea is to temporary replace no date for not tested by far away date.
- Zdenek_MoravecBrass ContributorHello Sergei
Sort by name ascending, by date descending and remove duplicates - so easy is it!
The purpose of the Index column is to help Table.Distinct function to keep the first row? (I guess, because if I delete the index column from the process, the Table.Distinct function keeps the last row of each name ...). So how does the Table.Distinct work with the Index column?
Thank You
Zdenek- SergeiBaklanDiamond Contributor
The purpose of Index column is to fix the table in memory. Alternatively you may wrap it by Table.Buffer(). The only small difference you may manipulate with Index from UI only, Table.Buffer() shall be added manually in editor. But result is the same.
Without fixing in memory Table.Distinct could give wrong result (could not) due to lazy evaluation performed by Power Query.