Forum Discussion
Help Identifying Employees in a List
PaulyCA If it's just a list of employee names you need returned for files that were downloaded anonymously, the following formula might do the trick:
=UNIQUE(TOCOL(MAP(A2:A20, B2:B20, LAMBDA(a,b, IFS(COUNTIFS(A2:A20, a, B2:B20, "Anonymous")*(b<>"Anonymous"), b))), 2))
Alternatively, if you want to return both the filenames and employee names, you could try a using a double-filter formula as follows:
=LET(
anon_files, UNIQUE(FILTER(A2:A20, B2:B20="Anonymous")),
FILTER(A2:B20, ISNUMBER(XMATCH(A2:A20, anon_files))*(B2:B20<>"Anonymous"))
)
Please adjust the range references to meet your needs. If performance is an issue with 80K rows of data, the Advanced Filter feature can also be used with the same logic as the first formula shown above.
Please see the attached workbook...
- djclementsFeb 19, 2024Silver Contributor
PaulyCA As an improvement to my previous post using the TOCOL / IFS method, the MAP function is not needed as COUNTIFS can spill its results for the entire dataset by passing the entire criteria_range1 to the criteria1 argument:
=UNIQUE(TOCOL(IFS(COUNTIFS(A2:A20, A2:A20, B2:B20, "Anonymous")*(B2:B20<>"Anonymous"), B2:B20), 2))
This method can also be combined with the WRAPROWS function to return both the filenames and employee names if desired (the FILTER function is not needed here):
=UNIQUE(WRAPROWS(TOCOL(IFS(COUNTIFS(A2:A20, A2:A20, B2:B20, "Anonymous")*(B2:B20<>"Anonymous"), A2:B20), 2), 2))
See attached...
- m_tarlerFeb 19, 2024Bronze Contributor
djclements I like your use of countifs to make the conditional but might suggest using FILTER as a more direct output:
=FILTER(A2:B20,COUNTIFS(A2:A20,A2:A20,B2:B20,"Anonymous")*(B2:B20<>"Anonymous"),"")
then B2:B20 would output just the User and A2:B20 (as shown above) outputs both
- djclementsFeb 19, 2024Silver Contributor
m_tarler Fair point. I've been experimenting with TOCOL / IFS recently to see where it can be used. I've read that, in general, TOCOL is faster than FILTER on larger datasets; however, I just tested both formulas in this scenario with 80,000 rows of data, and they were both equally poor to extract the results (30 seconds each). The COUNTIFS function was to blame for the poor performance, though...
As it turns out, the double-filter formula I shared in my first post successfully processed 80,000 rows of data in a split second:
=LET( anon_files, UNIQUE(FILTER(A2:A80000, B2:B80000="Anonymous")), FILTER(A2:B80000, ISNUMBER(XMATCH(A2:A80000, anon_files))*(B2:B80000<>"Anonymous")) )
I probably should've done a full test on a large dataset before posting to see which method was worth sharing... oh well.
BTW, not sure if you noticed, but the formula you shared in your first post was virtually identical in structure to mine, with just a slight variation to return the employee names only. 😉