Help Identifying Employees in a List

Question

Hi! I have a list of about 80K rows. Each row is a file transfer: some are uploads by employees, and many are anonymous downloads. I am trying to match each anonymous download to the employee who uploaded the file.

Example: FileA shows it was downloaded anonymously. If I filter the File column by "FileA", I can see that the file was downloaded, and I can also see that it was uploaded by John Smith (our employee).

Not every file in this list was downloaded anonymously, and anonymous downloads are the only criterion I am interested in. The outcome will be a report of the employees (probably around 50 or so out of 80K lines) who use this system to upload files so that they can be downloaded anonymously.

So if there is a way to scan the entire list of file names and then output the name of the employee who uploaded each one, that would be great!

djclements · Answer

m_tarler
Fair point. I've been experimenting with TOCOL / IFS recently to see where it can be used. I've read that, in general, TOCOL is faster than FILTER on larger datasets; however, I just tested both formulas in this scenario with 80,000 rows of data, and they were equally slow to return the results (30 seconds each). The COUNTIFS function was to blame for the poor performance, though...

As it turns out, the double-filter formula I shared in my first post successfully processed 80,000 rows of data in a split second:

=LET(
    anon_files, UNIQUE(FILTER(A2:A80000, B2:B80000="Anonymous")),
    FILTER(A2:B80000, ISNUMBER(XMATCH(A2:A80000, anon_files))*(B2:B80000<>"Anonymous"))
)

I probably should've done a full test on a large dataset before posting to see which method was worth sharing... oh well.

BTW, not sure if you noticed, but the formula you shared in your first post was virtually identical in structure to mine, with just a slight variation to return the employee names only. 😉
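
To illustrate why the COUNTIFS version crawls while the double-filter version is near-instant, here is a rough Python sketch of the two strategies. It is not Excel and not taken from the attached workbook; the function names and the synthetic data are made up for illustration. The first function rescans the whole list for every row, which is presumably close to what a per-row COUNTIFS inside MAP amounts to. The second collects the anonymously downloaded files once and then checks each row against that set.

```python
ANON = "Anonymous"

def employees_by_rescan(rows):
    """Per-row rescan: for each upload row, walk the whole list again to
    count anonymous downloads of that file (roughly n * n comparisons)."""
    found = set()
    for file_name, user in rows:
        if user == ANON:
            continue
        anon_hits = sum(1 for f, u in rows if f == file_name and u == ANON)
        if anon_hits > 0:
            found.add(user)
    return sorted(found)

def employees_by_lookup(rows):
    """Two passes: collect anonymously downloaded files once, then filter."""
    anon_files = {f for f, u in rows if u == ANON}
    return sorted({u for f, u in rows if u != ANON and f in anon_files})

# Synthetic data (hypothetical): 200 files, each uploaded by one of 50
# made-up employees; every other file also has three anonymous downloads.
rows = []
for i in range(200):
    rows.append((f"File{i}", f"Employee{i % 50}"))
    if i % 2 == 0:
        rows.extend([(f"File{i}", ANON)] * 3)

# Both strategies return the same employees; only the amount of work differs.
assert employees_by_rescan(rows) == employees_by_lookup(rows)
```

The work in the first function grows with the square of the row count, so at 80,000 rows the gap between the two becomes very noticeable.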

hansvogelaar · Answer

PaulyCA
=SORT(UNIQUE(FILTER(uploaded_by_column, downloaded_by_column="Anonymous")))

paulyca · Answer

HansVogelaar - Thank you for your reply. We're getting close. Let's see if we can knock this out with more specifics.

Here is an example of what the sheet would look like:

Column A	Column B
FileA	Anonymous
FileA	Anonymous
FileA	Anonymous
FileA	Anonymous
FileA	John Smith
FileB	Anonymous
FileB	Anonymous
File B	Jane Smith

This shows that FileA was downloaded by Anonymous, but actually uploaded by John Smith. The file name will match every time for whoever downloads the data anonymously, but there will be only a single line for the upload by a single employee. That's who I am looking for.

The same thing applies to File B -- that was downloaded several times by Anonymous, but only uploaded by Jane.

So the output I am looking for from the 80K records is the list of employee names in Column B who uploaded something that was downloaded by the user "Anonymous".

I know this is really hard to explain -- appreciate the help!

hansvogelaar · Answer

PaulyCA
It was confusing since you had FileB and File B.
See the attached demo for a solution using an intermediate helper range that might be useful.
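
A mismatch like FileB versus File B will stop an exact-match lookup from pairing the upload with its downloads. As a rough illustration in Python (not the approach in the attached demo, which isn't shown here), one option is to compare on a cleaned-up key. The helper name `file_key` is hypothetical, and it assumes that names differing only in spaces or letter case refer to the same file.

```python
def file_key(name):
    """Build a comparison key: drop all whitespace and ignore letter case."""
    return "".join(name.split()).casefold()

# The FileB rows from the sample sheet, as (file, user).
rows = [("FileB", "Anonymous"), ("FileB", "Anonymous"), ("File B", "Jane Smith")]

# Without cleanup, the upload row does not match its downloads.
anon_raw = {f for f, u in rows if u == "Anonymous"}
print("File B" in anon_raw)             # False

# With cleanup, both spellings share one key.
anon_keys = {file_key(f) for f, u in rows if u == "Anonymous"}
print(file_key("File B") in anon_keys)  # True
```

In the sheet itself, a helper column such as =SUBSTITUTE(A2, " ", "") could play a similar role, if the spellings really are meant to be the same file.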

djclements · Answer

PaulyCA
If it's just a list of employee names you need returned for files that were downloaded anonymously, the following formula might do the trick:

=UNIQUE(TOCOL(MAP(A2:A20, B2:B20, LAMBDA(a,b, IFS(COUNTIFS(A2:A20, a, B2:B20, "Anonymous")*(b<>"Anonymous"), b))), 2))

Alternatively, if you want to return both the filenames and employee names, you could try using a double-filter formula as follows:

=LET(
    anon_files, UNIQUE(FILTER(A2:A20, B2:B20="Anonymous")),
    FILTER(A2:B20, ISNUMBER(XMATCH(A2:A20, anon_files))*(B2:B20<>"Anonymous"))
)

Please adjust the range references to meet your needs. If performance is an issue with 80K rows of data, the Advanced Filter feature can also be used with the same logic as the first formula shown above.

Please see the attached workbook...
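
The double-filter logic can also be sketched outside Excel. The Python below is only an illustration of the same two steps, using the sample rows from this thread with the "File B" row respelled as "FileB" (an assumption, so that the upload matches its downloads). Variable names are chosen for illustration.

```python
ANON = "Anonymous"

# Sample rows from the thread as (file, user).
# Assumption: "File B" is respelled "FileB" so it matches its downloads.
rows = [
    ("FileA", ANON), ("FileA", ANON), ("FileA", ANON), ("FileA", ANON),
    ("FileA", "John Smith"),
    ("FileB", ANON), ("FileB", ANON),
    ("FileB", "Jane Smith"),
]

# Step 1: files with at least one anonymous download (the anon_files step).
anon_files = {f for f, u in rows if u == ANON}

# Step 2: keep rows for those files where the user is not Anonymous.
uploads = [(f, u) for f, u in rows if f in anon_files and u != ANON]

# Employee names only, de-duplicated and sorted.
employees = sorted({u for _, u in uploads})

print(uploads)    # [('FileA', 'John Smith'), ('FileB', 'Jane Smith')]
print(employees)  # ['Jane Smith', 'John Smith']
```

Step 1 runs once over the list and step 2 does a single lookup per row, which is the same shape as the UNIQUE/FILTER step followed by the XMATCH filter.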
