Forum Discussion

AmjadGov's avatar
AmjadGov
Copper Contributor
Dec 03, 2021
Solved

Remove Duplicates from query

I'm running the following Script to view USB file activity, however it produces filename duplicates, which I am having issues with removing, I have tried playing with the summarize arg_max() but had ...
  • MattBurrows's avatar
    Dec 08, 2021

    AmjadGov 

    I believe you may need to use MountTime in a Arg_max for this. Arg_max removes duplicates and you need to use a number time or date for this unfortunately.

    The Arg_max would work below, so it will show the last MountTime for that FileName going to a specific USB. You may see duplicate FileNames but thats because its being saved to multiple USB devices.

     

    then create a project with the fields you want and in what order.

     

    Snippet of code to replace your join in the code you pasted.

     

    FileCreation | lookup kind=inner (UsbDriveMount) on DeviceId
    | where FolderPath startswith DriveLetter
    | where Timestamp >= MountTime
    | summarize arg_max(MountTime, *) by FileName, tostring(ProductName), tostring(SerialNumber)
    | project Timestamp, DeviceId, DeviceName, InitiatingProcessAccountDomain,
    InitiatingProcessAccountName,InitiatingProcessAccountUpn,
    FileName, FolderPath, SHA256, SensitivityLabel, IsAzureInfoProtectionApplied, DriveLetter, MountTime, ProductName, SerialNumber, Manufacturer
    | order by DeviceId asc, Timestamp desc
     
    Hope this helps, as I know you wanted it without using MountTime.

Resources