Forum Discussion
Remove Duplicates from query
- Dec 08, 2021
I believe you may need to use MountTime in a Arg_max for this. Arg_max removes duplicates and you need to use a number time or date for this unfortunately.
The Arg_max would work below, so it will show the last MountTime for that FileName going to a specific USB. You may see duplicate FileNames but thats because its being saved to multiple USB devices.then create a project with the fields you want and in what order.
Snippet of code to replace your join in the code you pasted.
FileCreation | lookup kind=inner (UsbDriveMount) on DeviceId| where FolderPath startswith DriveLetter| where Timestamp >= MountTime| summarize arg_max(MountTime, *) by FileName, tostring(ProductName), tostring(SerialNumber)| project Timestamp, DeviceId, DeviceName, InitiatingProcessAccountDomain,InitiatingProcessAccountName,InitiatingProcessAccountUpn,FileName, FolderPath, SHA256, SensitivityLabel, IsAzureInfoProtectionApplied, DriveLetter, MountTime, ProductName, SerialNumber, Manufacturer| order by DeviceId asc, Timestamp descHope this helps, as I know you wanted it without using MountTime.
I believe you may need to use MountTime in a Arg_max for this. Arg_max removes duplicates and you need to use a number time or date for this unfortunately.
The Arg_max would work below, so it will show the last MountTime for that FileName going to a specific USB. You may see duplicate FileNames but thats because its being saved to multiple USB devices.
then create a project with the fields you want and in what order.
Snippet of code to replace your join in the code you pasted.
This is perfect and works a treat. Thank you so much.
The reason behind this was, due to the 10k limit in the CSV file, the duplicates were limiting my reporting. Your adjustment was the perfect solution, once again thank you.