Jun 02 2020 08:57 AM - edited Jun 02 2020 08:58 AM
Jun 02 2020 09:15 AM (Solution)
@Sanket26 Yes. See the following for an example:
Jun 03 2020 07:07 AM
I've just created a file for you to try that you can access on-the-fly using the KQL query:
externaldata (UserID:string, DomainName:string) [@"https://raw.githubusercontent.com/jjsantanna/test_csv/master/ioc.csv"] with (format="csv",ignoreFirstRecord=true)
This is the easiest way to access external data. You can also create a blob in Azure Storage and read from it. Besides CSV, you can read external plain text, JSON, and many other formats.
Does this answer your question?
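For the blob option mentioned above, a minimal sketch could look like the following. The storage account `mystorageaccount`, the container `mycontainer`, and the `<sas-token>` placeholder are hypothetical and must be replaced with your own values; the column names are reused from the CSV example. The `h` prefix marks the string as obfuscated, which is meant to keep the token out of query traces.

```kql
// Hypothetical blob URL: replace the account, container, and <sas-token> before running.
externaldata (UserID:string, DomainName:string)
[h@"https://mystorageaccount.blob.core.windows.net/mycontainer/ioc.csv?<sas-token>"]
with (format="csv", ignoreFirstRecord=true)
```

Keeping the file in your own storage account also means the feed content stays under your control rather than a third party's.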
Jun 03 2020 01:34 PM
The issue I am facing right now is that the page hosting the CSV isn't in the desired format: there are multiple header lines before the actual data. Downloading the file, fixing the format, and then uploading it to a blob isn't the best option for me either.
I am getting this error:
Partial query failure: Wrong number of fields (E_WRONG_NUMBER_OF_FIELDS). (message: 'Kusto::Csv::Parser<>.PrepareFields: CSV has an inconsistent number of fields per line: ', details: 'Offending record: 10 (start position in stream: 531), fieldsCount: 4, currentRecordFieldCount: 4, record: # ja3_md5,Firstseen,Lastseen,Listingreason
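The record quoted in the error starts with `#`, which suggests the file carries comment lines ahead of the data rows. One way to check this, sketched here on the assumption that the feed is reachable from the workspace, is to read the file as plain text and list only those lines. The URL is the feed shared further down in this thread, and the column name `Line` is arbitrary.

```kql
// Read every line as a single string instead of letting the CSV parser split it.
externaldata (Line:string)
[@"https://sslbl.abuse.ch/blacklist/ja3_fingerprints.csv"]
with (format="txt")
| where Line startswith "#"   // keep only the comment/header lines
```

If this returns several rows, `ignoreFirstRecord=true` alone cannot skip them, because that option drops only the first record.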
Jun 03 2020 01:39 PM
Jun 03 2020 01:56 PM
Here is the link: https://sslbl.abuse.ch/blacklist/ja3_fingerprints.csv
And yes, I was running this in Azure Sentinel.
Jun 03 2020 02:13 PM
There you go @Sanket26
externaldata (Everything:string) [@"https://sslbl.abuse.ch/blacklist/ja3_fingerprints.csv"] with (format="txt",ignoreFirstRecord=true) // reading each line as a string
| where Everything !startswith "#" // removing the lines that start with '#'
| project Everything=parse_csv(Everything) // parsing the string as csv
| project ja3_md5=Everything[0], Firstseen=Everything[1], Lastseen=Everything[2], Listingreason=Everything[3] // splitting the csv into columns
I've added some comments so you can see what each step does.
Let me know if this was helpful!
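A possible variation on the query above, offered as a sketch rather than a tested rule: `parse_csv` returns a dynamic array, so each element can be cast with `tostring()` to get plain string columns, and the whole expression can be wrapped in a `let` for reuse. The names `ja3_feed`, `Line`, and `Fields` are illustrative, and the four-column layout is assumed from the header quoted in the error message.

```kql
let ja3_feed =
    externaldata (Line:string)
    [@"https://sslbl.abuse.ch/blacklist/ja3_fingerprints.csv"]
    with (format="txt")
    | where isnotempty(Line) and Line !startswith "#"   // drop blank and comment lines
    | extend Fields = parse_csv(Line)                   // dynamic array of field values
    | project
        ja3_md5 = tostring(Fields[0]),
        Firstseen = tostring(Fields[1]),
        Lastseen = tostring(Fields[2]),
        Listingreason = tostring(Fields[3]);
ja3_feed
| take 10
```

String-typed columns tend to be easier to compare against other tables than dynamic values.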
Jun 03 2020 02:40 PM
Thanks a lot. It really helped. The issue is resolved. I am now able to fetch data directly from the HTTP page. The part I was missing was parsing the CSV; without it I wasn't getting the expected schema.
Jun 03 2020 02:51 PM
@Sanket26 Maybe it's just me... but we are talking about security, right? While you *can* access data over https/http/remote_locations, is that really a best practice? The link I provided earlier was meant to ensure that your blacklist/whitelist information is stored within your own tenant/network. I suspect that, if you have Analytics Rules enabled, the URL you shared may show up as an entity in an investigation. :)
Jun 03 2020 03:05 PM
I totally agree with your point. The https link I provided is just a sample to find out whether these feeds can be used directly in Kusto. Accessing that data over third-party https/http/remote_locations is definitely not a security best practice. We will upload these feeds to our internal websites and access them from there.
Let me know if this addresses your concern.
Jun 03 2020 03:09 PM
Aug 20 2020 08:49 AM
Great content and examples! Thank you for taking the time to share. Do you know if you can have a dynamic StorageConnectionString rather than a static one (@"https://storageaccount.blob.core.windows.net/storagecontainer/users.txt")? Would like to query something like "https://source/json/.../173.0.x.x" @jjsantanna