Aug 04 2022 05:34 AM
Dear Team,
I have a question about Threat Intelligence searches: I want to standardize free-form URLs into the specific format subdomain.domain.topleveldomain.
Sample URL:
Ideally, KQL would evaluate those URLs and trim them to the following output:
I have used the parse_url function, which is useful, but I have not been able to trim the URL to the desired format.
Thank you!
Sep 26 2022 05:28 AM
Solution
// TLD extraction (the regex captures the last two labels, e.g. example.com)
| extend TLD = extract(@'[a-zA-Z]{1,}\.[a-zA-Z]{2,}$',0,DNS_domain)
// Convert the entire domain into an array
| extend f = split(DNS_domain,'.')
// Format the subdomain part (everything before the last two labels)
| extend z = array_slice(f,0,-3)
// Concatenate the manipulated subdomain parts
| extend subdomain = strcat_array(z,".")
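For anyone who wants to try the steps above in isolation, here is a minimal, self-contained sketch. The datatable and its sample URLs are made up for illustration, and the sketch assumes the input is a full URL with a scheme, so that parse_url can supply the host name that the steps call DNS_domain.

```kql
// Hypothetical sample data, not taken from the original question.
datatable (url:string) ["https://www.example.com/path?x=1", "http://a.b.example.org:8080/"]
// parse_url returns a dynamic bag; Host holds the host name without the port
| extend DNS_domain = tostring(parse_url(url).Host)
// Last two labels of the host name
| extend TLD = extract(@'[a-zA-Z]{1,}\.[a-zA-Z]{2,}$', 0, DNS_domain)
// Split into labels and keep everything before the last two
| extend f = split(DNS_domain, '.')
| extend z = array_slice(f, 0, -3)
| extend subdomain = strcat_array(z, ".")
| project url, DNS_domain, subdomain, TLD
```

Note that the regex only allows letters, so host names containing digits or hyphens in the last two labels may need a wider character class.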
Feb 22 2024 06:29 AM
Hello,
For a more generic approach that also extracts sub-subdomains at any depth of the hierarchy, you can use the following function:
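let extract_domain = (url_or_domain:string, lowest_level:int = 3) {
    // Grab the first dotted host name; the hyphen sits last in the class so it is read as a literal.
    // No leading ":?" here, so a colon in front of the host can never end up in the match.
    let url_parts = split(extract(@"([A-Za-z0-9-]+\.)+([A-Za-z0-9-]+)", 0, url_or_domain), ".");
    let parts_count = array_length(url_parts);
    // Keep at most lowest_level labels, counted from the right
    let relevant_parts = iff(lowest_level > parts_count,
        array_slice(url_parts, -parts_count, -1),
        array_slice(url_parts, -lowest_level, -1)
    );
    strcat_array(relevant_parts, ".")
};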
datatable (url:string)["login.ezproxy.uni.simple.me","https://submit.owa.something.gov.eu","sample.me","sample.me_","sample.me/"]
| extend parsed = extract_domain(url,lowest_level=4)
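The function above relies on array_slice accepting negative indices, which count from the end of the array, and on its end index being inclusive. A small sketch of just that step, using one of the sample host names from the datatable (the column names here are only for illustration):

```kql
// Keep the last 3 labels of a 5-label host name.
print labels = split("login.ezproxy.uni.simple.me", ".")
// -3 is the third element from the end, -1 is the last element (inclusive)
| extend last3 = array_slice(labels, -3, -1)
// Join the kept labels back together; this should give uni.simple.me
| extend joined = strcat_array(last3, ".")
```

This is the same slice the function performs with its default lowest_level of 3.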