Forum Discussion
download data from web
- Dec 17, 2024
Hi manny213,
The safest, most reliable approach would be to retrieve the data using the API supplied by FINRA:
- Home | FINRA API Developer Center
- Documentation | FINRA API Developer Center
- Catalog of Datasets | FINRA API Developer Center
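As a rough sketch of the API route, the snippet below only builds a request URI for a dataset and leaves the actual call commented out. The base URL `https://api.finra.org/data`, the `group/.../name/...` path layout, the `otcMarket` group, the `consolidatedShortInterest` dataset name and the `limit` query parameter are all assumptions here, and `Get-FinraDatasetUri` is a hypothetical helper name. Please check each of them against the Documentation and Catalog of Datasets pages above; some datasets may also need an access token.

```powershell
# Hypothetical helper: builds a dataset URI.
# The base URL and path layout are assumptions; confirm them in the FINRA API documentation.
function Get-FinraDatasetUri {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)] [string] $Group,
        [Parameter(Mandatory)] [string] $Name,
        [int] $Limit = 100,
        [string] $BaseUri = 'https://api.finra.org/data'
    )

    # Escape each path segment so unexpected characters cannot break the URI.
    $g = [uri]::EscapeDataString($Group)
    $n = [uri]::EscapeDataString($Name)

    # The format operator avoids "$n?limit" being read as a variable named 'n?limit'.
    '{0}/group/{1}/name/{2}?limit={3}' -f $BaseUri, $g, $n, $Limit
}

# Assumed group and dataset names; look up the real ones in the catalog.
$apiUri = Get-FinraDatasetUri -Group 'otcMarket' -Name 'consolidatedShortInterest' -Limit 10

# Uncomment once the URI and any authentication requirements are confirmed:
# Invoke-RestMethod -Method Get -Uri $apiUri -Headers @{ Accept = 'application/json' } -ErrorAction Stop
```

Keeping the URI construction separate from the web call makes it easy to inspect the request before sending anything.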
I've included an alternative hack below for fetching the files you mentioned. It works today, but there are many reasons it could fail in the future (for example, the page's AJAX parameters or response layout changing), so I don't recommend this approach.
Get-FinraFiles.ps1
[cmdletbinding()]
param()
# Request URI.
$uri = 'https://www.finra.org/views/ajax?_wrapper_format=drupal_ajax&custom_month%5Bmonth%5D=any&custom_year%5Byear%5D=any'+
    '&view_name=transparency_services&view_display_id=equity_short_interest_biweekly&view_args=&view_path=%2Fnode%2F336166'+
    '&view_base_path=&view_dom_id=6df270b0453b3e8d5fc32607f5a986f1d230dcb33864626710bc37237fa9aec2&pager_element=0&_drupal_ajax=1'+
    '&ajax_page_state%5Btheme%5D=finra_bootstrap_sass&ajax_page_state%5Btheme_token%5D='+
    '&ajax_page_state%5Blibraries%5D=addtoany%2Faddtoany.front%2Cbetter_exposed_filters%2Fauto_submit%2Cbetter_exposed_filters'+
    '%2Fgeneral%2Cblazy%2Fbio.ajax%2Cbootstrap_barrio%2Fbreadcrumb%2Cbootstrap_barrio%2Fform%2Cbootstrap_barrio%2Fgesta_opensans'+
    '%2Cbootstrap_barrio%2Fglobal-styling%2Cchosen%2Fdrupal.chosen%2Cchosen_lib%2Fchosen.css%2Cfinra_bootstrap_sass'+
    '%2Fapp-dynamic-reporting%2Cfinra_bootstrap_sass%2Fback-button-handler%2Cfinra_bootstrap_sass%2Fcookie-classification'+
    '%2Cfinra_bootstrap_sass%2Fgesta%2Cfinra_bootstrap_sass%2Fglobal-styling%2Cfinra_bootstrap_sass%2FglossaryViews'+
    '%2Cfinra_bootstrap_sass%2Fopensans%2Cfontawesome%2Ffontawesome.webfonts%2Cfontawesome%2Ffontawesome.webfonts.shim'+
    '%2Cparagraphs%2Fdrupal.paragraphs.unpublished%2Csuperfish%2Fsuperfish%2Csuperfish%2Fsuperfish_hoverintent%2Csuperfish'+
    '%2Fsuperfish_supersubs%2Csuperfish%2Fsuperfish_supposition%2Csystem%2Fbase%2Cviews%2Fviews.ajax%2Cviews%2Fviews.module';
# Destination folder. It must already exist, since the BITS transfer will not create it.
$destination = "D:\Data\Temp\Forum\finra";
# Invoke the web call.
$data = Invoke-RestMethod -Method Get -Uri $uri -UseBasicParsing -ErrorAction:Stop;
#region Use BITS to download the files.
$bitsFiles = [regex]::Matches($data[2].data, "https.*\.csv", [System.Text.RegularExpressions.RegexOptions]::IgnoreCase).Value | ForEach-Object {
    $parts = $_.Split("/");
    $filename = $parts[$parts.Length - 1];
    [PSCustomObject] @{
        Source = $_;
        Destination = "$destination\$filename";
    }
};
$bitsJobName = "forumExample";
$bitsFiles | Start-BitsTransfer -DisplayName $bitsJobName -ErrorAction:Stop;
Get-BitsTransfer -Name $bitsJobName | Remove-BitsTransfer;
#endregion
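One of the ways the script above could break is the greedy `https.*\.csv` pattern: if the returned markup ever placed two links on the same line, a single match would span both of them. A minimal sketch of a stricter extraction follows. The function name `Get-CsvLinks` and the sample markup are made up for illustration, and the real response may be shaped differently.

```powershell
# Hypothetical helper: pulls distinct .csv links out of a block of HTML.
function Get-CsvLinks {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)] [string] $Html
    )

    # Lazy match from "https" to the first ".csv"; quotes, whitespace and
    # angle brackets are excluded so a match cannot run past one attribute.
    $pattern = 'https[^"''\s<>]+?\.csv'
    $options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase

    [regex]::Matches($Html, $pattern, $options) |
        ForEach-Object { $_.Value } |
        Select-Object -Unique
}

# Made-up sample with two links on one line and one repeated link.
$sample = '<a href="https://example.com/f/a.csv">A</a> <a href="https://example.com/f/b.csv">B</a> <a href="https://example.com/f/a.csv">A</a>'
$links = @(Get-CsvLinks -Html $sample)
$links
```

In the script, `$data[2].data` could be passed as `-Html` and the resulting links fed into the same `ForEach-Object` block that builds the Source and Destination pairs.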
Cheers,
Lain
- manny213, Dec 17, 2024, Brass Contributor
Thank you, LainRobertson.
I didn't know about that API. I will take a look. I agree that the API is a better approach than scraping.
I ran the script and it downloaded the files. Thank you!!