Scroll Down WebPage until End
Hi, Yeu.
You can't "scroll down" in a native PowerShell script, but this doesn't mean you need to resort to a tool like Selenium, either.
The FBI has its own REST API, which is sufficient for what you're trying to do (noting you have a second thread going on this topic). However, the cyber category isn't individually searchable; it seems cyber records can only be obtained by querying the default category.
Anyhow, here's an example script, to which you can add further filtering if you so desire.
Note: You will need to install the ThreadJob PowerShell module for this example, as noted in the comments at the top of the script.
Example
# This script leverages the FBI REST API described at:
#   https://api.fbi.gov/docs#!/Wanted/get_wanted
# This script also relies on the ThreadJob module (for downloading files in parallel) being installed from the official PSGallery repository:
#   Install-Module -Name ThreadJob -Scope AllUsers;

$SaveLocation = "D:\Data\Temp\Forum\fbi\";

# Do not set PageSize too high as this also relates to how many concurrent downloads will be kicked off.
$PageSize = 20;
$Page = 1;
$UriBase = "https://api.fbi.gov/@wanted";
$Category = "default";

# Create the save location if it does not exist yet; otherwise every download would fail.
New-Item -ItemType Directory -Path $SaveLocation -Force | Out-Null;

while (0 -ne ($Response = Invoke-RestMethod -Method Get -Uri "$UriBase`?poster_classification=$Category&pageSize=$PageSize&page=$Page" -ContentType "application/json" -UseBasicParsing -ErrorAction:Stop).total)
{
    # Safeguard: stop if a page comes back without any items, in case "total" is still non-zero past the last page.
    if (-not $Response.Items)
    {
        break;
    }

    $Downloads = @();
    $Job = 0;

    try
    {
        # Only the English PDF of each record is downloaded.
        foreach ($Url in ($Response.Items.Files | Where-Object { ($_.Name -eq "English") -and ($_.url.EndsWith(".pdf")) }).url)
        {
            # Build the file name from the two URL segments preceding the trailing file name.
            $FileName = "$SaveLocation$(($Parts = $Url.Split("/"))[-3])_$($Parts[-2]).pdf";

            $Downloads += Start-ThreadJob -Name "fbi$($Job.ToString('X3'))" -ScriptBlock {
                Invoke-WebRequest -Uri $using:Url -OutFile $using:FileName -ErrorAction:Stop;
            };

            $Job++;
        }

        # Wait-Job rejects an empty job list, so skip the clean-up when the page had no English PDFs.
        if ($Downloads.Count -gt 0)
        {
            Wait-Job -Job $Downloads | Out-Null;
            Receive-Job -Job $Downloads -ErrorAction:Continue | Out-Null;

            # Remove-Job also disposes of the job objects, so no separate Dispose() call is needed.
            Remove-Job -Job $Downloads -ErrorAction:SilentlyContinue -Force;
        }
    }
    catch
    {
        throw;
    }
    finally
    {
        $Page++;
    }
}
Edited: To remove the two-page limit I'd been using for testing (as there are nearly 1,000 records in the default category).
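On the further filtering mentioned earlier: if you only want the cyber entries, one option is to request the default category and filter each page client-side. The following is a minimal sketch, not captured API output. The sample data is made up, Select-WantedBySubject is a hypothetical helper, and it assumes each item carries a "subjects" array containing a value such as "Cyber's Most Wanted".

```powershell
# Hypothetical sample shaped like one page of the API's response.
# The titles and "subjects" values are illustrative assumptions.
$Response = [pscustomobject]@{
    items = @(
        [pscustomobject]@{ title = 'Person A'; subjects = @("Cyber's Most Wanted") },
        [pscustomobject]@{ title = 'Person B'; subjects = @('Kidnappings and Missing Persons') },
        [pscustomobject]@{ title = 'Person C'; subjects = @("Cyber's Most Wanted", 'Counterintelligence') }
    )
};

# Hypothetical helper: keep only the items tagged with the given subject.
function Select-WantedBySubject
{
    param(
        [Parameter(Mandatory)] [object[]] $Items,
        [Parameter(Mandatory)] [string] $Subject
    )

    $Items | Where-Object { $_.subjects -contains $Subject };
}

$CyberItems = @(Select-WantedBySubject -Items $Response.items -Subject "Cyber's Most Wanted");
$CyberItems.title;
```

In the script above, the same Where-Object test could be applied to $Response.Items before its Files property is read.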
Cheers,
Lain
- LainRobertson Sep 08, 2023 Silver Contributor
The command for installing the ThreadJob module (which is a first-party Microsoft module) can be seen in the comments at the top of the script.
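If you are unsure whether the module is already present, a quick check along these lines may help. This is only a sketch: Test-ModuleAvailable is a hypothetical helper name, and it prints a hint rather than installing anything. As far as I know, PowerShell 7 normally ships with ThreadJob already, so the install step mostly matters on Windows PowerShell 5.1.

```powershell
# Hypothetical helper: returns $true when a module with the given name can be found on this machine.
function Test-ModuleAvailable
{
    param(
        [Parameter(Mandatory)] [string] $Name
    )

    return [bool](Get-Module -ListAvailable -Name $Name);
}

if (-not (Test-ModuleAvailable -Name 'ThreadJob'))
{
    # The CurrentUser scope does not need an elevated session; the AllUsers scope used in the script's comment does.
    Write-Warning 'ThreadJob was not found. Install it with: Install-Module -Name ThreadJob -Scope CurrentUser';
}
```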
I cannot help with Selenium at all as I do not work with it.
Cheers,
Lain
- YeuHarng Sep 10, 2023 Brass Contributor
So the code you wrote above can download all the files from the FBI link? I ask because my job is to download all the PDF files from that link.
- LainRobertson Sep 11, 2023 Silver Contributor
The script downloads the English PDF (some people have PDFs attached in more than one language, such as Spanish) for each person in the FBI's "wanted" list.
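To make the selection concrete, here is a small offline sketch of the same test the script applies to each record's file list. The record and its example.com URLs are made up for illustration; only the "English" name check, the ".pdf" suffix check and the file-name construction mirror the script.

```powershell
# Hypothetical record shaped like one entry of the response's "items" array.
$Item = [pscustomobject]@{
    files = @(
        [pscustomobject]@{ name = 'English'; url = 'https://www.example.com/wanted/cyber/example-person/download.pdf' },
        [pscustomobject]@{ name = 'English'; url = 'https://www.example.com/wanted/cyber/example-person/photo.jpg' },
        [pscustomobject]@{ name = 'Spanish'; url = 'https://www.example.com/wanted/cyber/example-person/spanish/download.pdf' }
    )
};

# Keep only the files that are labelled "English" and whose URL ends in ".pdf".
$EnglishUrls = @(
    $Item.files |
        Where-Object { ($_.name -eq 'English') -and ($_.url.EndsWith('.pdf')) } |
        Select-Object -ExpandProperty url
);

# Derive the local file name from the two URL segments before the trailing file name.
$FileNames = @(
    foreach ($Url in $EnglishUrls)
    {
        $Parts = $Url.Split('/');
        "$($Parts[-3])_$($Parts[-2]).pdf";
    }
);

$FileNames;
```

For this sample, the only file selected is the first one, and it would be saved as cyber_example-person.pdf.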
Cheers,
Lain
- YeuHarng Sep 08, 2023 Brass Contributor
Hi, can I ask how to import Selenium? I have many issues with the Selenium module.