Forum Discussion

dwatters's avatar
dwatters
Copper Contributor
Jan 16, 2024

web scraping help

i have an old macro that i use alot to pull information off of multiple sites. it was originally setup to use IE but now that site no longer loads correctly in IE. im not really sure how to go about moving it to another browser due to what i have been reading about other browsers not supporting the DOM. im pretty new to writing macros and the original maker is long gone. this is a part of the code

'Gets the info from AM Console
Function ImportAmConsoleData(url) As String()

'Open and hide Internet Explorer
Dim appIE As InternetExplorerMedium
Dim objElement As HTMLDocument
Set appIE = New InternetExplorerMedium
With appIE
.Navigate url
.Visible = True
End With

Do
DoEvents
'If there was an error, the website was not done loading. return here on error and try again
On Error Resume Next

Set objElement = appIE.Document
Dim bodyText As String
bodyText = objElement.body.innerText

'Give AM console a bit of time before checking again
Sleep 50

'check to see if document as loaded completly with all required info.
'cannot check ready state, I believe this is because AM Console was written with Angular and will
'set the ready state to complete even though it is still waiting for info from the server.
hasName = InStr(bodyText, FIRST_NAME_SEARH) > 0
hasID = InStr(bodyText, ID_SEARCH) > 0
hasPhone = InStr(bodyText, PHONE_SEARCH) > 0
hasEmail = InStr(bodyText, EMAIL_SEARCH) > 0
hasDTAndBGC = InStr(bodyText, DRUG_TEST_SEARCH) > 0 And InStr(bodyText, BGC_COMPLETE_SEARCH) > 0
hasEverything = hasName And hasID And hasPhone And hasEmail And hasDTAndBGC


'add a dot to the elipsis to show the program is running
UpdateRunning

On Error GoTo 0
Loop Until hasEverything

'reset and close IE
appIE.Quit
Set appIE = Nothing

Dim dataArray(0 To 6) As String
dataArray(0) = findInfo(bodyText, FIRST_NAME_SEARH)
dataArray(1) = findInfo(bodyText, LAST_NAME_SEARCH)
dataArray(2) = findInfo(bodyText, ID_SEARCH)
dataArray(4) = findInfo(bodyText, PHONE_SEARCH)
dataArray(5) = findInfo(bodyText, EMAIL_SEARCH)

'start looking where onboarding tasks are located as to not pull from the dictionary
OnboardingStart = InStr(bodyText, ONBOARDING_SEARCH)
dataArray(6) = findInfo(bodyText, BGC_COMPLETE_SEARCH, OnboardingStart, 2) & "/" & findInfo(bodyText, DRUG_TEST_SEARCH, OnboardingStart, 2)

ImportAmConsoleData = dataArray

End Function

  • peiyezhu's avatar
    peiyezhu
    Bronze Contributor
    I gurss python selenium suitable to Chrome webkit.
    Xmlhttp send request directly.

Share