Forum Discussion
Get Data from Web
- May 11, 2017
Hi Sachin,
I am not an expert in websites. However, as far as I have heard, if 'Document' is the only item you see in the 'Navigator' menu when connecting to a website (so no other tables are visible), the website is likely using JavaScript to build its content. That makes it highly unlikely that you will be able to extract any useful information from it this way. Apparently, Microsoft is working on this issue.
Yury
Yury, in general that's possible. In the code you have, use just Web.Contents(...) instead of Web.Page(Web.Contents(...)), and after that parse the returned HTML more or less manually. I did that to extract some information from this Tech Comm site.
One example of this approach is http://datachant.com/2017/05/08/web-scraping-power-bi-part-2/, the latest I've seen, but Gil has published more on his site.
However, I'm not sure that's practical for this particular case.
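For what it's worth, here is a rough sketch of that swap. It is not the code from the linked article. The URL and step names are placeholders, and UTF-8 encoding is an assumption about the page.

```powerquery
let
    // Placeholder URL (hypothetical): replace with the page you are trying to read
    Url = "https://example.com/page",
    // Web.Contents returns the raw response as binary; Web.Page is not called at all
    Raw = Web.Contents(Url),
    // Assumption: the page is UTF-8 encoded
    Html = Text.FromBinary(Raw, TextEncoding.Utf8),
    // Break the HTML text into a list with one item per line
    HtmlLines = Lines.FromText(Html),
    // One-column table, ready for manual parsing with the usual transformations
    AsTable = Table.FromList(HtmlLines, Splitter.SplitByNothing(), {"Html"})
in
    AsTable
```

From there everything is ordinary text work (filter, split, extract), so it does not depend on the Navigator finding any tables.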
- Yury Tokarev, May 12, 2017 (Iron Contributor)
That's a good point. Thanks, Sergei. I can see that with a bit of effort we can extract the required info using Power Query transformations. We can start by filtering rows that start with "span id=" (in the extracted HTML text we get by following the initial steps outlined in the datachant.com article Sergei linked), then locate the field labels (e.g. lbl_companyname for Company), and pull the data out from there.
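Something along these lines, as a minimal sketch only. The HTML fragment is made up so the query runs without a web call, lbl_city is a hypothetical second label, and splitting on "<" is my assumption about how the text ends up with rows starting with "span id=".

```powerquery
let
    // Made-up HTML standing in for the page text (lbl_city is hypothetical)
    Html = "<div><span id=""lbl_companyname"">Contoso Ltd</span><span id=""lbl_city"">Leeds</span></div>",
    // Assumption: split on "<" so that every tag starts its own row
    Fragments = Text.Split(Html, "<"),
    AsTable = Table.FromList(Fragments, Splitter.SplitByNothing(), {"Fragment"}),
    // Keep only the rows that start with: span id=
    Spans = Table.SelectRows(AsTable, each Text.StartsWith([Fragment], "span id=")),
    // The field label is the text between the first pair of double quotes
    WithLabel = Table.AddColumn(Spans, "Label", each Text.BetweenDelimiters([Fragment], """", """"), type text),
    // The field value is whatever follows the closing ">" of the span tag
    WithValue = Table.AddColumn(WithLabel, "Value", each Text.AfterDelimiter([Fragment], ">"), type text),
    Result = Table.SelectColumns(WithValue, {"Label", "Value"})
in
    Result
```

With the sample text this leaves two rows, one per span, with the label in one column and the value in the other. A real page will likely need extra cleanup, for example spans with more attributes or values containing nested tags.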
Sachin, hope you find it helpful
Cheers
Yury