Getting Data from Multiple Web Pages -- Replicate changes made using Query Editor

Copper Contributor

I am getting data from Multiple Web pages into multiple excel worksheets. Each Web page has the same format, and the same changes need to be made to each worksheet using the Query Editor before I load the data into Excel.

 

Is there a way I can replicate the changes made to the first sheet, onto the 2nd sheet, and so on ?

 

There is an option to duplicate the in the Query editor, when I used the same, and duplicated one sheet, and changed the Data source, to be able to apply the same changes to the data from the 2nd source, it pulled the raw data, and the changes which I applied to the first sheet, were not replicated to the 2nd sheet.

 

Any suggestions as to how I can achieve this ?

 

End of the day, I want all the data from multiple web pages into one sheet, for which the append option works, but appends the data only from existing sheets, not from the web, so even that does not solve my issue.

 

 

13 Replies

Hi Gurpreet,

 

You may create the query for one web page, after that duplicate it from the menu (right click on the query) and use for another web page. Connection string to web page could be hardcoded or defined as parameter. Or you may create the function based on such query with URL as parameter.

 

And one more query on the top which appends all results and loads into the summary sheet.

"duplicate it from the menu (right click on the query) and use for another web page."

I cannot find any such option. I am using Excel 2016 on Office 365.

That's in Queries&Connections pane in Excel sheet or in Queries pane in Query editor

Duplicate02.JPGDuplicate1.JPG

I already tried duplicating, but as I said earlier -->

"There is an option to duplicate the in the Query editor, when I used the same, and duplicated one sheet, and changed the Data source, to be able to apply the same changes to the data from the 2nd source, it pulled the raw data, and the changes which I applied to the first sheet, were not replicated to the 2nd sheet."

If I understood you correctly you'd like to sync changes, i.e. if you changes something in Query1 it shall be reflected in Query2. If so the function is required which you'll be use in both queries. Duplication is just kind of copy/paste, duplicated query lives independently of the source. 

Thanks, it finally worked. I had to get into Data - Queries & Connections - Select the specific Table - Edit - Advanced Editor and then change the Source. 

However, with 30+ worksheets, the computer, started giving Out of Memory errors, had to upgrade Excel from 32 Bit to 64 Bit, now the Out of Memory errors are not coming, but the computer has become very slow. 

 


I have 100 more sheets, to pull, wondering what to do, does not seem excel can support so many worksheets.

 

Should I make multiple excel files, but then how do I combine all the data, as I am doing now, using Append.

Hi, I have the similar issue. Hope to seek for experts help.

 

my scenario is loading date from different page, e.g. ?id=0001, 0002, 0003....

Hi,

 

If pages are similar you may generate the query for one page, convert it into the function with ID as parameter and after that combine all together as needed.

Hi Sergei,

thank you for your reply but i don't have much knowledge on this part.

this is the page i would extract. http://www.cic.hk/common/srs/subcontractorDetail.aspx?id=R000001

i hope to capture content from id=R000001 to R0014500. Would you mind to show me the steps setting up?

thank you.

The function itself is simple, it just extract information for given ID

(pID as text) =>
let
    Source = Web.Page(Web.Contents(pURL,
        [Query = [id=pID]])),
    Data0 = Source{0}[Data]
in
    Data0

Question is what to do with that. As an example in the query I generated list of ID:s from 1 to 100, picked-up related info with function and expanded everything in one table

let
    Numbers = {1..100},  // ID:s to receive
    RegNumbers = List.Transform(Numbers, each "R" & Text.Start("000000",6-Text.Length(Text.From(_))) & Text.From(_)),
    ListToTable = Table.FromList(RegNumbers, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    RenameToId = Table.RenameColumns(ListToTable,{{"Column1", "ID"}}),
    AddRegInfoTables = Table.AddColumn(RenameToId, "Custom", each fnRegistrationParticulars([ID])),
    ExtractRegNumbers = Table.AddColumn(AddRegInfoTables, "Custom.1", each List.First([Custom][Column2])),
    FilterEmpty = Table.SelectRows(ExtractRegNumbers, each ([Custom.1] <> "")),
    ExpandInfo = Table.ExpandTableColumn(FilterEmpty, "Custom", {"Column1", "Column2"}, {"Column1", "Column2"}),
    KeepInfo = Table.SelectColumns(ExpandInfo,{"Column1", "Column2"})
in
    KeepInfo

Even with hundred ID:s it takes time - each individual request takes few seconds. Other option could be request them one by one if only few are required.

Please check attached.

Dear Sergei,

 

Much pleasure for your kind help. May I have help if how to load the format like attached?

 

- each record (id) start in new row

 

Thank you.

Hi,

 

When make transpose within the function, please see attached

Hi,

 

That would be my thoughts. Thank you so much for your support. Hope I would get it learnt.