Forum Discussion

ttc526's avatar
ttc526
Copper Contributor
Mar 17, 2024

data form huge number of worksheet

Hi , 

 

I have this excel document which was exported from a PDF, so the format was not exact great. 
it contains around 800 worksheet.  In each worksheet it contains similar information which I need to extract ( a name & a location). 

the problem is that sometimes this information is in A2 cell, sometime in A3, sometime A2-E5 merged cell....(the format is pretty inconsistent due to export from PDF)

 

The know fact is that on each sheet there is a constant word next to the information i want.
e.g
Location   ABC
Reference 123

 

with "Location" & "Reference" as the constant word and "ABC" & "123" as the information i needed

any idea how i can extract these information on every sheet and put them in a list. 

 

Thank you  

  • peiyezhu's avatar
    peiyezhu
    Bronze Contributor

    ttc526 

     

    SandeepMarwal:

    Thanks for share the attachment.

     

    sql:

    create temp table aa as 

    select f01,f02,f03||f04||f05||f06||f07||f08||f09||f10||f11||f12||f13||f14||f15||f16||f17||f18||f19||f20||f21||f22||f23||f24||f25 other from consolidateSheet;

    select f01,f02,max(regexp2('Location\s+(\w+)',other,1)) Location,max(regexp2('Reference\s+(\w+)',other,1)) Reference,* from aa group by f02;

     

     

    f01 f02 Location Reference other

    Book1 (2).xlsxSheet1ABC123Reference 123----------------------
    Book1 (2).xlsxSheet10  Location   -------------------------
    Book1 (2).xlsxSheet2DEF456Reference 456----------------------
    Book1 (2).xlsxSheet3PQR000Reference 000----------------------
    Book1 (2).xlsxSheet4GGG007Location GGG Reference 007----------------------
    Book1 (2).xlsxSheet5SOS111Location SOS Reference 111----------------------
    Book1 (2).xlsxSheet6RRR555Location RRR Reference 555----------------------
    Book1 (2).xlsxSheet9XXX Location XXX Reference ###----------------------

Share