data form huge number of worksheet

Copper Contributor

Hi , 

 

I have this excel document which was exported from a PDF, so the format was not exact great. 
it contains around 800 worksheet.  In each worksheet it contains similar information which I need to extract ( a name & a location). 

the problem is that sometimes this information is in A2 cell, sometime in A3, sometime A2-E5 merged cell....(the format is pretty inconsistent due to export from PDF)

 

The know fact is that on each sheet there is a constant word next to the information i want.
e.g
Location   ABC
Reference 123

 

with "Location" & "Reference" as the constant word and "ABC" & "123" as the information i needed

any idea how i can extract these information on every sheet and put them in a list. 

 

Thank you  

2 Replies

@ttc526 

I created a workbook as per your inputs. refer book1.

then created a power query to find desired output as shown in book2.

check it out if it fulfils your requirement.

@ttc526 

 

SandeepMarwal:

Thanks for share the attachment.

 

sql:

create temp table aa as 

select f01,f02,f03||f04||f05||f06||f07||f08||f09||f10||f11||f12||f13||f14||f15||f16||f17||f18||f19||f20||f21||f22||f23||f24||f25 other from consolidateSheet;

select f01,f02,max(regexp2('Location\s+(\w+)',other,1)) Location,max(regexp2('Reference\s+(\w+)',other,1)) Reference,* from aa group by f02;

 

 

f01 f02 Location Reference other

Book1 (2).xlsxSheet1ABC123Reference 123----------------------
Book1 (2).xlsxSheet10  Location   -------------------------
Book1 (2).xlsxSheet2DEF456Reference 456----------------------
Book1 (2).xlsxSheet3PQR000Reference 000----------------------
Book1 (2).xlsxSheet4GGG007Location GGG Reference 007----------------------
Book1 (2).xlsxSheet5SOS111Location SOS Reference 111----------------------
Book1 (2).xlsxSheet6RRR555Location RRR Reference 555----------------------
Book1 (2).xlsxSheet9XXX Location XXX Reference ###----------------------