Mar 16 2024 07:36 PM
Hi ,
I have this excel document which was exported from a PDF, so the format was not exact great.
it contains around 800 worksheet. In each worksheet it contains similar information which I need to extract ( a name & a location).
the problem is that sometimes this information is in A2 cell, sometime in A3, sometime A2-E5 merged cell....(the format is pretty inconsistent due to export from PDF)
The know fact is that on each sheet there is a constant word next to the information i want.
e.g
Location ABC
Reference 123
with "Location" & "Reference" as the constant word and "ABC" & "123" as the information i needed
any idea how i can extract these information on every sheet and put them in a list.
Thank you
Mar 17 2024 08:49 AM
I created a workbook as per your inputs. refer book1.
then created a power query to find desired output as shown in book2.
check it out if it fulfils your requirement.
Mar 17 2024 07:14 PM
SandeepMarwal:
Thanks for share the attachment.
sql:
create temp table aa as
select f01,f02,f03||f04||f05||f06||f07||f08||f09||f10||f11||f12||f13||f14||f15||f16||f17||f18||f19||f20||f21||f22||f23||f24||f25 other from consolidateSheet;
select f01,f02,max(regexp2('Location\s+(\w+)',other,1)) Location,max(regexp2('Reference\s+(\w+)',other,1)) Reference,* from aa group by f02;
f01 f02 Location Reference other
Book1 (2).xlsx | Sheet1 | ABC | 123 | Reference 123---------------------- |
Book1 (2).xlsx | Sheet10 | Location ------------------------- | ||
Book1 (2).xlsx | Sheet2 | DEF | 456 | Reference 456---------------------- |
Book1 (2).xlsx | Sheet3 | PQR | 000 | Reference 000---------------------- |
Book1 (2).xlsx | Sheet4 | GGG | 007 | Location GGG Reference 007---------------------- |
Book1 (2).xlsx | Sheet5 | SOS | 111 | Location SOS Reference 111---------------------- |
Book1 (2).xlsx | Sheet6 | RRR | 555 | Location RRR Reference 555---------------------- |
Book1 (2).xlsx | Sheet9 | XXX | Location XXX Reference ###---------------------- |