Automating Data Collection Across Non-Sequentially-Named Sheets and Workbooks

Copper Contributor

I just had a project dropped in my lap that is a classic conflict between human-readable and machine-readable formats. 

I need to collect data across hundreds of sheets that were designed to be human-readable. So that means uniquely named sheets (DavidJackson, JohnDoe, MissyRivers, etc.) and multiple small tables within them. 

Thankfully, each sheet is formatted in exactly the same way.


Is there a way to build a summary table that can automate referencing certain cells (ex. =DavidJackson!B2, =JohnDoe!B2, =MissyRivers!B2) in order, without needing to track down and type each sheet name sequentially?

 

AnachronisticJam_1-1632327106502.png

 

5 Replies

@AnachronisticJam  so assuming a couple things these formulas might help you.

a) you need the sheet name to be added to each sheet.  so let's pretend cell A1 is available then add this formula there:

=LET(path,CELL("filename",A1),RIGHT(path,LEN(path)-SEARCH("]",path)))

b) you can then create the list of all sheet names using this formula which only requires you to enter the first and last sheet names and the cell from a) above:

=LET(sref,Sheet1:Sheet5!A1,TRIM(MID(TEXTJOIN(REPT(" ",100),,sref),SEQUENCE(COUNTA(sref),,1,100),100)))

 c) you can then create the remaining lookup values using INDIRECT() combining the sheet name from b) with the absolute cell reference you need

 

I just wanted to add a note with encouragement to modify altogether the way the workbook is structured. You're so right that it was created to be "user-friendly" and in the process actually interferes with letting Excel do what it excels at.
What you want, longer term, is at the most one sheet representing whatever it is in each "small table" simply adding a column to differentiate data re JohnDoe from that of MIssyRIvers, etc.

Even better, unless those tables are widely differing in their content, is to create a single table with all the data. Excel is really good at parsing apart a single database.

(This from a person who was director of the HR & Payroll database for a major corporation during my working career.)

@mathetes unfortunately, I didn't have any control over the design or collection stages of the project or this could have been avoided at the start. 

At this point going back and rebuilding everything is well outside of scope for the project, so I have to make the best of what I've got.

BTW in case you aren't aware, if you highlight multiple sheets (e.g. all the sheets) and then type a value or formula into a cell, that value or formula is placed in EVERYONE of the sheets highlighted. Basically to do step a) in my comment you can highlight all the sheets (i.e. click first sheet tab, then scroll to end and shift-click the last sheet tab) and paste that formula into whichever cell is 'available'.

To 'unselect' the range of sheets you can either select a previous unselected sheet or ctrl-click any sheet to unselect it and then click on that sheet.

Then in step b) just update that first part of the formula with the correct starting and ending sheet names and instead of A1 put which ever cell you pasted that formula from a) into.

@AnachronisticJam 

 

Let me add, then, in addition to the suggestions you've gotten from @mtarler , that you look into the INDIRECT function. I use that on one of my more ambitious workbooks (dealing with investing in various issues) where I use a formula like this

=INDIRECT($B5)&"!"&A$1) 

to refer to the sheet with the name found in B5  (column B contains a list of sheet names of other sheets in the workbook)

and A$1 refers to a specific cell in whatever sheet is named by B5.

 

That formula, with its absolute and relative references is used to populate a summary sheet built from individual sheets that, like yours, have common data in the same cells. In your case, this would require a one-time effort (perhaps a macro?) to create in the Summary sheet a column with each sheet's name.

 

I'm attaching a copy of a sample of that workbook. The "Current Summary" sheet is the one that uses INDIRECT

 

Here, too, is a good website for further info on using INDIRECT. https://exceljet.net/excel-functions/excel-indirect-function