Formula speed / Laptop specification

Copper Contributor

Hello

I am a regular user of the index match formula between spreadsheets containing over 400,000 records and this can take up to 30 minutes to complete.  On smaller spreadsheets the formula regularly takes 5 to 10 minutes.  I have tried a variety of ways to improve the speed:

 

working on my hard drive instead of my work network;

always pasting values;

turn on manual calculation;

look for ways of cutting down the data I need to analyse.

 

Our IT department tell me I have the maximum ram on my laptop which is a core i5 elitebook.  they loaned me a notebook, the highest spec they provide, but it made little to no difference so I didn't think it was worth the expense.  No one in IT seems to be able to direct me to a solution other than to tell me to use other software but then can't tell me what that is other than Tableau which just wouldn't work for me or my department.  On top of the Excel 13 I already had they have me Excel 2010 64 but this just causes issues and sometimes the formula doesn't recognise i'm working between 2 spreadsheets so I have to close and reopen.

 

I am not well up on laptop specifications so can't tell you exactly what spec the notebook was.  they even gave me a virtual desktop which made no difference at all.

 

The CPU on my laptop occasionally reaches 99% but even when it doesn't it is extremely slow during calculations.

 

I am currently working from home too but to be fair I had the same problem when I worked in the office and its been going on for 2 years.  We use citrix but am more than happy to work of the hard drive if the formula worked faster, which it currently doesn't. 

 

Is there anyone who can advise me on the spec of laptop that I would need to quickly carry out such a formula or am I destined to sit watching the % on my status bar crawl to 100% for the rest of my days?  Is this what other people have to deal with?

 

All advice/direction is welcomed.

 

Allison

18 Replies

@Al_Dyas You haven't really explained what you are indexing and matching. Comparing/matching such large data sets could perhaps be done with Power Query. It allows you to connect to the data without loading it into an Excel sheet. Not sure, though, if this could work you.

Exact match lookups can take a very long time with large spreadsheets. You might consider an alternate design, perhaps a database or using power query as suggested.

Or, if your data can be sorted on the column containing your key, then you might make use of the approximate match option. For example, say the table is in A2:B8 and the value I'm trying to match is in D2, then use an IF statement to return the value it actually matched to and confirm it is an exact match and, if so, return the value I want out of Column B. Approximate match lookups used in this fashion (binary lookup) are exponentially faster on large data sets.

=IF(INDEX(A2:A9,MATCH(D2,A2:A9,1))=D2,INDEX(B2:B9,MATCH(D2,A2:A9,1)),0)

Thank you @Riny_van_Eekelen.  I have no idea what a power query is so I will find out and try that.

 

A sample of the formula I use is this:

=INDEX('20201109 MOSL_MDS_SSPID.xlsx'!$I2:$I229079,MATCH(B2,'20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,0))

 

There are many different scenarios but I use the same formula structure for the majority.

 

From a 2016 file I am looking at a 2019 file to see how the data changed, or if it has been removed.  For example the MATCH is looking for the exact sewerage point id (SPID) reference numbers in both files.  The index returns the status of the SPID in the 2019 file.

 

So the index match formula answer 2 question for me, is the SPID still active and if it is what is the status.

 

In another scenario: I have carried out a process to batch update the status field of a large number of SPIDs overnight and the following day I run an index match against today's report and yesterday's report to check that the update worked.  So again I am looking to see if the status of the SPID has changed as it should have.

 

I hope this helps and makes sense.

Thank you@JMB17 I think I am starting to realise that I need to use 2 formula instead of 1.

 

I hadn't made in clear in my initial post that I am always working between 2 files.  First to see if the record exists in both files and second the return an exact bit of data from each record.  So i'm asking for 2 outcomes and i'm begging to think I should be using 2 different formulae?

 

I will definitely look at power queries, as I said in my last reply, I've never heard of these but will do some digging.

 

I work pretty much alone so no one to bounce ideas off so I am very grateful for the replies I have received.

 

Allison

 

 

I would use two columns to make the formulas shorter/easier to follow. Also, if you are returning multiples columns, you only have to test once (the other formulas can just reference the cell that performs the test).

If your table in 20201109 MOSL_MDS_SSPID.xlsx can be sorted in ascending order on Column A, then you could try:

Test for existence (let's say this formula is in cell C2, note the match option is 1 and not 0):
=INDEX('20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,MATCH(B2,'20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,1))=B2

Then test C2 and return the value from Column I, or return "Doesn't Exist" (again - note the match option is 1):
=IF(C2,INDEX('20201109 MOSL_MDS_SSPID.xlsx'!$I2:$I229079,MATCH(B2,'20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,1)),"Doesn't Exist")

The key is that the data in your table MUST be able to be sorted in ascending order on your key column (in this case, Column A). And, since we are using an approximate match, you must perform the lookup twice - first to confirm the key it matched to is the key you are trying lookup. Then, if the key matches, perform the lookup and return the value you actually want (which makes the approximate match lookup an exact match lookup). Because the approximate match lookup method is much more efficient, you can speed up the calculation in spite of the fact that you are performing the lookup twice.

There's a short explanation here, but you can find more information if you google 'excel binary lookup.'

https://www.powerusersoftwares.com/post/2016/12/20/make-your-vlookup-run-a-hundred-times-faster-with...

@JMB17

 

I am speechless, each formula worked instantly.  I will try them in my other work.  I am really trying not to get too excited before I've properly tested it out but that was so much quicker.

 

I have google searched a few times, when I've had a few minutes spare, but never found anything that made a difference.  I obviously wasn't looking in the right place or often enough.

 

I've just had a look at power query and yes I think that is worth spending more time investigating too think you are all amazing and so kind for sharing.

 

Thank you, I am so grateful.  You have no idea how helpful you have just been.

Allison

@Al_Dyas 

 

The method that approximate matches use are much more efficient due to the data being sorted. The exact match option is linear, the function starts at the top and looks at every value until it finds what it's looking for. But, the approximate match works by repeatedly 'guessing' by selecting the value in the middle of your table and testing it for a match. If it is greater than what you're trying to match, then it throws away the bottom half of the table. If less than, then it throws away the top half (made possible because the data is sorted).

 

So, in this fashion, it can throw away half the table with each guess and is exponentially faster (it would only require 20 guesses to perform a lookup on a table that took up all 1 million rows of a worksheet - and the table size could double, but it would only add 1 more guess the function would have to perform).

That makes sense. Every day is a school day! I will be keeping all this information in a reference file and also sharing it with my colleagues at work. This will definitely help one or two of them who I know use the same index match formula I do, just not as often.

Thank you @JMB17. I will test this out thoroughly and also check out power query and will feedback here.
Have a smashing weekend
Allison
I used to compare it to how you would look numbers up in a phone book (you wouldn't start at the beginning and check every name, but flip to the middle and see where you land). But, that analogy may be lost on the newer generations as it may be before their time, lol.

Definitely follow through on looking at power query and have a great weekend!

@Al_Dyas 

Reasons for slow computation and poor performance

Volatile functions slow your performance

The term volatile, simply put, means that these formulas are always recalculated.

Conditional formatting makes your file slow

Many complex formulas degrade your performance

Incorrect or too large references lead to a loss of speed

Large workbooks result in slower worksheets

 

Increase the speed and improve the performance

Save your file as a binary workbook for better performance

Use pivot tables

Use conditional formatting carefully for better performance

Avoid volatile functions for more performance

Customize the reference areas and use structured references and names

Improve your formula structure for more speed

Avoid matrix formulas in your files

Replace formulas with fixed values

Avoid links between workbooks

Sort your source data for faster calculations

Use VBA for better performance

Switch to manual calculation

 

Can all workbooks be made faster?

Check links in your file

Remove incorrect or unnecessary formulas

Delete empty cells from memory

Check if you are using volatile functions

Check the references in your formulas

Check your formulas and conditional formatting

 

Finally, I can only recommend you, if none of the above steps help then you should read this article.

These go into detail on the topic and offer alternative paths.

The articles are sometimes a bit very technical and best only for advanced users but still very helpful.

Excel performance: Improving calculation performance

Excel performance: Tips for optimizing performance obstructions

 

Hardware: I recommend an SSD hard drive and at least 16GB RAM (when it comes to that many data sets).

At the same time check whether all processors are activated.

Instructions for the check (example with processor i7):

Press the key combination [Ctrl] + [Alt] + [Del] and select the "Task Manager". Switch to the "Performance" tab and click on "CPU" on the left. In the right area of the window you can now read the number of cores used by Windows under. Your quad-core processor should read "Cores: 4" there.

 

I would be happy to know if I could help.

 

Nikolino

I know I don't know anything (Socrates)

 

* Kindly Mark and Vote this reply if it helps please, as it will be beneficial to more Community members reading here

 

@NikolinoDE 

I have found that Microsoft broke the Worksheet calculation in 64bit versions later than 1902.  Stay away from xlCalculationAutomatic in newer versions of Excel you have large spreadsheets or spreadsheets in which cells are periodically changed.   You must turn off calculation (Application.Calculation = xlCalculationManual) an fire the calculation of sheet only when you need to otherwise risk the total crash of Excel.   I have spent a considerable amount of time and headache, including a ticket with Microsoft until I finally found that Microsoft updates specifically 2012 and 2101 have been totally screwed up!

 

I have about 15 spreadsheet VBA applications running at customer facilities that poll and populate sheets with operational data through a Modbus communication active X (MBAXP), all running anywhere between versions 1708 to 1902, these applications running flawlessly for weeks on end no problem whatsoever.  Then the last two applications I ran into a huge problem.  The same application would hang as soon as it started, I couldn't select any cell it would run for about 10 minutes then the darn Excel would crash and restart with recovered versions of the same application, the recovered application having no difference.  I originally thought it was a computer problem until I started experimenting with other versions Excel!  I downgraded to earlier versions and found that all was good, polling selecting was normal.  Then I experimented with the newer version, specifically 2012, thinking it could be the MBAXP.  After replacing the MBAXP with a timer and a random number generated sheet populating macro I found that the hangup was the sheet calculation, timing the difference between 2012 (2101 was a little better) and the version 1902 at sometimes over 50 times as long in the new versions!!  I had to limit the timer for - again - after about 10 minutes the darn Excel would totally crash and restart. No one at Microsoft (or anywhere else) seems to know why there is such a huge loss in calculation speed.  I can only guess the programmers at Microsoft don't know what they are doing.  While I finally solve the mystery, I am left to wonder whether Microsoft is intent on screwing up the product, leaving me to consider alternatives.  Its quite a bummer.   BTW the following link is a step by step instruction on downgrading to a working Excel version.  Try v1902 it works.  https://answers.microsoft.com/en-us/msoffice/forum/msoffice_install-mso_win10-mso_o365b/revert-to-a-...

@rpbenz 

Everything I could find in microsoft's quiver was sent to you as links. As i'm just a simple member, i can't help anymore.

If this information does not help you, contact Microsoft support.

 

Excel performance: Improving calculation performance

 

Excel performance: Performance and limit improvements

 

Excel performance: Tips for optimizing performance obstructions

 

How to clean up an Excel workbook so that it uses less memory

(Check whether the workbooks were created with earlier Excel versions, if so, change this, maybe this could help).

 

Known issues with Office and Windows 10

 

Thank you for your patience and time.

 

Hope I could help you at least a little.

 

I wish you continued success with Excel...Don't forget, Excel is the best invention since there was chocolate :)).

 

Nikolino

I know I don't know anything (Socrates)

 

Thank you @NikolinoDE there is a lot to go through there but I will and I will feed back to you.

 

I am absolutely blown away by how helpful everyone here has been.  I really didn't expect so many replies.  Lots for me to review and try out and investigate.

 

I will certainly share the hardware spec with my manager to see what she can do for me.  In the meantime I will make a start on reviewing my master workbooks to see where I can follow all the tips and cut out any unnecessary tasks for excel to do.  I've promised to share with my colleagues to as there are other in my team having similar problems.

 

I will go through your list item by item and make a note of any improvements.  Though it may take me a week or two because i'll have to do it all in between work but there will be opportunities as I am going along.

 

Thank you so much for taking the time to share.  I've struggled for two years and there have been tears and tantrums along the way so I am excited to get a bit of control back.

 

Allison 

@Al_Dyas 

By the way, on which edition of Excel you are?

Thank you @rpbenz.  Noted about turning off automatic calculation, I will make more use of this than I am currently doing.

 

I also get a lot of 'Excel not responding' messages and am just left sitting and waiting, or putting a wash on, making a brew, doing a bit of deadheading in the garden (summer obviously).

 

I am not sure i'll be able to swap between excel versions as I am on my employers network which is very restricted but I have lots of tips to work through and if they don't speed things up for me I will certainly be asking the question of our IT department.  I do find 2010 64 bit more problematic and not helpful in that I can't have multiple windows open.

 

Thank you for sharing, its all helpful and informative and I am grateful

Allison

 

Bloomin eck@NikolinoDE that is brilliant.  I will take a look and let you know how I get on.

 

That must have taken a while so thank you for your time.

 

I'd better get crackin.

 

Allison

Hi @SergeiBaklan 

 

So looking in About MS Excel it says:

MS Excel 2013 (15.0.5311.1000) MSO (15.0.5275.100) 32-bit

No idea what the bits in brackets mean but this is the version I use regularly through Citrix.

 

I also have access to:

MS Excel 2010

version 14.0.6023.1000 (64-bit)

I was given access to this version to help with working with the larger spreadsheets but it doesn't and I can't have multiple windows open so I tend to use 2013 most of the time.

 

I hope this answers your question.

 

Allison

 

 

Look I use to have the same problem with a slow Lenovo with amd processor. One day I decide to change the hdd with an ssd and surprise! A laptop with 8g of mem was slower than a 20 year old laptop. You change ssd , do little research on the net and you tube create live usb reinstall win and you will be pleasantly surprised!