Forum Discussion
Formula speed / Laptop specification
Hello
I am a regular user of the index match formula between spreadsheets containing over 400,000 records and this can take up to 30 minutes to complete. On smaller spreadsheets the formula regularly takes 5 to 10 minutes. I have tried a variety of ways to improve the speed:
working on my hard drive instead of my work network;
always pasting values;
turn on manual calculation;
look for ways of cutting down the data I need to analyse.
Our IT department tell me I have the maximum ram on my laptop which is a core i5 elitebook. they loaned me a notebook, the highest spec they provide, but it made little to no difference so I didn't think it was worth the expense. No one in IT seems to be able to direct me to a solution other than to tell me to use other software but then can't tell me what that is other than Tableau which just wouldn't work for me or my department. On top of the Excel 13 I already had they have me Excel 2010 64 but this just causes issues and sometimes the formula doesn't recognise i'm working between 2 spreadsheets so I have to close and reopen.
I am not well up on laptop specifications so can't tell you exactly what spec the notebook was. they even gave me a virtual desktop which made no difference at all.
The CPU on my laptop occasionally reaches 99% but even when it doesn't it is extremely slow during calculations.
I am currently working from home too but to be fair I had the same problem when I worked in the office and its been going on for 2 years. We use citrix but am more than happy to work of the hard drive if the formula worked faster, which it currently doesn't.
Is there anyone who can advise me on the spec of laptop that I would need to quickly carry out such a formula or am I destined to sit watching the % on my status bar crawl to 100% for the rest of my days? Is this what other people have to deal with?
All advice/direction is welcomed.
Allison
18 Replies
- Jimmy23238Copper Contributor
Look I use to have the same problem with a slow Lenovo with amd processor. One day I decide to change the hdd with an ssd and surprise! A laptop with 8g of mem was slower than a 20 year old laptop. You change ssd , do little research on the net and you tube create live usb reinstall win and you will be pleasantly surprised!
- NikolinoDEGold Contributor
Reasons for slow computation and poor performance
Volatile functions slow your performance
The term volatile, simply put, means that these formulas are always recalculated.
Conditional formatting makes your file slow
Many complex formulas degrade your performance
Incorrect or too large references lead to a loss of speed
Large workbooks result in slower worksheets
Increase the speed and improve the performance
Save your file as a binary workbook for better performance
Use pivot tables
Use conditional formatting carefully for better performance
Avoid volatile functions for more performance
Customize the reference areas and use structured references and names
Improve your formula structure for more speed
Avoid matrix formulas in your files
Replace formulas with fixed values
Avoid links between workbooks
Sort your source data for faster calculations
Use VBA for better performance
Switch to manual calculation
Can all workbooks be made faster?
Check links in your file
Remove incorrect or unnecessary formulas
Delete empty cells from memory
Check if you are using volatile functions
Check the references in your formulas
Check your formulas and conditional formatting
Finally, I can only recommend you, if none of the above steps help then you should read this article.
These go into detail on the topic and offer alternative paths.
The articles are sometimes a bit very technical and best only for advanced users but still very helpful.
Excel performance: Improving calculation performance
Excel performance: Tips for optimizing performance obstructions
Hardware: I recommend an SSD hard drive and at least 16GB RAM (when it comes to that many data sets).
At the same time check whether all processors are activated.
Instructions for the check (example with processor i7):
Press the key combination [Ctrl] + [Alt] + [Del] and select the "Task Manager". Switch to the "Performance" tab and click on "CPU" on the left. In the right area of the window you can now read the number of cores used by Windows under. Your quad-core processor should read "Cores: 4" there.
I would be happy to know if I could help.
Nikolino
I know I don't know anything (Socrates)
* Kindly Mark and Vote this reply if it helps please, as it will be beneficial to more Community members reading here
- Al_DyasCopper Contributor
Thank you NikolinoDE there is a lot to go through there but I will and I will feed back to you.
I am absolutely blown away by how helpful everyone here has been. I really didn't expect so many replies. Lots for me to review and try out and investigate.
I will certainly share the hardware spec with my manager to see what she can do for me. In the meantime I will make a start on reviewing my master workbooks to see where I can follow all the tips and cut out any unnecessary tasks for excel to do. I've promised to share with my colleagues to as there are other in my team having similar problems.
I will go through your list item by item and make a note of any improvements. Though it may take me a week or two because i'll have to do it all in between work but there will be opportunities as I am going along.
Thank you so much for taking the time to share. I've struggled for two years and there have been tears and tantrums along the way so I am excited to get a bit of control back.
Allison
- SergeiBaklanDiamond Contributor
By the way, on which edition of Excel you are?
- rpbenzCopper Contributor
I have found that Microsoft broke the Worksheet calculation in 64bit versions later than 1902. Stay away from xlCalculationAutomatic in newer versions of Excel you have large spreadsheets or spreadsheets in which cells are periodically changed. You must turn off calculation (Application.Calculation = xlCalculationManual) an fire the calculation of sheet only when you need to otherwise risk the total crash of Excel. I have spent a considerable amount of time and headache, including a ticket with Microsoft until I finally found that Microsoft updates specifically 2012 and 2101 have been totally screwed up!
I have about 15 spreadsheet VBA applications running at customer facilities that poll and populate sheets with operational data through a Modbus communication active X (MBAXP), all running anywhere between versions 1708 to 1902, these applications running flawlessly for weeks on end no problem whatsoever. Then the last two applications I ran into a huge problem. The same application would hang as soon as it started, I couldn't select any cell it would run for about 10 minutes then the darn Excel would crash and restart with recovered versions of the same application, the recovered application having no difference. I originally thought it was a computer problem until I started experimenting with other versions Excel! I downgraded to earlier versions and found that all was good, polling selecting was normal. Then I experimented with the newer version, specifically 2012, thinking it could be the MBAXP. After replacing the MBAXP with a timer and a random number generated sheet populating macro I found that the hangup was the sheet calculation, timing the difference between 2012 (2101 was a little better) and the version 1902 at sometimes over 50 times as long in the new versions!! I had to limit the timer for - again - after about 10 minutes the darn Excel would totally crash and restart. No one at Microsoft (or anywhere else) seems to know why there is such a huge loss in calculation speed. I can only guess the programmers at Microsoft don't know what they are doing. While I finally solve the mystery, I am left to wonder whether Microsoft is intent on screwing up the product, leaving me to consider alternatives. Its quite a bummer. BTW the following link is a step by step instruction on downgrading to a working Excel version. Try v1902 it works. https://answers.microsoft.com/en-us/msoffice/forum/msoffice_install-mso_win10-mso_o365b/revert-to-a-previous-version-of-office-365/79138c95-80b4-48fb-bef7-08619af3593e
- Al_DyasCopper Contributor
Thank you rpbenz. Noted about turning off automatic calculation, I will make more use of this than I am currently doing.
I also get a lot of 'Excel not responding' messages and am just left sitting and waiting, or putting a wash on, making a brew, doing a bit of deadheading in the garden (summer obviously).
I am not sure i'll be able to swap between excel versions as I am on my employers network which is very restricted but I have lots of tips to work through and if they don't speed things up for me I will certainly be asking the question of our IT department. I do find 2010 64 bit more problematic and not helpful in that I can't have multiple windows open.
Thank you for sharing, its all helpful and informative and I am grateful
Allison
- JMB17Bronze ContributorExact match lookups can take a very long time with large spreadsheets. You might consider an alternate design, perhaps a database or using power query as suggested.
Or, if your data can be sorted on the column containing your key, then you might make use of the approximate match option. For example, say the table is in A2:B8 and the value I'm trying to match is in D2, then use an IF statement to return the value it actually matched to and confirm it is an exact match and, if so, return the value I want out of Column B. Approximate match lookups used in this fashion (binary lookup) are exponentially faster on large data sets.
=IF(INDEX(A2:A9,MATCH(D2,A2:A9,1))=D2,INDEX(B2:B9,MATCH(D2,A2:A9,1)),0)- Al_DyasCopper Contributor
Thank youJMB17 I think I am starting to realise that I need to use 2 formula instead of 1.
I hadn't made in clear in my initial post that I am always working between 2 files. First to see if the record exists in both files and second the return an exact bit of data from each record. So i'm asking for 2 outcomes and i'm begging to think I should be using 2 different formulae?
I will definitely look at power queries, as I said in my last reply, I've never heard of these but will do some digging.
I work pretty much alone so no one to bounce ideas off so I am very grateful for the replies I have received.
Allison
- JMB17Bronze Contributor
I would use two columns to make the formulas shorter/easier to follow. Also, if you are returning multiples columns, you only have to test once (the other formulas can just reference the cell that performs the test).
If your table in 20201109 MOSL_MDS_SSPID.xlsx can be sorted in ascending order on Column A, then you could try:
Test for existence (let's say this formula is in cell C2, note the match option is 1 and not 0):
=INDEX('20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,MATCH(B2,'20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,1))=B2
Then test C2 and return the value from Column I, or return "Doesn't Exist" (again - note the match option is 1):
=IF(C2,INDEX('20201109 MOSL_MDS_SSPID.xlsx'!$I2:$I229079,MATCH(B2,'20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,1)),"Doesn't Exist")
The key is that the data in your table MUST be able to be sorted in ascending order on your key column (in this case, Column A). And, since we are using an approximate match, you must perform the lookup twice - first to confirm the key it matched to is the key you are trying lookup. Then, if the key matches, perform the lookup and return the value you actually want (which makes the approximate match lookup an exact match lookup). Because the approximate match lookup method is much more efficient, you can speed up the calculation in spite of the fact that you are performing the lookup twice.
There's a short explanation here, but you can find more information if you google 'excel binary lookup.'
https://www.powerusersoftwares.com/post/2016/12/20/make-your-vlookup-run-a-hundred-times-faster-with-the-double-vlookup-trick
- Riny_van_EekelenPlatinum Contributor
Al_Dyas You haven't really explained what you are indexing and matching. Comparing/matching such large data sets could perhaps be done with Power Query. It allows you to connect to the data without loading it into an Excel sheet. Not sure, though, if this could work you.
- Al_DyasCopper Contributor
Thank you Riny_van_Eekelen. I have no idea what a power query is so I will find out and try that.
A sample of the formula I use is this:
=INDEX('20201109 MOSL_MDS_SSPID.xlsx'!$I2:$I229079,MATCH(B2,'20201109 MOSL_MDS_SSPID.xlsx'!$A2:$A229079,0))
There are many different scenarios but I use the same formula structure for the majority.
From a 2016 file I am looking at a 2019 file to see how the data changed, or if it has been removed. For example the MATCH is looking for the exact sewerage point id (SPID) reference numbers in both files. The index returns the status of the SPID in the 2019 file.
So the index match formula answer 2 question for me, is the SPID still active and if it is what is the status.
In another scenario: I have carried out a process to batch update the status field of a large number of SPIDs overnight and the following day I run an index match against today's report and yesterday's report to check that the update worked. So again I am looking to see if the status of the SPID has changed as it should have.
I hope this helps and makes sense.