Mar 03 2024 05:52 AM - edited Mar 03 2024 05:54 AM
The default action when one clicks a cell whilst inputting a formula is for Excel to return a relative reference. That allows a user of legacy Excel to address the members of a list or array by dragging the formula down or across the sheet for such distance as they determine appropriate using their judgement (a prolific source of error?).
Since Excel 365 is founded on the concept of arrays as fundamental computational objects, wouldn't it be more logical to expect the developer to select the array they wish to process rather than thinking about single elements. I find it difficult to persuade users that dragging a formula is a bad idea; it is deeply embedded in muscle memory and changing the habits of a lifetime is hard to do.
I propose that the default action for 365 should be to generate an absolute reference (and the user then encouraged to apply a defined name to the array they have selected to describe its business relevance?)
What are your thoughts?
Mar 03 2024 11:03 AM - edited Mar 03 2024 11:03 AM
I think this is a good idea for an option that could be enabled in Options | Formulas. I approach a given sheet with the mindset of how can I get it done in 1 move with a dynamic array? I'm totally on board with Excel 365 as an array-based powerhouse capable of so much more than the scalar-based legacy versions of the past. It's fun to use Excel to solve linear programming puzzles! There's a certain art to authoring elegant solutions in Excel that has really come on in the last couple years that I enjoy.
I think there's still a certain tendency to underestimate what Excel 365 is capable of and make a bee line for vba solutions. LET/LAMBDA fundamentally changed how formulas can be authored but it's very much a takes one to know one club at present. One doesn't often stumble into LET/LAMBDA by accident. I don't mean to be a gatekeeper but it's usually something someone takes up after years of study. For this reason, I think the proposed change could be added as an option to be enabled but defaulting it might cause a stir!
Mar 03 2024 11:44 AM
I fully agree with @Patrick2788 , any change in default behaviour which was for ages shall be done by option only. By option in user interface available for average user.
At the same time I won't be surprised if such option exists in registry.
Mar 03 2024 02:44 PM
"defaulting it might cause a stir"
Yes, but that is at the core of the argument. Users will follow the path that is most easily discovered. For me, that path leads firmly in the wrong direction, enticing users to follow. The scalar operations do lead to a very direct sense of the user manually interacting with their numbers. That suits many users, but they should receive some warning/indication that they are opting for legacy methods that deliver little in the context of modern Excel.
They will come to think of being limited to scalar arithmetic as representing 'simplicity' yet, in reality, the inherent complexity of the problem has simply moved to the task of wading through a sea of semantically limited cell formulas trying to divine the path of the overall calculation. Such users may seek more knowledge from the 'tips and tricks' genre of multiply plagiarised content, perhaps thinking that memorising over 350 shortcuts will in some mysterious way add wisdom to the proceedings.
With dynamic arrays, I look for 'simplicity' not in the individual scalar operations but in the overall patterns of array calculation. Operations like sorting, filtering, wrapping, grouping emerge along with the concept of arrays; they are not meaningful at the level of scalar thinking. Many problems, such as that of banded bonuses given to salesmen or taxation bands (attached) have array solutions that follow a different path from that adopted by fill-down formulas.
This process of abstraction, in which the details of a calculation are left to Excel, is a key element in the fight to control the error rates that characterise spreadsheet solutions and give the spreadsheet such a poor reputation. Overall, I would not suggest attempting to force users to adopt modern methods (that would simply drive many casual users away) but neither do I want to signpost a default route that ultimately leads to a dead end.
Mar 04 2024 04:03 AM
Mar 04 2024 06:21 AM
@PeterBartholomew1 My two cents...
I'm more of the attitude that if the situation calls for it, use it. If not, just keep it simple. This is especially true when helping out beginners. If I were to hazard a guess, I would say that the "average" user falls into this category. Intermediate users would have a decent chance at adapting, but people who are still using =SUM(A2+A3) formulas are not very likely to grasp major concept changes. I still feel the "default" behavior should be the path of least resistance, so no one gets left behind.
On the topic of defining names, I say each to their own. For me, less is more (in general). I find spreadsheets that were created with excessive, unnecessary named ranges to be harder to modify, even if I was the one who built it. Too much time is wasted jumping back and forth between the worksheet formulas and Name Manager trying to reverse engineer everything. I would much rather see the cell references directly in the formula, so I know exactly what they're referring to. Having said that, there are plenty of situations where named ranges are essential, and the majority of my spreadsheets have at least a few.
Further to that, when it comes to defining names within the LET function, I'm all for defining each variable to improve readability, especially when first introducing others to LET. However, it's easy to get carried away here too. One thing I'll probably never get on board with is defining the LAMBDA function first for BYROW, BYCOL, MAP, REDUCE, SCAN, etc... it's like putting the cart before the horse, and actually reduces readability in my opinion, as you end up having to backtrack to figure out what's happening, rather than just reading each step in logical order.
I often find myself suggesting solutions on this forum that I would never use personally, simply because I would never setup my spreadsheets in a manner that calls for them. Rather than trying to explain the pitfalls of certain design choices, I just end up answering the question directly, even if it goes against my own preferred methods. The beauty of Excel is that there are many ways to get the job done. Some methods are easier to understand, while others can be more efficient. In many cases, though, it boils down to personal preference.
By the way, the custom TaxCalcλ function you shared in the TaxBands file is awesome. It's so simple and yet beyond the comprehension of any normal person (looks like Thunks was used?). You sir, are in a league of your own. 🙂
Mar 04 2024 07:02 AM
It's interesting you mention defined names which have been used more since the advent of LET/LAMBDA. I agree that sometimes there can be too many defined names and it becomes difficult to manage.
There's a long forgotten feature that was removed with Excel 2007 because it was seldom used: Natural Language Formulas.
This is from the wayback machine:
Microsoft Excel 2007 (nee Excel 12) : Deprecated features for Excel 2007 (archive.org)
Natural Language Formulas
This feature allowed people to use the labels of columns and rows on worksheets to refer to the cells adjacent to those cells without explicitly defining them as names. This feature has been disabled by default since Excel 2000 based on customer feedback. In Excel 12 we will completely remove this seldom-used feature from the product. When opening files in Excel 2007 that use this feature, formulas will be converted to use cell references.
Note: This feature could be found in the Excel 11 options dialog with the title “Accept labels in formulas”
Insert Name Label
This command allowed people to more explicitly specify labels for Natural Language Formulas and is being removed as part of that feature.
Maybe I'm misremembering this feature but it sounds like this feature would be a lot more useful in 2024 than in 2003.
Mar 04 2024 02:51 PM
I guess it all depends on how one defines 'simplicity'. Whereas others see simplicity in a mass of disconnected formulas because they individually involve no more than primary school arithmetic, I regard that to be an unstructured and complex mess of relationships. Simplicity, for me, is achieved when many disconnected observations can be derived from one guiding principle (in this situation, a single array formula). I accept that it is me that is weird in this respect, an abstract thinker that studied mathematics, theoretical physics and astronomy (over half a century ago) before going on to work as a research scientist in aeronautics. There was a time when I had probably written more matrix formulas than natural language sentences!
Spreadsheets are more about doing than thinking. I wonder how many spreadsheet users could offer a coherent explanation of what is achieved by relative referencing and why the same formula, placed in different locations on the worksheet, give different answers. I haven't used relative references since 2015 , 1704.01142.pdf (eusprig.org), and by now things that spreadsheet users find normal, I see as odd. When I look at a term in a formula, I do not need to know 'where did that come from?'; I want to know 'what does that represent?' The A1 notation tells me the wrong thing; a name just might tell be what I need to know to evaluate the formula.
One of the hardest things I find, is to persuade an experienced Excel user to stop thinking in terms of 'and fill down' and see instead, a single array which may be reordered, reshaped or aggregated to achieve the desired results. There are times when it would be easier working with a complete novice who is free from the expectations of traditional spreadsheeting. Interestingly, though, this far the responses have been from individuals who are comfortable with traditional spreadsheeting but, at the same time, have fully mastered the opportunities presented by the new. You clearly beat me on versatility!
Mar 05 2024 11:28 AM
Sadly, conditional formatting is full of gimmicks (admittedly, some rather neat) but the basics are antiquated. All I need is to be able to do is apply formatting based upon an array formula to its spill range, given a user-specified anchor cell. The formula should be capable of including range intersection operators. In other words, all I require is what is missing!
It is my understanding, though, that the CF code base is so old that it would require a massive effort to bring it into the 21st century.
Mar 06 2024 02:00 PM
Interesting. I have seen add-ins along those lines but I didn't know Microsoft had ever tried to change end-user perception in that way. The spreadsheet user base appears to be very resistant to supposed 'improved' practice!
For me, it was always to opposite; I would simply 'glaze over' at formulas such as
= SUMIFS(Trading!DK5:DK12025, Trading!DI5:DI12025, "banana", Trading!DJ5:DJ12025, "Japan")
I don't think I even read the references after the first few; it was just encrypted 'stuff' that I couldn't be bothered to think about. My only strategy for maintaining attention was to replace the direct references by names derived from any adjacent annotation (assuming there was any). That way I might get to see
= SUMIFS(revenue, country, "banana", product, "Japan")
at which point even I might begin to suspect that all is not right with the formula. By the time I had understood the workbook, I had completely refactored it and then had the embarrassment of confessing to the author that I no longer had a copy of their workbook but that I had found some errors along the way!
Mar 06 2024 02:37 PM
If you liked TaxCalcλ, try its big brother
= FIFO.GenerateTableλ()
As it stands, the lack of any parameters would cause it to error but following @Craig Hatmaker 's ideas on Component-Based Software Engineering, eusprig-proceedings-2023.pdf (pp 33-58), it actually returns
With the parameters correctly input and with option=2, the result might be of the form
[shows costs pivoted by output (rows) and input (columns)]
I have also used the same formula for a challenge posted by Ian Hewitt on Chandoo a few year back with involved the Fe content of ore mined and supplied on a FIFO basis.
At its core, the idea of the calculation is the same as that used for the tax bands. That is stack two arrays including cumulative quantities and sort in ascending order. Then take the difference to assign quantities to individual tranches.
Mar 06 2024 08:40 PM
@PeterBartholomew1 That’s pretty cool too. There’s a lot more to it than TaxCalcλ, so I’ll have to really break it down and see if I can understand what’s going on.
In reference to your SUMIFS example, yes, the named ranges quickly reveal the error. That’s the type of situation where a structured table is the way to go. Admittedly, SUMIFS has always been one of my favorite Excel functions, and even more so now with MS365 as it can easily spill a 2D array of results when using both horizontal and vertical arrays as the criteria. Performance was sometimes an issue with SUMIFS in older versions of Excel, but not so much anymore (unless you’re dealing with huge amounts of data, in which case PQ can be used to reduce the dataset accordingly).
That’s very interesting to hear a bit about your background. I figured it was something technical, but had you pegged as being an ex-computer programmer or engineer of sorts. My background is considerably less impressive, lol. I studied ornamental horticulture, majoring in turfgrass management, but found it difficult to make a living in Canada, as the golf season is only 5 and a half to 6 months long! After a few years of that, I put my math and computer skills to use and found a more stable job as a data entry clerk at an accounting office, which is where I developed my love for Excel. John Walkenbach’s books were a great resource at the time, and really opened my eyes to what Excel was capable of.
I made the jump from Office 2010 to MS365 around 2 years ago out of necessity. Personally, Office 2010 had everything I needed, but clients I previously built spreadsheets for were migrating from Sage 50 to Xero and needed integration with newer add-ins only available in MS365. The first spilled array formula I encountered was when the SCOTT.CHART function was released in mid-late 2022 (as part of Scott’s Add-Ins for Xero) and my initial reaction was “meh… a table would be better”. It intrigued me, though, and after a bit of research online, as well as watching a few YouTube videos, I saw the potential, but still didn’t have very many real-world scenarios to play around with. Joining the discussion here in October 2023 really helped me out in that department, and it’s been a great place to learn, experiment and develop a better understanding of the new dynamic array functions.
Tying all of that back to the topic at hand, I think part of the reason why array functions come naturally for people like you and me, as well as the other contributors here, is the way our minds work. There’s a certain amount of logic required... you need the ability to visualize what the results of an array will look like in order to build a complex, single-cell formula (especially when dealing with nested arrays). This can be learned and achieved through experience but will inevitably come more easily for a small percentage of users. That’s not to say relative vs absolute vs mixed references aren’t a source of confusion… they absolutely are. But I think they’re still easier to learn and understand for the average user. 😉
Mar 08 2024 06:06 PM
@Patrick2788
"defaulting it might cause a stir"
Please take this in the spirit of humor. Your point is very valid, but a bit ironic given the topic and how enabling dynamic arrays required a major change to Excel's calculation engine default behavior.
Mar 08 2024 06:32 PM
Thanks for the shout out. Those functions look great so I guess it's time to package your LAMBDAs into 5g libraries. BTW - 5g is on fire. In December I presented 5g at the University of Arizona's MECC championship. Last month Danielle Stein Fairhurst presented 5g functions a the Excel Global Summit. Model Citizn just posted an article on 5g. All next week I have 5g workshops for Full Stack Modeller. In April I'm at the University of Tennessee. In May I'm presenting at FMI (Financial Modelling Institute). Finomatic Consultancy is converting their models to 5g... and this all traces back to when you introduced me to your work with array formulas, what, 10 years ago? Thanks for all your help Peter.
Mar 09 2024 04:52 AM - edited Mar 09 2024 06:22 AM
Thank you for taking the time to read my reply and respond.
I do believe the introduction of dynamic arrays marked a turning point in the history of Excel - a major change in the calculation engine for the better. I can only speak for myself. Colleagues of mine that have used Excel since the 80s and 90 have not completely bought into Excel-based-on-arrays for lack of time to learn new skills, being content in one's current skill set, not being interested, etc. - whatever the reason may be. I still hear of columns being re-arranged to accommodate the limits of a base VLOOKUP, Advanced Filter being used instead of UNIQUE, helper columns used instead of LET, and plenty of other examples. The awareness of functions like FILTER, XMATCH, LET, CHOOSECOLS, CHOOSEROWS, TAKE, DROP is close to 0.
Clients are usually very open to dynamic arrays. Lambdas are usually received very well. Though I still have to talk people off the VBA ledge. There's a certain mindset that if someone doesn't know how to do it in the application that VBA must be the answer. I think anyone beginning their studies of Excel in 2024 would be wise to find a course specifically geared towards Excel 365 and stay away from macros!
Mar 09 2024 03:09 PM
10 years? It must be about that, the FAST Standard was launched in 2014 during a meeting at the ICAEW. As a heavy user of both defined names and CSE arrays I queried features of the standard but it was not the occasion to follow up the discussion. In the following year I pretty much trained myself to commit all formulas with CSE and I think we were sharing ideas well before EuSpRIG 2015 (not that CSE arrays are that pleasant to work with). I also got invitations to ModelOff 2017 and 2018 Bootcamps and pestered Microsoft staff to improve their array offering in much the same manner as Tables had done for list data.
The complete rework of the Excel calculation engine, when it came, was far more than I had expected and has led us to the point where your pre-built modules are even possible. I may need some guidance from you on testing and publication. The algorithms that interest me most at present are those that use array shaping, sorting and filtering as key elements of the solution. Without arrays as the fundamental building block one could be left floundering in a sea of helper cells.
If you are giving any presentations that I could listen in on, could you send me a link/invite?
Mar 09 2024 03:26 PM
"a course specifically geared towards Excel 365"
Have you identified such a course? In some ways that is what this discussion was intended to cover. It is always possible to bolt Dynamic Arrays, LET and LAMBDA on to an existing course as 'new', 'advanced' topics. Then one might need to debate whether Lambda should wait until Power Query and Pivot Tables have been covered.
What I would argue for is that DA, LET and LAMBDA should not be left to the final steps of training as advanced topics but, rather, treated up front as the basis of modern Excel; it is traditional spreadsheet techniques that should be left to the end for those who need to address issues of backward compatibility.
Mar 09 2024 03:44 PM - edited Mar 09 2024 03:47 PM
I'm mostly an auto-didactic when it comes to Excel. I learn through creating challenges for myself and have learned from participating in this community.
I haven't actively searched for decent Excel 365 courses online so it's difficult to recommend one. I've seen some videos from some of the popular instructors and there's usually something that puts me off about their methods. Mostly it's the lack of LET being taught. Someone had linked a VSTACK video on this forum a few weeks ago. The instructor was stepping viewers through a 3D stack. The finished formula contained explicit references repeated a few times and no use of LET. To top it off, he was charging a few dollars for the template with the formula!
What I would argue for is that DA, LET and LAMBDA should not be left to the final steps of training as advanced topics but, rather, treated up front as the basis of modern Excel; it is traditional spreadsheet techniques that should be left to the end for those who need to address issues of backward compatibility.
I agree with DA and I've come around on LET. In the 40 hour Excel 365 introduction course I teach within my organization I debated including LET and had left it out the last time I ran the course. Next time I teach it I'm going to include it.
Mar 09 2024 07:49 PM
Mar 10 2024 12:26 AM
@Patrick2788 I vaguely remember seeing something in the Excel 2003 menus referring to "Name Labels" but didn't know what it meant or how to use it, as I was still pretty green at the time. I wonder how dynamic that feature actually was... did it auto-expand the range references as the dataset increased? If so, then yes, that seems like a great feature for Microsoft to revamp/revisit in MS365. Current features like "Create from Selection" can be used to quickly define column names, for example, based on values in the top row, but the resulting range references are static, making it a useless feature for dynamic arrays.
Just out of curiosity, and in the spirit of additional discussion, what is the main source of your aversion to VBA? Is it primarily security concerns? I get it... macro-enabled workbooks can contain malicious code, and bad actors can use macro viruses to inject malicious code into existing documents. However, malware and viruses like these don't just appear on their own. They are typically "let in" unwittingly by the end user via phishing emails, network shares, downloading files from untrusted sources, etc. Much of the risk can be minimized by adopting secure network protocols (ie: firewalls, client endpoint protection, etc.) and safe document handling practices. If the general recommendation is to avoid VBA at all costs because of the security risks, one might argue that you should also stop using email altogether, and/or stop browsing the internet (go completely off grid), as the same risks apply to non-VBA specific malware and viruses.
Personally, I don't view VBA as a crutch to lean on, but rather as an enhancement tool if the situation calls for it. Newer tools such as Power Query and Power Automate may be more secure and more appropriate in certain situations, but from what I've read/experienced, can be limited in their scope and availability. VBA is a far more versatile and flexible option than the current alternatives. I'd be hard pressed to recreate many of my macro-enabled workbooks with the same level of automation and efficiency via alternative tools, if Microsoft were to remove VBA as an option in the future (and I don't think I'm alone in that respect... I think it would cause more of an uproar than making dynamic arrays the "default" behavior, lol).
Lastly, on the topic of self-education, some of the YouTube channels I would recommend to anyone stumbling across this discussion include ExcelIsFun (Mike Girvin), Leila Gharani (xelplus), MyOnlineTrainingHub (Mynda Treacy) and Excel Macro Mastery (Paul Kelly). There a plenty of other great channels out there, but these ones are at the top of my list for the quality of the content they put out. Cheers!