LAMBDA: PrettyPrint

Iron Contributor

I've built some LAMBDAs to try to pretty-print Excel formulas; this relies on the formula being valid, so we leverage the native Excel parser indirectly.

I have a base LAMBDA that creates a "naive" pretty-print: new lines for parens, curly brackets, and commas. Parameters are a cell reference or formula text and an indent spacing. It's naive because I don't attempt to determine if a bracket is in a text string wrapped in double-quotes (and probably for other reasons).

I have a MakePrettier LAMBDA that works on the output of PrettyPrint.Naive; it takes a left bracket, finds all substrings that are balanced with opening and closing brackets, cleans them of whitespace, and checks to see if the cleaned width is less than a user provided parameter, maxWidth. If so, we use another LAMBDA ReplaceStrings to replace them, then return the new formula string. We execute for both a curly braces, then on that output for parentheses. The expected outcome seems to be reliably yielded using this methodology.

 

However, when I make a single PrettyPrint LAMBDA, composing PrettyPrint.Naive and MakePrettier, my output is as if we had only execute MakePrettier for the first left bracket (in this case curly brackets). My question is why the different outcomes?

 

Hypotheses:

* scope confusion on variable names? Probably not.

* Resource constraints? Probably.

 

Excel strains when recalculating these formulas, possibly because of the size of my input-formula (an output from my UnpackLet experiment) which clocks in at over 1100 characters long. I played with a few shorter formulas and those matched using both implementations (try it with some of the LAMBDAs in the Name Manager, or you favorite long formula).

 

The thing that I find surprising is: if I'm hitting up on the resource limits LAMBDA is allowed, why return a value at all?

 

What have I tried to make it better? I've used both a recursive version of ReplaceStrings (included) and a non-recursive version. Doesn't affect the outcome.

 

Thoughts hive-mind?

 

@Peter Bartholomew @lori_m Y'all might find this amusing.

@Chris_GrossCome see what this lunatic is doing with your LAMBDAs.

4 Replies

@tboulden 

Wow!  That is crazy.  I would never have thought of using an Excel formula to format Excel formulas.  The later steps of regrouping array constants and inner parentheses is a key element of providing structure; little is gained by turning a horizontal mess into a vertical format, even with the indentation.  When laying formulas out by hand, I tend to try to have the complexity of any given line grow in proportion to the growth in the number of lines.

 

I did try to see what happens when I applied the pretty print function to its own definition but a LET function gains little because of the structure that has already been built into the formula manually.

 

I haven't got as far as looking why applying the formula in different ways gives different results.  I simply haven't got to grips with the formulas to the required depth.

Yes, I might revisit with a less naive approach, and special cases for LET/LAMBDA formulas; it really depends on what you find helps you understand the formula. I sometimes forget how far in I am when visually inspecting a formula by counting brackets. And the UnpackLet outputs were a perfect test case: they return valid/accurate results, but its formula-stew if ever I saw it.

@tboulden 

I love the term 'formula stew'.  I have decomposed structured formulas manually on occasion, just to see what I get.  It usually generates a mess but not always. 

 

Simplicity, like beauty, is very much in the eye of the beholder.  For many years now, I have settled on array formulas which, in turn, makes it easier to use defined names.  In fact Mike Girvin commented

"As for your COUNTIFS, I have never, in almost 3 decades, seen a spreadsheet like yours with ALL the calculations entered as arrays!!! Now I see why your are so excited by the new Excel Calculation Engine."

 

Earlier, before I had fully-committed to the pain of CSE, I had exchanged models with F1F9 (financial modelling company prominent in promoting the FAST Standard).  One of the sheets was returned with a 'simplified version', i.e. no array formulas, no names.  The question then becomes:

In what way do you believe that replacing 4 names and 2 formulas, aggregating data over 100 cells, with a (direct referencing) system that introduces 17 billion names (only 100 of which point to occupied cells and 50 individual formulas, represents 'simplification?

 

I have been known to argue the case that 'complexity' is no more than 'simplicity' in excessive quantity.  I would suggest that is only through the use of more abstract concepts that 'simplicity' is regained. 

I'm enjoying learning more about array formulas and dynamic arrays through my LET/LAMBDA experimentation; I may try an experiment like you say where all calcs are arrays, I know my coworkers would "love" it. I agree, 4 names and 2 formulas is much less worry if they are well-written, well-explained with comments or documentation, and well-protected so Joe Schmoe can't break it accidentally.