Forum Discussion
How - and when - does the accumulator of SCAN get reset?
This was inspired https://www.linkedin.com/posts/global-excel-summit_fill-blanks-with-value-from-above-activity-7279702477438955520-EysR?utm_source=share&utm_medium=member_desktop; I figure that this is a better forum to discuss the issue I would like to raise.
In a nutshell, the assignment discussed there was to fill the blanks in a "gappy" column by repeating each visible value enough times to populate the empty cells beneath each visible value until the next visible value is reached and then continue repeating using that next value.
The answer (as modified by PeterBartholomew1) in that discussion was:
=SCAN("", names, LAMBDA(accumulator, current, IF(current <> "", current, accumulator)))
In my attached file, "names" refers to the list of names in A1:A10 (that is, including the otherwise blank A10).
A couple of things I don't understand:
First (and probably simpler) is the function of the initial value of the accumulator. Whether it's "" or "xxx" (which I used in the attached file) the output stays the same. So why is the accumulator necessary to begin with? Or does it imply that - for that particular purpose - SCAN is not the right approach?
Second, whether the initial accumulator value is "" or "xxx", it somehow gets reset as we go down the column and its value becomes equal to the value of the most recent visible row cell; it doesn't "accumulate" - which to my (obviously over-simplistic) mind would imply some sort of concatenation of all previously visible row cells.
To demonstrate, I changed the formula to just
=SCAN("xxx",names,LAMBDA(a,c,"a is now "&a))
and the output became
In this case, there is some form of "accumulation" - but even that doesn't take the form I would expect.
As the title says: how and when does the accumulator gets reset? That's an issue I ran into in various other scenarios so I would like to understand it once and for all, if possible...
I think your question is really about how the accumulator in SCAN works and the answer is that it is NOT a good term at all. The "accumulator" is really just the previous resulting value and althought the term "accumulator" is technically correct, it can be confusing. I prefer to use "p" and "q" as in SCAN( _init, _array, LAMBDA( p, q, ...)) because "p" is the prior or previous value (and on the first iteration it will use the _init value you give it) and the "q" is the qth value in the _array. so it will iterate through the _array and at the end give a final answer (and because it is SCAN instead of REDUCE it will also give you every iteration on the way) based on the formula you give. The term "accumulator" is technically correct because when it goes into the 7th iteration it will pass the result from the 6th iteration but that result from the 6th iteration is technically the accumulated result from ALL the previous iterations. Hope that help.
5 Replies
- m_tarlerBronze Contributor
I think your question is really about how the accumulator in SCAN works and the answer is that it is NOT a good term at all. The "accumulator" is really just the previous resulting value and althought the term "accumulator" is technically correct, it can be confusing. I prefer to use "p" and "q" as in SCAN( _init, _array, LAMBDA( p, q, ...)) because "p" is the prior or previous value (and on the first iteration it will use the _init value you give it) and the "q" is the qth value in the _array. so it will iterate through the _array and at the end give a final answer (and because it is SCAN instead of REDUCE it will also give you every iteration on the way) based on the formula you give. The term "accumulator" is technically correct because when it goes into the 7th iteration it will pass the result from the 6th iteration but that result from the 6th iteration is technically the accumulated result from ALL the previous iterations. Hope that help.
- PeterBartholomew1Silver Contributor
I think the key point is that, though 'c' picks up a name from the 'names' column at each step, 'a' only takes its value from "xxx" for the initial step. From there on it sets its value from the result of the previous step. The variable 'c' references the input to the formula, whilst 'a' accumulates the results. Your example only uses 'names' as the iterator 'for each c in names'. At each step "a is now " is inserted at the start of the line so an array of text strings of increasing length is output.
In my formula "xxx" is available at step 1 but it is not used because the 'name' is not blank.
- Patrick2788Silver Contributor
In re: the first example with the names. The initial value supplied in SCAN is arbitrary because the first position is the array is filled.
I believe the initial value becomes useful for filling in gaps in the data when the data is a bit messier such as:
If we're to presume those first few blanks are to be filled with "John" then the initial value becomes useful because it gets the ball rolling:
=LET( initial, TAKE(TRIMRANGE(names2, 1), 1), SCAN(initial, names2, FillDownλ) )
With FillDownλ being the standard:
FillDownλ = LAMBDA(acc,v,IF(v="",acc,v))
- PeterBartholomew1Silver Contributor
Patrick2788 A minor variation might be
= LET( initial, TAKE(TRIMRANGE(names2, 3), -1), SCAN(initial, names2, FillDownλ) )
BTW I like your use of TRIMRANGE. I opposed the introduction of the function at first.