Forum Discussion
Adjust a formula to ignore hidden/filtered rows of data
Thanks for your quick response. I don't know how that happened - i compared the original formula with and without the initial value, and the result was the same... Probably, because i didn't see visible difference, i removed the initial value of the REDUCE function, and then started noticing that issue. Thanks! So.. How does that formula work? Is my understanding correct?
- Variable 'visible' is assigned its value, the result of the DROP function
- DROP removes 1st value from the result of the REDUCE function (which, in this case, must return an array of a sort)
- REDUCE has an initial value "" which becomes the 1st element of an array (range) 'a'
- Range to be reduced is the [Name] column of the table
- Lambda is the actual reduction formula:
- If a cell is hidden, that is, COUNT function returns 0 for the current array element 'v', unchanged result of the REDUCE function is returned (it contains "" during the very first iteration)
- Otherwise, 'v' is appended to the end of the result variable 'a'
- DROP removes 1st value from the result of the REDUCE function (which, in this case, must return an array of a sort)
- ROWS function is applied to the result of the UNIQUE function which returns a range of distinct values from 'visible'.
- Limitation of this formula: if there are no visible rows, we get #CALC! because Excel tries to apply UNIQUE and then ROWS to an empty array.
How can this formula be improved to show 0 when there are no visible rows?
...
I think i found the solution. It should be faster as well. We don't need to use DROP because, if there are no visible elements, we are deleting the only array element (initial value ""), and Excel returns an error. However, ROWS(DROP(array, 1)) is, in essence, the same as ROWS(array) - 1, which leads to this formula:
=LET(visible,
UNIQUE(REDUCE("",
[Name],
LAMBDA(a,
v,
IF(SUBTOTAL(103,v)=0,
a,
VSTACK(a,v)
)
)
)
),
ROWS(visible)-1)
g0ldfinch You seem to have a pretty good handle on the logic used with the REDUCE / VSTACK method. A more appropriate way to avoid the #CALC! error, though, with the UNIQUE / ROWS combination is to wrap it in the IFERROR function to return 0.
Personally, I wouldn't use REDUCE / VSTACK in this situation, because there are a number of other options that work well and perform better. SergeiBaklan already provided one such method using the MAP function with TOCOL. Another option is to simply use the FILTER function with BYROW in the include argument, as follows:
=IFERROR(ROWS(UNIQUE(FILTER([Name], BYROW([Name], LAMBDA(r, SUBTOTAL(3, r)))))), 0)
The SUBTOTAL function will return 1 for visible rows and 0 for hidden/blank rows, which are interpreted by the FILTER function as TRUE and FALSE. Cheers!
- g0ldfinchNov 29, 2023Copper Contributor
Thanksdjclements , that's cool! It's literally my first time encountering such things in Excel, but your example is quite easy to understand. It's much simpler than the VSTACK version. However, I would like to avoid use of IFERROR - just in case, for troubleshooting purposes. Errors are easier to spot than digging into the formula to find out why it's showing 0... Your version shows #CALC! at the FILTER step if no rows are visible (FILTER returns an empty array), hence the need for IFERROR. Neither UNIQUE nor ROWS will work with an empty array. Is there a way to test an array for emptiness? E.g., something like:
=LET(array, FILTER(names[Name], BYROW(names[Name], LAMBDA(r, SUBTOTAL(3, r)))), IF(<array is empty>, 0, ROWS(UNIQUE(ARRAY)))I understand that FILTER already returns a calculation error, but was wondering if something like the above is possible at all...
SergeiBaklan's version also is elegant - even i could understand it, with a bit of documentation read (: However, it also shows #CALC! if no rows are displayed... I'm afraid, using IFERROR is the only option to return 0 in both your variants. VSTACK version is free from this limitation, but it's waaaaay less performant than both alternatives. On a dataset of 22k rows, it takes ages for the original formula to calculate the subtotals, while both alternatives finish in under 1 second (basically, the time needed for Excel to display filtered data).
Thank you guys, you expanded my knowledge of Excel beyond the horizon! (:
- djclementsNov 29, 2023Silver Contributor
g0ldfinch The IFERROR function is a reasonably safe method to use in this scenario, because the only plausible error that can be returned by the ROWS function is #CALC!, which is a result of the FILTER function returning an empty array. However, if you want to avoid using IFERROR altogether, use the LET function to define the include argument first, then test the results with the OR function to make sure at least one record meets the criteria. For example:
=LET( incl, BYROW([Name], LAMBDA(r, SUBTOTAL(3, r))), IF(OR(incl), ROWS(UNIQUE(FILTER([Name], incl))), 0) )Note: if the [Name] column happens to be a calculated column that also contains errors, and you don't want those errors to be included in the count of unique records, replace SUBTOTAL(3, r) with AGGREGATE(3, 7, r) to ignore hidden rows and error values.
Regarding the SUBTOTAL function also counting what seems to be "blank" cells produced by Power Query, this is not a bug per se, as the query is returning empty strings instead of null values. To solve this, you could just add another step to the query that replaces empty strings with null values in order to return truly blank cells in the query results. In the Power Query Editor, select the desired column, then go to Transform > Replace Values. Leave the "Value To Find" box blank, and type null in the "Replace With" box.
I hope that helps. Cheers!
- g0ldfinchNov 29, 2023Copper Contributor
Wow... I didn't realise that. I used to replace nulls with empty strings, but it never occurred to me that i should use the opposite transformation before loading the query data. Thanks! That makes a lot of sense. Using OR on an array also is ingenious. Once you gave that example, it struck me that OR checks values of an array containing 0 and 1, and if at least a single 1 value exists, it returns TRUE... I'm sure it does lazy computations so it must be quite efficient as well. So elegant! *Sigh* The more i learn the more i realise that i know nothing...
P.S. I added a ReplaceValue step to the PQ - it solve the issue and rid of the necessity to check values for an empty string. Thanks a lot - will keep that in mind in the future.
- g0ldfinchNov 29, 2023Copper Contributor
I discovered a side effect (a bug?) in PowerQuery generated tables. Both djclements' and SergeiBaklan's formulas count blank cells. Moreover, when i changed the formula the way shown below, mode '3' didn't get rid of blanks:
=TOCOL(MAP(names[Name],LAMBDA(v,IF(SUBTOTAL(3,v),v,NA()))),3)This works perfectly on normal tables, but those that are generated by PowerQuery have non-blank blank cells. I noticed that when i applied validation to a generated table - while configured to ignore blanks, validation marked empty cells as containing invalid values. So, to cater for that peculiarity (or MS bug), i adjusted your formulas this way:
=ROWS(UNIQUE(FILTER([APPL_COMP], BYROW([APPL_COMP], LAMBDA(r, AND(SUBTOTAL(3, r), r<>""))))) =ROWS(UNIQUE(TOCOL(MAP([APPL_COMP], LAMBDA(v, IF(AND(SUBTOTAL(3,v), v<>""), v, NA()))),2)))Not a big deal, but, maybe, it will help someone who implements the count unique values in PQ-generated tables.