11-30-2018 03:58 AM
11-30-2018 03:58 AM
OK ... this problem has me completely stumped. Background: I have a report that is created in two steps. The first is a data extract out of the company's PLM system, which generates a simple .xlsx file based on the end users' parameter selections. Nothing fancy. No calculations. Just a data dump. This data extract is then copied into a table on an Excel template, which ultimately creates graphs and charts. The table contains a calculated column. Sounds simple enough.
The primary function of this report is to give the end user an overall view of the number of products in the system and where they are in the product lifecycle (proposed, approved, etc). The user has the option to drilldown to see what steps are being worked on and by whom. Since a product can be listed multiple times on the extract (if it's currently being worked on by multiple groups), the calculated column was created to count the unique number of products.
The calculated column works like a charm (thanks to https://exceljet.net/formula/flag-first-duplicate-in-a-list). However ...
When you run another extract, and load the data into the table, the calculated column in the first (and sometimes second) row is incorrect. But from rows 3 onwards, the formula is correct. It's so wacky.
On the attached, there are two tabs. The first tab is a chopped down example of the data extract (columns A through F). The second tab is the table (Table3) with the calculated column in Column G.
The formula is: =IF(COUNTIF($B$2:$B$20,B2)>1,IF(COUNTIF($B$2:B2,B2)=1,1,0),1)
This looks at the Product Spec ID and counts 1 for every unique ID. Obviously, the first COUNTIF range will fluctuate depending on the number of items in the extract. For this example, it's 20.
The values in Column G are accurate.
Now, delete all the detail rows from Table3 and copy and paste the detail rows and columns from the Data extract tab (A2:F20) to Table3. This is the same data that was in the table before. Look at the formula in cells G2 and G3.
The second COUNTIF in the formula has changed, and the change is significant because the formula no longer accurately counts the first row.
Before (2nd countif): IF(COUNTIF($B$2:B2,B2)=1
After (2nd countif): IF(COUNTIF($B$2:B20,B2)=1 <---B20 should be B2
And it does the same thing to the second row:
Before (2nd countif): IF(COUNTIF($B$2:B3,B3)=1
After (2nd countif): IF(COUNTIF($B$2:B20,B3)=1 <---B20 should be B3
But subsequent rows are fine! Aarrrgh!
Anyone have any idea why Excel would change the first two calculated column formulas and what in the world I can do to preserve the original formula?
11-30-2018 05:19 AM
I think you have deleted the entire rows of the Table3 and leave only the first row of it with the formula in cell G2 like this:
I think that's happened because of the effect of the column automatically filling.
Please note that the Table3 is an Excel table, not a normal range so the formulas you use in the Excel table will automatically populate to fill the column.
The formula in cell G2 was existing before copying the rows again and the second COUNTIF has an expandable range ($B$2:B2).
When you copy the rows again into the Table3, this range is expanded to end up with last row number of the table because it was existing in the last row in the Table3 before you copy the rows again.
The formulas in other rows are correct because their rows are created again, and they are was calculated in their rows after the filling is finished.
This behavior is really weird.
I don't know if we can consider it a bug!
The solution to this issue is to use this formula in cell G2
Instead of the fixed row number, you can use the ROW() function to always give the second COUNTIF the row number of the current row.
This is will prevent this behavior from happening.
I hope that makes sense
11-30-2018 05:46 AMSolution
If you delete rows from the table, let say keeping only second one, in G in formula it will be $B$2:B2. Second row it the last in the table and it's internally recognize B2 as last cell in the table in column B. When you add more rows B2 (note, that's relative reference), B2 will be change on reference in last cell in column B, in that case on B20.
In general with tables better to use structured references to avoid side effects like this. In you case formula could be
=IF(COUNTIF([Product Spec ID / Version],[@[Product Spec ID / Version]])>1,IF(COUNTIF(OFFSET(Table3[[#Headers],[Product Spec ID / Version]],1,0,ROW()-ROW(Table3[[#Headers],[USC]])),[@[Product Spec ID / Version]])=1,1,0),1)
11-30-2018 06:23 PM
Thank you kindly for your explanation and suggested formula correction, Haytham! This community of tech experts is awesome.
11-30-2018 06:29 PM
Thank you so much for your response, recommendation, and lesson-of-the-day, Sergei. I took your advice and amended the formula to use structured references. All is beautiful in my Excel world again. Love you guys (and gals)!
11-30-2018 07:15 PM
Thank you very much :)
I don't think that the side effect is avoided because of the structured references as we can convert them to normal ranges so we get the same result:
That's because of the OFFSET function along with the ROW().
However, this is really a good approach as the INDIRECT function may slow down the workbook if it's applied to a large number of rows.
12-01-2018 05:03 AM
Structured references naturally push to use OFFSET for the reference on another rows. If we use mix of structured references and absolute/relative ones - yes, we avoid nothing.