Apr 18 2021 01:17 AM
Apr 18 2021 01:17 AM
Hi, I need help creating the following sorting logic. I have 3 columns, column 1 has a unique identifier value (ex. 15-1, 15-2,15-3), column 2 has a group value that is shared amongst members of that group (ex. Group 1, group 2, group 3), and column 3 has one of the following three choices (A, B, C).
How do I create a script or formula that identifies entries that have the same group number (column 2) but not the same value in column 3?
Theoretically, all entries with the same group number should also have the same value in column 3. I want to quality control and check to make sure that this is the case. My data can potentially be very voluminous with thousands of different group IDs. What would be the best way to check for these inconsistencies? I will likely have too many different values in column 2 to make graphs and charts useful. I need a solution to help me easily identify all group values with inconsistent values in column 3.
I’m not very advanced with excel and need some guidance. Studying the documentation and Youtube has not provided me with any answers thus far. Any recommendations and advice would be greatly appreciated.
Apr 18 2021 01:29 AM
Why do you have column 3 at all? Of the choice of A/B/C depends on the group, you don't need to store it in every row. Just create a separate table:
If you wish, you can use VLOOKUP formulas in the third column to return the choice corresponding to the group number. That way, it will be consistent.
Apr 18 2021 01:42 AM
The data is not being created in excel, it is being imported into excel and there will inevitably be examples of entries with the same group number but different choice values due to human error. The goal is to find and identify those inconsistencies in excel. For example, in the below chart, I would want to be able to identify entries 15-3 and 15-8 that have incorrect choice values that do not match the choice associated with their fellow group members.
Apr 18 2021 02:33 AM
OK. Let's say your data are in columns A, B and C.
Enter the following array formula in D2, confirmed with Ctrl+Shift+Enter:
Adjust the ranges if your data have more than 100 rows. Then fill down.
Column D will contain TRUE if the value in column C is different from the most frequently occurring choice for the same group. This assumes, of course, that the most occurring choice is the "correct" one, and that there is a most frequently occurring choice.
Apr 18 2021 08:31 AM
The first thing I thought of was the reciprocal count trick to calculate the distinct count for each group.
= LET( m,(Group=@Group)/COUNTIFS(Group,Group,Choice,Choice), SUM(m) - 1 )
Then I decided I liked your solution but took it a step further to propose the modal solution as a correction
= LET( indices, IF(Group=@Group, XMATCH(Choice,Choice)), mode, INDEX(Choice,MODE(indices)), IF(@Choice<>mode, mode, ""))
Apr 18 2021 11:12 AM
@Peter Bartholomew Excel won't run the formula you suggested as written, it updates it to
=@LET(indices, IF(@Group=@Group, XMATCH(@Choice,Choice)), mode, INDEX(Choice,MODE(indices)), IF(@Choice<>@mode, mode, ""))
I am unable to get any results, and receive a #name? error. Is there any customization I have to do with the formula in order for it to run properly? I am using Excel version 2103
Apr 18 2021 01:04 PM
Sorry, but the versions of Excel are not compatible
= SUM((Group=@Group)/COUNTIFS(Group,Group,Choice,Choice)) - 1 = IF(INDEX(Choice, MODE(IF(Group=@Group, MATCH(Choice,Choice,0))))<>@Choice, INDEX(Choice, MODE(IF(Group=@Group, MATCH(Choice,Choice,0)))), "")
Whilst Excel 365 can run legacy spreadsheets, Office 2013 will accept neither the XMATCH function nor, more seriously, the LET function that gives modern Excel its code-like appearance. The formulas as shown should be compatible.