Forum Discussion
Comparing Duplicates in Column A with Duplicates in Column B
Hello! I'm hoping there's a way to do this but I haven't figured it out yet so any help would be greatly appreciated.
I have two columns of data which have duplicate values in:
| Fruit | Quantity |
| Apple | 10 |
| Apple | 10 |
| Apple | 10 |
| Banana | 5 |
| Banana | 5 |
| Banana | 4 |
| Orange | 10 |
| Orange | 3 |
I'd like to be able to identify whether the duplicate rows in column A match up to the duplicate rows in column B. So, in my example, I'd like to be able to pick out "Apple" and "10" as a duplicate in both columns at the same time. As "Orange" also has a value of 10, it's not just a case of looking at the duplicates in both columns separately.
To make things more complicated, I also want to exclude "Bananas" from this as not all of its quantity values are duplicates, only two of them are. I'm unsure how best to do this.
Any help would be amazing! Thanks in advance!
I solved it using the following formula:
=AND(COUNTIFS($A$2:$A$11,A2,$B$2:$B$11,B2)>1,C2="Primary")Thanks for the help everyone!
7 Replies
- TheCharWestCopper Contributor
I solved it using the following formula:
=AND(COUNTIFS($A$2:$A$11,A2,$B$2:$B$11,B2)>1,C2="Primary")Thanks for the help everyone!
- PeterBartholomew1Silver Contributor
I returned the result because Apple is the only fruit not to feature in multiple distinct records.
= LET( FruitFreq, COUNTIFS(Fruit, Fruit), RecordFreq, COUNTIFS(Fruit, Fruit, Quantity, Quantity), UNIQUE(FILTER(Data, RecordFreq=FruitFreq)) ) - OliverScheurichGold Contributor
I understand you want to identify fruits which have only duplicates, like "Apples" and "10" in your example. Enter formula in D1 and copy down:
=COUNTIF($A$1:$A$8,A1)=COUNTIFS($A$1:$A$8,A1,$B$1:$B$8,B1)
Please compare / check solution in attached file.
- SergeiBaklanDiamond Contributor
- TheCharWestCopper Contributor
Hi SergeiBaklan and OliverScheurich - you have both given me answers which are actually really close to what I need, but I realised there's another complication to my data. I would actually like to pull out "Banana" as "True" as well, but only because its duplicate values are the highest values. If there was another "Orange - 5", I wouldn't want to include it as "Orange - 10" is higher.
Essentially, my data looks like this:
Fruit Quantity Primary
Apple 10 Primary
Apple 10 Primary
Apple 10 Primary
Banana 5 Primary
Banana 5 Primary
Banana 4 Duplicate
Orange 10 Primary
Orange 3 Duplicate
Orange 3 Duplicate
And I'd like to pick out where there is more than one Primary row per duplicate fruit. The primary record is the duplicate with the highest quantity score.- SergeiBaklanDiamond Contributor
Sorry, I didn't catch. Would you like "Banana 4 Duplicate" also shall be included? Perhaps you may add column within your file with results as they shall be in accordance to your logic.