Aug 23 2024 03:35 PM - edited Aug 23 2024 03:43 PM
Hi,
I currently have a list of 80,000 patents (col. A) with their respective backward and forward citations (col. B&C).
I want to write a formula in column D that identifies a match between any of the citation numbers in column B within the corresponding row in column C. E.g. in row 3, there is a match between BWD and FWD citation (US1012) in column B and column C.
My ideal output in column D is the matching value (i.e US1012). If no matches are found between col.B & C, the corresponding cell in column D can remain blank, as shown.
My main problem (column E) is I want to have a way of identifying if the value in column D (i.e US 1012) is present in column A (list of original patent IDs). If so, I wanted an automated quick method of highlighting the corresponding row number in column A containing the matching citation in column D. The intention is to delete these matching patent IDs (i.e delete row 14 US1012).
Overall example,
Column D identifies US 1012 is present in both column B and C.
Column E highlights the row containing US1012 in column A. This methodology would be applied to a spreadsheet with 30,000 rows; I will then delete all matching citation rows (i.e row 14 with US1012 in col A).
Any help is appreciated.
Many thanks
Aug 23 2024 06:34 PM
Solution@zchahin
It could be done using FILTER function to sort out first the returned values in column D, then using the ISNUMBER and MATCH function from Column D, to identify which Patent Numbers was matched on column D.
formula for E2
array formula in E2:
=IFERROR(INDEX($D$2:$D$100, SMALL(IF($D$2:$D$100<>"", ROW($D$2:$D$100)-ROW($D$2)+1), ROW(1:1))), "")
formula in F2:
=IF(ISNUMBER(MATCH(A2, $E$2:$E$100, 0)), "DELETE", "")
Aug 25 2024 02:52 AM
Excellent - thank you very much for a simple solution to my problem.
I wanted to clarify how to adjust the formula in column D so that if there are comma separated values in column C (spreadsheet initially attached contained single patent IDs in column C), it will return in column D the ID of the matching patent (between column B and C).
E.g. in the screenshot below citation FR2845889A1 is a match between column B and C but is not displayed in cell D10 (Column D formula you have provided works great when single citation is present in column C, which we are checking for a match in column B).
Aug 25 2024 04:22 AM
=XLOOKUP(TRUE,ISNUMBER(XMATCH(TEXTSPLIT(C2,"|"),TEXTSPLIT(B2,"|"))),TEXTSPLIT(C2,"|"),"")
Does this formula return the intended result?
Aug 26 2024 09:55 AM
Works perfectly, thank you!
Only issue is my formula in column E no longer recognises the values in column D and therefore does not identify that they are present in column A and output delete in the corresponding row in corresponding column E (please see above).
Very grateful for your help so far.
Aug 26 2024 11:18 AM
You are welcome. I think the problem are the spaces for example there is " US1111" in cell D2 and "US1111" in cell A22. This doesn't return a match because of the leading space in cell D2 which isn't in A22 (in your sheet).
The reason for the entry in cell D2 is that you have e.g. "US1111 | US1234 | US1019" in columns B and C. In my example this would be "US1111|US1234|US1019" which means i have "|" instead of " | ".
With " | " in columns B and C you can use in D2:
=XLOOKUP(TRUE,ISNUMBER(XMATCH(TEXTSPLIT(C2," | "),TEXTSPLIT(B2," | "))),TEXTSPLIT(C2," | "),"")
In the attached sample file there is a worksheet for both "|" and " | ".
Aug 28 2024 11:14 AM
Aug 23 2024 06:34 PM
Solution@zchahin
It could be done using FILTER function to sort out first the returned values in column D, then using the ISNUMBER and MATCH function from Column D, to identify which Patent Numbers was matched on column D.
formula for E2
array formula in E2:
=IFERROR(INDEX($D$2:$D$100, SMALL(IF($D$2:$D$100<>"", ROW($D$2:$D$100)-ROW($D$2)+1), ROW(1:1))), "")
formula in F2:
=IF(ISNUMBER(MATCH(A2, $E$2:$E$100, 0)), "DELETE", "")