Forum Discussion
finding several words in a column and avoiding duplicates
Hi everyone
I have a very long question, so please bear with me as i try to explain my issue. I have list of several different types of analysis:
Moisture/Loss on drying |
Ash/Residue on ignition/sulphated ash/sediment |
Protein |
Fat |
Hg/Mercury/Kviksølv |
Pb/Lead/Bly |
As/Arsenic/Arsen |
Cd/Cadmium |
Fl/Fluoride/Fluor |
Assay/content/composition/Name of compound/purity |
They can all be called by several different names indicated by the slash. I have converted this so i have a list for each of these entries.
As/Arsenic/Arsen |
arsen* |
arsenic* |
As* |
So I have something similar for every single analysis. Now I need to go into another table in another sheet and figure out if this table contains this analysis. Here is my data table:
Parameters |
Appearance |
Particle Size(mesh) |
Whiteness |
Viscosity (cP,1% KCl) |
Shear ratio |
V1/V2 |
pH (1% solution) |
Loss on Drying (%) |
Ashes (%) |
Pb(ppm) |
Total Nitrogen(%) |
Pycivic Acid(%) |
Total Plate Count(CFU/g) |
Moulds/Yeasts(CFU/g) |
Coliform (MPN/g) |
Salmonella |
E.Coli |
Assay (%) |
Xanthomonas campestris |
Ethanol/ propan-2-ol ppm |
Finally I would like to get a table like this where the number is the row number. The trick of course being that no two numbers can be the same - because no two rows contain the same analysis.
Moisture/Loss on drying | 10 |
Ash/Residue on ignition/sulphated ash/sediment | 11 |
Hg/Mercury/Kviksølv | #NUM! |
Pb/Lead/Bly | 12 |
As/Arsenic/Arsen | 11 |
Cd/Cadmium | #N/A |
Assay/content/composition/Name of compound/purity | 20 |
Ethylen Oxide | #NUM! |
And of course i'm also having issues with As being recognized in several rows without the analysis actually being present.
=SMALL(IF(ISNUMBER(XMATCH(INDEX(Sheet5!$1:$50,2,F2):INDEX(Sheet5!$1:$50,G2,F2),'Table001 (Page 1)'!A:A,2)),XMATCH(INDEX(Sheet5!$1:$50,2,F2):INDEX(Sheet5!$1:$50,G2,F2),'Table001 (Page 1)'!A:A,2),""),1)
I have used this very long xmatch formula where sheet5 contains the list of words for each analysis and table001 is the table i need to search. The index functions are simply to make it more dynamic. I need it all to be as dynamic as possible which sadly does not make it any easier.
I hope some of you will be able to help me as I am completely stuck. I have tried several different ways, but no matter what I do there is some result that comes out wrong.
Thank you for taking the time to read this.
Vnibsen89
Perhaps this one:
=LET( FindTerms, LAMBDA(row, LET(words, TEXTSPLIT(row, "/"), SUM(IFNA(XMATCH("*" & words & "*", ListofTerms, 2), 0))) ), BYROW(terms, FindTerms) )
There's still the issue of using a wildcard match with terms with few characters such as "as" where we get a false positive where it matches "ashes".
5 Replies
- Patrick2788Silver Contributor
Perhaps this one:
=LET( FindTerms, LAMBDA(row, LET(words, TEXTSPLIT(row, "/"), SUM(IFNA(XMATCH("*" & words & "*", ListofTerms, 2), 0))) ), BYROW(terms, FindTerms) )
There's still the issue of using a wildcard match with terms with few characters such as "as" where we get a false positive where it matches "ashes".
- Vnibsen89Copper Contributor
Thank you for this! It saved me an entire worksheet and it works great. But as you said it still has the issues with As being found in ashes. Is it possible to maybe add a list of words it cannot match inside this formula? Because that is something I can easily make.
Thank you again
Vnibsen89
- Patrick2788Silver Contributor
One solution could be to discard terms with a LEN of less than 3 so there are no false matches. For example:
=LET( FindTerms, LAMBDA(row, LET( prelim, TEXTSPLIT(row, "/"), words, FILTER(prelim, LEN(prelim) > 2), SUM(IFNA(XMATCH("*" & words & "*", ListofTerms, 2), 0)) ) ), BYROW(terms, FindTerms) )