Forum Discussion
Searching column cells for similar text values and summarising in another column
Hi Experts
I hope you can help here. To be honest without Macros not sure this is possible with formulas?
I attach an example workbook with my desired results from the data.
Basically we have a support ticket system for users and can export the "summary" or "titles" for each ticket. We want to have a formula to analyse the column with all the ticket titles and produce a list of common similar themed topics, and the number of deemed occurrences.
Now I know this can be done either looking for an exact text word match but the difficulty comes where I am hoping excel can attempt to categorise similar patterns of words / phrases / text.
Any solution via formula or am I expecting too much?
Thank you for your help!
so I have a couple of thoughts. First off the example was very simple in the sense that each output was basically based on a key word. So you could use:
=COUNTIFS($B$2:$B$10,"*fresh*")
to count all items with "fresh" somewhere in the topic. you could create a column next to your output table with a list of keywords so then you could use:
=COUNTIFS($B$2:$B$10,"*"&D2&"*")
where D2 is the keyword "fresh" or "freshness" or whatever you want
you can even make it an array output using:
=COUNTIFS($B$2:$B$10,"*"&D2:D4&"*")
you could get more fancy and have a list of possible words and use something like:
=LET(list,TEXTSPLIT(C2," "),SUM(--(MMULT(--ISNUMBER(SEARCH(list,$B$2:$B$10)),SEQUENCE(COLUMNS(list),,1,0))>0)))
where this will break each word in C2 up and check if it exists in each row of the inputs and then OR all those together to find how many rows have at least 1 of those key words. In your example this procudes a wrong result for the 3rd row as "issues" is found in another line.
there are MANY other options and roads you could go down with excel functions and could simulate your macro solution very closely if not the same.
That all said another option is to add the Excel Labs plug-in and use the Labs GenerativeAI plugin and let an AI service help determine the results:
=LABS.GENERATIVEAI("does """&B2&""" have to do with """&H1&"""?")
this requires you to have an AI account and give the plug-in an access key
- matt0020190Brass Contributor
Thanks very much for this very helpful reply.
I ended up using the powerful LET function example above that did the trick.
Only thing that could be better is I like how the terms are searched within each cell from the list. However if one of my search terms in the list is for example "topic", all words in the cell with this sequence are still flagged e.g "topical" or "atopic".
Is there a way to ensure the searching from the list is only for exact matches?
- m_tarlerSteel Contributor
First off I'm glad this is helpful. As for a tweak to make it match words exactly might be:
=LET(list,TEXTSPLIT(C2," "),SUM(--(MMULT(--ISNUMBER(SEARCH(" "&list&" "," "&$B$2:$B$10&" ")),SEQUENCE(COLUMNS(list),,1,0))>0)))
but of course then "topic" will NOT find "topics" or "topic-1" or "1)topic"
fyi what i did was add a " " before and after each search term AND before and after each cell so if the cell has "topic" as the beginning, end or only word it will still be found.
- matt0020190Brass Contributor
Thanks again, I thought it worked at first but the above does produce some strange results. For example if I search the column manually, I sometimes find less than what this formula does. Maybe something to do with my list?
I have one cell with search terms a few examples:
store splits, regional splits
security, guarding, guards, SG
FF4, Fit for 4, FFT, Fit for trade
With this new formula, is it searching exactly for "store splits" or "store" and "splits" separately?
A little confused as I want every term separated by commas searched exactly.
If I remove the space between the terms ie. "store splits,regional splits" I get less results so unsure what is expected if you can clarify?
- m_tarlerSteel Contributor
so I have a couple of thoughts. First off the example was very simple in the sense that each output was basically based on a key word. So you could use:
=COUNTIFS($B$2:$B$10,"*fresh*")
to count all items with "fresh" somewhere in the topic. you could create a column next to your output table with a list of keywords so then you could use:
=COUNTIFS($B$2:$B$10,"*"&D2&"*")
where D2 is the keyword "fresh" or "freshness" or whatever you want
you can even make it an array output using:
=COUNTIFS($B$2:$B$10,"*"&D2:D4&"*")
you could get more fancy and have a list of possible words and use something like:
=LET(list,TEXTSPLIT(C2," "),SUM(--(MMULT(--ISNUMBER(SEARCH(list,$B$2:$B$10)),SEQUENCE(COLUMNS(list),,1,0))>0)))
where this will break each word in C2 up and check if it exists in each row of the inputs and then OR all those together to find how many rows have at least 1 of those key words. In your example this procudes a wrong result for the 3rd row as "issues" is found in another line.
there are MANY other options and roads you could go down with excel functions and could simulate your macro solution very closely if not the same.
That all said another option is to add the Excel Labs plug-in and use the Labs GenerativeAI plugin and let an AI service help determine the results:
=LABS.GENERATIVEAI("does """&B2&""" have to do with """&H1&"""?")
this requires you to have an AI account and give the plug-in an access key