Forum Discussion
mhlester
Jul 05, 2023Copper Contributor
Lookup five characters appearing in any order in a wordlist
Hi, all,
I have a wordlist and want to write a formula that will find every word that contains five alphabet characters in any order.
Here is my formula. It indicates what the first letter must be (I1) and what letters must be in the word (anywhere from three to five characters). My formula only searches for words in the wordlis that contain the characters appearing in the same order as in the formula. If I search for w-o-n-d-e, it will find wonderful, but it won't find endow.
Column E is a string containing the letters to search for. Columns R - V show each character in the string in a separate cell.
=IF(C1>=9,VLOOKUP(I1&"*"&LEFT(E1,1)&"*"&MID(E1,2,1)&"*"&RIGHT(E1,1)&"*",$A$2:$F$1200,1,0),IF(C1=8,VLOOKUP(I1&"*"&S1&"*"&R1&"*"&U1&"*"&T1&"*",$A$2:$F$1200,1,0),VLOOKUP(I1&"*"&LEFT(E1,1)&"*"&MID(E1,2,1)&"*"&MID(E1,3,1)&"*"&MID(E1,4,1)&"*"&RIGHT(E1,1)&"*",$A$2:$F$1200,1,0)))
Can someone show me a formula that will work?
Also, even if the order of the characters happens to be right, it will only find the first instance of a word that contains all the characters. I would like to have a list of all the words that contain all the characters.
Thanks,
Michael
- PeterBartholomew1Silver Contributor
- mhlesterCopper Contributor
I was finally able to make Filter work to find all possible matches for all legal words with this formula.
=IFERROR(TRANSPOSE(FILTER($A$2:$A$3570,ISNUMBER(SEARCH(AA451,$F$2:$F$3570)))),"")
The word in column AH in the sheet snippet below is the first word to solve the puzzle and the words in columns AI, AJ, et. seq. are candidates for the second word.
I could not figure out how to make the formula return only words that start with the last letter of AH or end with the first letter of AH. Only the words highlighted in yellow in cells AI, AJ, et. seq., in the sheet below (a small snippet of the full list) solve the problem.
Will someone please help me revise the formula to only find legal words (in AI, AJ, et. seq.) that start with the last letter of the first word (AH) or end with the first letter of the first word (AH)?
Addendum:
=FILTER($AK$451:$JL$451,RIGHT($AK$451:$JL$451,LEN(AI451))=AI451,”No Results!”)
This filter would filter the results of the above filter to find only the words that end with the first letter of the first word. So, I am part way there. I just need to add the filter that also finds any words that start with the last letter of the first word. Maybe these filters could be nested? That would be it, I think.
- PeterBartholomew1Silver Contributor
The formula used in the solution I posted above was
... firstWord, TOCOL(IF(allChar=maxChar, legalList, NA()), 3), secondWord, TOCOL(IF(allChar=maxChar, TRANSPOSE(legalList), NA()), 3), FILTER(HSTACK(firstWord, secondWord), RIGHT(firstWord, 1) = LEFT(secondWord, 1))
The next task would be to optimise the formula to improve performance.
For example, I checked every character within a Scrabble word to see whether it was drawn from the Square. Had I followed your suggestion and checked for the occurrence of other characters, then recursion would allow one to bail out before the whole word is validated.
- mhlesterCopper Contributor
Don't blame you! I can't let it go, yet. I want to find all the instances of matching pairs that are in my dictionary...
- PeterBartholomew1Silver Contributor
You say you are using 365, in which case formulae such as
= LET( characters, TEXTSPLIT(letters,"-"), hasChr?, ISNUMBER(FIND(characters, wordlist#)), wordMatch?, BYROW(hasChr?, LAMBDA(x, AND(x))), FILTER(wordlist#, wordMatch?) )
might provide a good starting point.
- mhlesterCopper ContributorHi, Peter,
Thank you for this. I don't how to implement. Would you mind explaining a bit more how this would work with the sample spreadsheet I sent in reply to Patrick? In the meantime, I will do some research on the functions that are unfamiliar to me.- PeterBartholomew1Silver Contributor
Sorry, this task is going to take too long. So far all I have done is filter the original scabble list to return only words comprising letters taken from the square. The adjacent letters rule is not as straightforward but most likely your excluded character pairs approach will prove effective.
The new Excel functions LET and LAMBDA enable the development of array solutions that do not look remotely similar to any past spreadsheet practice. LET introduces local variables scoped to the formula. LAMBDA builds on that to allow values to be passed to such variables by using an argument list. The result is to allow a spreadsheet formula to be laid out like a program module, calling other Lambda functions where required.
Could you attach a small sample workbook demonstrating the problem (without sensitive data), or if that is not possible, make it available through OneDrive, Google Drive, Dropbox or similar?
- mhlesterCopper ContributorHi, Hans,
Thank you for your reply. Here is a link to the complete dropbox file.
https://www.dropbox.com/scl/fi/otl468c3tqoap0cnetd66/Scrabble-Dicitionary-for-Letter-Boxed-230705-2.xlsx?rlkey=mazzcaauxulfrxz1ony9g9fku&dl=0
The spreadsheet is intended to solve the New York Times Letter Boxed puzzle, which puts 12 unique alphabet characters around a square, three to a side. You must use all twelve letters in as few words as possible. No two characters from the same side of the square can be adjacent in any of the words. (I only want to solve for two words even though the puzzle lets you solve in more than two words.) The second word must start with the last letter of the first word.
Column H is a solver but it can only solve one row at a time and I have to change the formula to reflect the row number each time. Too much work! I would like to automate that.
Columns I through L are a second solver that could work to find the first legal word for each word in the list if the lookup criteria didn't have to appear in the same order as in the second word of the pair.
Column A = wordlist
Column B = words from the wordlist that contain one or more of the twelve letters
Column C = number of unique letters in the word in each row
Column D = filled with the twelve letters from the day's puzzle
Column E = letters that are missing from the words in each row
Column F = the unique letters in each row
Column G = repeat of the wordlist (can delete)
Thanks, Michael- Patrick2788Silver ContributorYou may want to include a stripped down workbook with less formulas. All that's really needed is a word list and some examples of the desired return (even if done manually). The workbook you linked above has a ton of resource-heavy formulas that make using the workbook very difficult.