Forum Discussion
Distinguish between words and acronyms in excel
You might eventually find it desirable to use VBA (or another scripting solution) to accomplish this, but the results you can get from a formula might be acceptable. You do not say which version of Excel you are using, so I am going to assume Excel 365 for Windows.
First, some terminology:
- "PSI" is an abbreviation (specifically, an initialism), but not an acronym. Microsoft has been a significant force in the misuse of the term "acronym" (which is an abbreviation that has become a pronounceable word, such as sonar). But there are other types of abbreviations, possibly appearing within your data, so I will refer to PSI and other units of measure and similar as initialisms.
- It appears that you are using "data string" for the unmodified word/initialism – already split out from the full description – and "block" for the modified word/initialism. But I can't be sure. Please clarify if necessary. Your mention of "35-character blocks" has no special meaning to me. Does that have any relevance (beyond mentioning the maximum length of a word/initialism)?
So yes, Excel can do a lookup (XLOOKUP, for example) to determine if a string of characters is a word or an initialism. But you will have to supply and maintain the list of words or initialisms for it to use. And I suggest you use a list of initialisms, because that will be shorter and you can be sure that some vendor will accidentally or intentionally misspell words, and possibly include their own abbreviations of words. A list of initialisms can also help standardize the entries. See the attached workbook.
Even then, this can't be perfect, because some vendor may name their product "Psi" or another initialism, and refer to that in a item's description. The description may refer to a company name that should not be proper-cased (e.g., TRW), but those company names could be added to the initialism list.
Nikolino's solution has no special relation to a language dictionary. It may conflate that meaning with the use of a COM object (a memory construct) that is named Dictionary and is similar to a Collection object. While a Dictionary (or Collection) object could potentially be useful in VBA code, his example includes very inefficient code for that object, re-creating and reloading it for every word that is being checked. And the reloading is from a worksheet range anyway, so that is likely wasted processing. About the only advantage I could see for using a Dictionary object is that it can distinguish upper-case and lower-case letters for its stored values. But I do not suspect you need that capability.